I have been trying to run a query from the command line that returns
all elements containing a yen sign (U+00A5). I haven't been successful
yet, but I found this note on the Xalan site:
* Text nodes with entity references are not handled correctly by
the Document Table Model (DTM). An entity reference in a text node
causes the node to be split at that entity. A problem was also reported
with the contains() function; contains(String, entity) was returning
false when it should return true. Workaround: instantiate the
XSLTProcessor as follows so it uses the XercesLiaison class and the
Xerces DOM parser:
    org.apache.xalan.xslt.XSLTProcessor xsltProc =
        org.apache.xalan.xslt.XSLTProcessorFactory.getProcessor(
            new org.apache.xalan.xpath.xdom.XercesLiaison());
If you are running org.apache.xalan.xslt.Process from the command
line, include
    -parser org.apache.xalan.xpath.xdom.XercesLiaison
on the command line.
Does anyone have more information about this? In particular, could this
be what is causing my problem, and is the workaround already applied in
Xindice? Failing that, does anyone know how to search for text
containing non-ASCII characters?
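
To make the question concrete, here is a minimal sketch of the kind of
query I mean. It is only an illustration: it uses the standard
javax.xml.xpath API from the JDK rather than Xindice or the Xalan
XSLTProcessor classes quoted above, and the class name YenQuery, the
element name "item" and the sample document are all made up. The yen
sign is written as a numeric character reference in the XML and as a
Unicode escape in the XPath string, so nothing depends on the encoding
of the shell or the source file.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class YenQuery {
    // Counts the <item> elements whose string value contains U+00A5 (yen sign).
    // "item" is a hypothetical element name used only for this sketch.
    static int countYenItems(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
                "//item[contains(., '\u00A5')]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data; &#165; is the character reference for the yen sign.
        String xml = "<items>"
                + "<item>price: &#165;500</item>"
                + "<item>price: $5</item>"
                + "</items>";
        System.out.println(countYenItems(xml));
    }
}
```

The predicate tests the string value of the whole element rather than
an individual text node, so it should still match even if a parser
splits the text at the reference. Whether the same expression behaves
this way inside Xindice is exactly what I am unsure about.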
