utf-8 characters problem

Jakub Kahovec 28 Feb 2005 20:10:01 -0000

Hi, when I parse an XML document (with Xerces 2.6.2) whose XML declaration specifies UTF-8 encoding and which contains characters written as numeric character references (&#xXXXX;), the parser replaces these references with literal characters. For some characters this is fine, but InvisibleTimes, for instance, is changed into a strange, incorrect character sequence. I'd like to know whether it is possible to prevent the parser from expanding character references. Or is there some recommendation on how to handle these characters?

Here is a fragment of my 'problematic' XML document:

<?xml version="1.0" encoding="UTF-8"?>
<mathDoc>

<p>Factorise the following quadratic expression:
       <math>
         <mrow>
           <msup>
             <mrow>
               <mi>x</mi>
             </mrow>
             <mrow>
               <mn>2</mn>
             </mrow>
           </msup>
           <mo>&#x002b;</mo> <!-- replaced with the character + -->
           <mi>p</mi>
           <mo>&#x2062;</mo> <!-- this is InvisibleTimes -->
           <mi>x</mi>
           <mo>&#x002b;</mo> <!-- replaced with the character + -->
           <mi>q</mi>
         </mrow>
       </math>
</p>

</mathDoc>
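
To make the behaviour easier to reproduce, here is a minimal Java sketch. It assumes only the standard JAXP API (which Xerces implements) rather than any Xerces-specific class, and the class and method names (CharRefDemo, rootText, codePoints, toAscii) are made up for this illustration. It parses a one-element document, prints the code points the parser delivers for the element text, and then writes the document back out with US-ASCII as the output encoding, which is one possible way to get non-ASCII characters written as character references again.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are illustrative, not part of Xerces.
class CharRefDemo {

    // Parse an XML string with whatever JAXP parser is configured.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Text content of the root element, as delivered by the parser.
    static String rootText(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTextContent();
    }

    // Render a string as a space-separated list of code points (U+XXXX).
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Serialize with US-ASCII as the output encoding, so the serializer
    // has to escape every character that ASCII cannot represent.
    static String toAscii(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<mo>&#x2062;</mo>";
        // The reference arrives as one character, U+2062 (InvisibleTimes).
        System.out.println(codePoints(rootText(xml)));
        // Written back as ASCII, the character should appear escaped.
        System.out.println(toAscii(parse(xml)));
    }
}
```

If the first line printed is U+2062, the parser has delivered the right character, and the strange sequence probably comes from how the text is displayed or written out afterwards (for example, UTF-8 bytes viewed in a different encoding) rather than from the parsing itself.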

Thanks so much

Jakub
