Thanks for the reply. This does help.


----- Original Message -----
From: "Christopher Ebert" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, November 18, 2002 6:05 PM
Subject: RE: Parsing XML Containing Euro Sign


>
> Hi Mike,
>
> I can't tell exactly what the problem is, but I've dealt some with
> character encoding issues...
>
> I ran into an 'unsupported encoding' problem because the byte-to-character
> converter classes changed name. I built Xerces and printed out the exception
> being thrown (a debugger that breaks on exceptions would have been nice :)
> and found that the name for ISO-8859-1 had changed from something like
> ISO8859_1 to something like ISO_8859_1: the converter I had used one name
> and Xerces was looking for the other. I copied the one with the other name
> out of a recent Sun distribution and everything works fine now.
>
> If you can't get the automatic mechanism to work, you can always try
> opening your I/O streams with specific character encodings -- bear in mind
> that the encoding name might be a little different from the ISO name.
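
A minimal Java sketch of the suggestion above, assuming a JAXP parser is on the classpath (the one bundled with current JDKs, not Xerces 1.4.2 specifically). Wrapping the byte stream in an InputStreamReader with an explicit charset name hands the parser already-decoded characters, so it does not have to detect the encoding itself. The class and method names are hypothetical, and the charset name has to be one the running JVM recognizes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ExplicitEncodingParse {
    // Hypothetical helper: decode the bytes with the given charset name,
    // parse the result, and return the text of the root element.
    static String rootText(byte[] xmlBytes, String charsetName) throws Exception {
        InputStream in = new ByteArrayInputStream(xmlBytes);
        // The Reader does the byte-to-character conversion; when an InputSource
        // carries a character stream, the parser reads it directly.
        Reader reader = new InputStreamReader(in, charsetName);
        InputSource source = new InputSource(reader);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document with a Euro sign (U+20AC), stored as UTF-8 bytes.
        byte[] xml = "<?xml version=\"1.0\"?><price>\u20AC 10</price>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(rootText(xml, "UTF-8").equals("\u20AC 10"));
    }
}
```

For a file on disk, the same idea applies with a FileInputStream in place of the ByteArrayInputStream.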
>
> You might check the character encoding the file really uses. I've gotten
> files that do not use the encoding they say they do. There was a thread on
> the list a while ago about files from Japan that use a Windows encoding
> 'differently' :).
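
One way to check what a file really contains is to compare its raw bytes against what the Euro sign looks like in each candidate encoding. The sketch below is an illustration only, with hypothetical class and method names. It prints the bytes for U+20AC in three charsets that every Java platform is required to support.

```java
import java.nio.charset.Charset;

class EuroBytes {
    // Hypothetical helper: encode the text in the named charset and
    // return the bytes as space-separated uppercase hex.
    static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        System.out.println(hex(euro, "UTF-8"));      // E2 82 AC
        System.out.println(hex(euro, "UTF-16BE"));   // 20 AC
        // ISO-8859-1 has no Euro sign, so the encoder substitutes '?' (0x3F).
        System.out.println(hex(euro, "ISO-8859-1")); // 3F
    }
}
```

If a hex dump of the file shows something other than the sequence expected for the declared encoding, the declaration and the actual encoding disagree.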
>
>
> Cheers,
> Chris
>
>
> -----Original Message-----
> From: Mike Lepine [mailto:[EMAIL PROTECTED]
> Sent: Monday, November 18, 2002 8:38 AM
> To: [EMAIL PROTECTED]
> Subject: Parsing XML Containing Euro Sign
>
>
> I searched the Xerces FAQ and tried to review the mailing list archives, but
> they appeared to be offline, and I was not able to find much information on
> my question. So, if this has been asked/answered before, I apologize for
> reposting.
>
> I am using Xerces version 1.4.2 to parse an XML document containing a Euro
> sign character. I create a FileInputStream and pass it to the parser:
>
>             // create input stream from XML file
>             FileInputStream inputStream = new FileInputStream(new File(fileName));
>
>             // parse XML
>             parser.parse(new InputSource(inputStream));
>
> When the XML containing the Euro sign is parsed, the Euro sign is misread
> and converted to a different character (in this case a question mark). I
> tried to change the document encoding to UTF-16 instead of UTF-8 but this
> generated an exception stating that UTF-16 was not supported.
>
> In order to write the XML file (containing the Euro sign), I have to make
> sure the data is written out as characters instead of bytes because when
> the Euro sign is converted to a byte, it looks like the high-order byte is
> discarded, resulting in the wrong character being written out.
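
The difference described above can be sketched in Java as follows. This is an illustration under assumptions, not the original poster's code: the class and method names are hypothetical, and UTF-8 is picked as the output encoding only for the example. Writing each char through OutputStream.write(int) keeps only the low 8 bits, while a Writer with an explicit charset encodes the whole character.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class EuroWrite {
    // Writes the text as characters, letting the Writer encode it as UTF-8.
    static byte[] writeAsChars(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    // Writes each char as a single byte. OutputStream.write(int) ignores
    // the upper bits, so U+20AC loses its high-order byte and becomes 0xAC.
    static byte[] writeTruncated(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            out.write(text.charAt(i));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String euro = "\u20AC";
        System.out.println(writeAsChars(euro).length);   // 3
        System.out.println(writeTruncated(euro).length); // 1
    }
}
```

Whatever encoding is used for writing also has to match the encoding the XML declaration names, or the parser will decode the bytes incorrectly on the way back in.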
>
> Finally, my question is whether I can use Xerces to parse an XML document
> containing the Euro sign and if so, how do I do it?
>
> I appreciate any help offered.
>
> Thanks.
>
> - Mike
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>


