Hi Shekhar,

Setting the encoding on the input source allows the parser to skip encoding auto-detection. However, once it reads the encoding from the XML declaration, it will create a new character reader if the previous encoding (either auto-detected or supplied by the user) is different from the one specified in the document, so you don't gain anything by doing this.

You should try changing the value of the encoding in the actual document:

<?xml version="1.0" encoding="iso-8859-1"?>

If you absolutely cannot alter your input document, you can try setting your own character reader on the input source. This will force the parser to use your reader instead of creating its own. If you have an InputStream to the document, you can easily get a reader for ISO-8859-1 using an InputStreamReader.

At 11:08 AM 31/03/2003 +0530, you wrote:
> is still giving the same error. I just thought I should clarify
> that I have an XML document given to me by a client that I need to parse.
> The XML document has its encoding set to UTF-8. I need to parse this
> document, but the character with hex value B6 present in the XML document
> is not being accepted by the parser. I need to override the encoding set
> in the XML document through the code, but setting the InputSource encoding
> to ISO-8859-1 is not doing the trick.
>
> Thanks
> Shekhar
>
> ----- Original Message -----
> From: Ragunath Marudhachalam
> To: [EMAIL PROTECTED]
> Sent: Friday, March 28, 2003 7:48 PM
> Subject: RE: UTF-8 Encoding
>
> yes.
>
> OutputFormat format = new OutputFormat( Document ); // Serialize DOM
> format.setEncoding("ISO-8859-1");
>
> This will set the encoding to ISO-8859-1 instead of UTF-8. UTF-8 is the
> default encoding that is set when you create a document without specifying
> any encoding. If you are not using serialization, then try setting the
> encoding on the InputSource.
>
> Ragu
> CircuitVision
>
> -----Original Message-----
> From: Shekhar Karani [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2003 9:16 AM
> To: [EMAIL PROTECTED]
> Subject: Re: UTF-8 Encoding
>
> Doing that in my code will override the XML document encoding?
> Shekhar
>
> ----- Original Message -----
> From: Ragunath Marudhachalam
> To: [EMAIL PROTECTED]
> Sent: Friday, March 28, 2003 7:33 PM
> Subject: RE: UTF-8 Encoding
>
> set the encoding to "ISO-8859-1"
>
> Ragu
> CircuitVision
>
> -----Original Message-----
> From: Shekhar Karani [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2003 6:27 AM
> To: [EMAIL PROTECTED]
> Subject: UTF-8 Encoding
>
> Hi
>
> I am using Xerces 2.2.1 to parse XML documents. One of the XML
> documents has a hex character B6. This character is being treated as an
> invalid UTF-8 character by the parser. The parser gives the error
> "Invalid byte 1 of UTF-8 byte stream". However, the editor XML SPY
> version 5 accepts this character.
>
> Please let me know what I need to do in my code to accept this
> character.
>
> The archives on the mailing list are not accessible, hence I am not
> sure if this question is present there.
>
> Thanks
> Shekhar

-----------------------------
Michael Glavassevich
[EMAIL PROTECTED]
4B Computer Engineering
University of Waterloo
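
P.S. In case it helps, here is a minimal sketch of the character reader approach suggested above. The class name ForcedEncodingParse, the method parseAsLatin1, and the sample document are made up for illustration. It goes through the standard JAXP DocumentBuilder rather than any Xerces-specific class, and it assumes the bytes really are ISO-8859-1. It relies on the SAX InputSource contract: when a character stream is supplied, the parser reads it directly and disregards the encoding declaration in the document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ForcedEncodingParse {

    /**
     * Parses the stream as ISO-8859-1, whatever encoding the document declares.
     * Hypothetical helper, not part of Xerces or JAXP.
     */
    public static Document parseAsLatin1(InputStream in) throws Exception {
        // Decode the bytes ourselves so the parser receives characters, not bytes.
        Reader reader = new InputStreamReader(in, "ISO-8859-1");

        // An InputSource built from a Reader carries a character stream,
        // so the parser does not apply the declared encoding to it.
        InputSource source = new InputSource(reader);

        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the client's file: it declares UTF-8 but holds the
        // single raw byte 0xB6, which is not valid UTF-8 on its own.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc>\u00B6</doc>";
        byte[] bytes = xml.getBytes("ISO-8859-1");

        Document doc = parseAsLatin1(new ByteArrayInputStream(bytes));
        String text = doc.getDocumentElement().getTextContent();

        // Show the code point of the character that was read.
        System.out.println(Integer.toHexString(text.charAt(0))); // prints "b6"
    }
}
```

Note that ISO-8859-1 maps every byte to some character, so if the document is not really in that encoding you will get wrong characters silently instead of an error. Fixing the declaration in the document itself remains the better option when you can.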