Re: Ignore invalid bytes

Andy Clark 26 Jul 2004 22:20:28 -0000

Harald Wehr wrote:

Is it possible to tell xerces just to ignore these bytes and to go on parsing the document?


You really shouldn't ignore this type of error. And even though
Xerces has a continue-after-fatal-error setting, you are likely
to get caught in an infinite loop if you use it in this situation.

There is no need to display these documents 100% correctly. For this project, a missing character is acceptable; having the whole document fail with this exception is not.


Depending on the primary data in your document, a cheap trick is
to use a Reader object with the input encoding set to ISO Latin 1
(ISO-8859-1). That encoding maps every possible byte value to a
character, so no byte sequence is invalid. Of course, you should
realize that every UTF-8 character above code point 127 (that is,
every multi-byte sequence) will be corrupted using this trick.
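
A minimal sketch of the Reader trick, assuming the standard JAXP API
(which Xerces implements). The class name Latin1Parse and the helper
parseAsLatin1 are made up for illustration. Because the InputSource
is given a character stream, the parser does not decode the bytes
itself and disregards the encoding named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class Latin1Parse {
    // Hypothetical helper: decode the raw bytes as ISO-8859-1 ourselves and
    // hand the parser characters, so it never sees an invalid byte sequence.
    static Document parseAsLatin1(InputStream in) throws Exception {
        Reader reader = new InputStreamReader(in, StandardCharsets.ISO_8859_1);
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(reader));
    }

    public static void main(String[] args) throws Exception {
        // Build a document that claims UTF-8 but contains a stray 0xE9 byte
        // (invalid as UTF-8 here) followed by a valid UTF-8 e-acute (C3 A9).
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        raw.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc>a"
            .getBytes(StandardCharsets.US_ASCII));
        raw.write(0xE9);
        raw.write('b');
        raw.write(0xC3);
        raw.write(0xA9);
        raw.write("</doc>".getBytes(StandardCharsets.US_ASCII));

        Document doc = parseAsLatin1(new ByteArrayInputStream(raw.toByteArray()));
        String text = doc.getDocumentElement().getTextContent();

        // Print code points rather than glyphs to avoid console-encoding noise.
        // The stray byte survives as U+00E9, but the valid UTF-8 e-acute is
        // split into two characters, U+00C3 and U+00A9.
        for (char c : text.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The parse succeeds, but the correctly encoded character comes out as
two wrong ones, which is the corruption described above. Note that
this only removes decoding errors: bytes that map to characters XML
itself forbids (most control characters below 0x20) would still be
rejected by the parser.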

--
Andy Clark * [EMAIL PROTECTED]

