DO NOT REPLY [Bug 27844] - XML parsing skips text

bugzilla Mon, 22 Mar 2004 07:17:58 -0800

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=27844>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.


http://issues.apache.org/bugzilla/show_bug.cgi?id=27844

XML parsing skips text

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID



------- Additional Comments From [EMAIL PROTECTED]  2004-03-22 15:18 -------
The JavaDoc of ContentHandler#characters states:

"The Parser will call this method to report each chunk of character data. SAX 
parsers may return all contiguous character data in a single chunk, or they may 
split it into several chunks; however, all of the characters in any single 
event must come from the same external entity so that the Locator provides 
useful information."

SAX parsers are free to split character data into as many chunks as 
they please, and they can split the text at whatever boundaries they want. This 
is allowed for reasons having to do with parser efficiency and input buffering. 
To handle this properly, your handler needs to accumulate the text 
returned in each call until you receive a callback that isn't characters.
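
As a rough illustration of that accumulation pattern, here is a minimal sketch 
of a handler. The class name TextAccumulatingHandler and its parse helper are 
made up for this example and are not part of SAX or Xerces. It also assumes 
that flushing at element boundaries is enough for your document; a real handler 
may need to flush on other callbacks as well.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects each contiguous run of text as one string,
// no matter how many characters() calls the parser used to deliver it.
public class TextAccumulatingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append, never overwrite: this may be one of several chunks.
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        flush(); // a non-characters callback ends the current run of text
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    // Store the accumulated text (if any) and reset the buffer.
    private void flush() {
        if (buffer.length() > 0) {
            texts.add(buffer.toString());
            buffer.setLength(0);
        }
    }

    public List<String> getTexts() {
        return texts;
    }

    // Convenience helper for the example: parse a string, return the text runs.
    public static List<String> parse(String xml) {
        TextAccumulatingHandler handler = new TextAccumulatingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return handler.getTexts();
    }
}
```

With a handler along these lines, text containing a line break is still 
collected as a single string, whether or not the parser delivers it in more 
than one characters call.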

Xerces will split calls to characters at the end of an internal buffer, at a 
newline, and also at a few other boundaries. You can never rely on contiguous 
text being passed in a single characters callback.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
