DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=27844>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=27844

XML parsing skips text

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

------- Additional Comments From [EMAIL PROTECTED]  2004-03-22 15:18 -------
The JavaDoc of ContentHandler#characters states:

"The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information."

SAX parsers are free to split character data into as many chunks as they please, and they can split the text at whatever boundaries they want. This is allowed for reasons of parser efficiency and input buffering. To handle this properly, your handler needs to accumulate the text returned in each call until you receive a callback that isn't characters.

Xerces will split calls to characters at the end of an internal buffer, at a newline, and also at a few other boundaries. You can never rely on contiguous text being passed in a single characters callback.
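For illustration, the accumulation described above could look roughly like the sketch below. It is not taken from the bug report: the class name TextAccumulatingHandler and the helpers flush(), getTexts() and parse() are made up for this example, and it assumes that flushing at element boundaries is enough for the application. A fuller handler might also flush on other callbacks such as processingInstruction.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects each contiguous run of text as one string,
// no matter how many characters() calls the parser used to deliver it.
class TextAccumulatingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only append here; the chunk may be a fragment of the full text.
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush(); // a non-characters callback ends the current run of text
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    // Assumption of this sketch: empty runs are not recorded.
    private void flush() {
        if (buffer.length() > 0) {
            texts.add(buffer.toString());
            buffer.setLength(0);
        }
    }

    List<String> getTexts() {
        return texts;
    }

    // Convenience helper for the example: parse a string with the JAXP default parser.
    static List<String> parse(String xml) throws Exception {
        TextAccumulatingHandler handler = new TextAccumulatingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getTexts();
    }
}
```

With this approach, text containing a newline inside an element should arrive as one string in getTexts(), even if the parser reported it in several characters calls.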
