http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4455

parser problem!

------- Additional Comments From [EMAIL PROTECTED] 2001-12-13 12:51 -------

>1. Entity substitution: according to my understanding of the xml spec,
> "&amp;" is not an entity reference.

The DOM WG had to check this. The official answer was that &amp; _is_ an entity reference, not a numeric character reference... but that since parsers are permitted to "flatten" (fully expand) entity references before presenting the document to the user, it's entirely reasonable for a parser to flatten this one even if it doesn't flatten user-defined entities.

>2. The string is split into 3 strings: xml parsers are free to group
>characters in chunks.

In SAX, this is definitely true. SAX may break text into multiple characters() calls for many reasons, and SAX applications have to be written so they can deal with that. The standard solution, if you need a single string, is to accumulate incoming characters() data until the first non-characters() call, and process the collected data at that time.

In the DOM, the answer is somewhat different. The DOM spec says that the initial state of a DOM as delivered by an "XML processor" (by which the XML spec means "parser") should be as if the normalize() operation had been called -- in other words, any adjacent non-CDATASection text should be coalesced into a single Text node.
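
The two behaviours described above can be sketched roughly as follows. This is only an illustration using Python's standard-library xml.sax and xml.dom.minidom, not the parser this bug was filed against; the sample document and the names TextCollector, collect_sax_text and dom_text_nodes are made up for the example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample input: one text run containing an entity reference.
DOC = b"<doc>fish &amp; chips</doc>"


class TextCollector(xml.sax.ContentHandler):
    """Buffers characters() chunks and flushes at the next non-text event."""

    def __init__(self):
        super().__init__()
        self._buffer = []  # chunks received since the last flush
        self.chunks = 0    # number of characters() calls the parser made
        self.texts = []    # one complete string per run of text

    def characters(self, content):
        # The parser may deliver a single run of text in several pieces.
        self.chunks += 1
        self._buffer.append(content)

    def _flush(self):
        # Join and process whatever has been collected so far.
        if self._buffer:
            self.texts.append("".join(self._buffer))
            self._buffer = []

    # Any non-characters() callback marks the end of the current text run.
    def startElement(self, name, attrs):
        self._flush()

    def endElement(self, name):
        self._flush()

    def processingInstruction(self, target, data):
        self._flush()

    def endDocument(self):
        self._flush()


def collect_sax_text(data):
    """Parse with SAX and return the handler holding the joined text runs."""
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return handler


def dom_text_nodes(data):
    """Parse into a DOM and return the data of the root's Text children."""
    doc = minidom.parseString(data)
    # normalize() merges adjacent Text nodes, in case the builder left any.
    doc.documentElement.normalize()
    return [n.data for n in doc.documentElement.childNodes
            if n.nodeType == n.TEXT_NODE]


if __name__ == "__main__":
    handler = collect_sax_text(DOC)
    print("characters() calls:", handler.chunks)
    print("joined SAX text:", handler.texts)
    print("DOM text nodes:", dom_text_nodes(DOC))
```

How many characters() calls are made depends on the parser, so the handler never relies on that count; it only looks at the joined string when a non-text event arrives.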
