DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5602>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5602

Parsing gets very slow with lots of entity references

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
            Product|Xerces-J                    |Xerces2-J
         Resolution|REMIND                      |
            Version|1.4.4                       |2.0.0 [beta 4]

------- Additional Comments From [EMAIL PROTECTED]  2002-01-04 08:30 -------
It looks like the performance improvement in Xerces 2.0.0B4 might actually be
a bug.

Most of the time in Xerces 1.4.4 (or in Xerces 2.0.0B3) is spent in appending
chunks of text. With the large number of entity references, those parsers (in
the non-deferred DOM case) receive a chunk of text between entity references,
then a predefined entity reference, then a chunk of text between references,
etc. Each of those chunks is appended onto a single TEXT_NODE.

In Xerces 2.0.0B4 the AbstractDOMParser is creating an ENTITY_REFERENCE_NODE
for each reference to a predefined entity and TEXT_NODEs for the chunks of
text in between. This eliminates all of the overhead of appending. However,
that behaviour is not permitted by the DOM API Recommendation[1].

As soon as we fix that problem, we'll be able to reproduce the performance
problem in Xerces-2, so I'll update the product and version, and reopen the
bug. I'm not sure whether we'll be able to do anything about it, though.

I was also able to reproduce the DOMException with j_caesar.xml that you
mentioned; I'll open a separate bug for that.

[1]: http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/introduction.html#ID-E7C30824

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
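
For illustration only, the appending pattern described in the comment above can be sketched in Java against the standard org.w3c.dom API. This is not the Xerces parser code: the class name TextAppendSketch and its methods are hypothetical, and the chunks array merely stands in for the alternating "text, expanded predefined entity, text" pieces a parser would receive. The first method appends every chunk onto one Text node. The second collects the chunks in a StringBuilder and creates the Text node once, which is one possible way to avoid repeated appends while still ending up with a single TEXT_NODE. Whether repeated appendData calls are actually expensive depends on how the DOM implementation stores character data, so no timing is claimed here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// Hypothetical sketch; names are illustrative and not taken from Xerces.
class TextAppendSketch {

    // Creates an empty DOM document using whatever JAXP implementation is installed.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM implementation", e);
        }
    }

    // Appends each chunk onto a single Text node, one appendData call per chunk.
    // If the implementation copies the node's whole value on every append,
    // the total work grows roughly quadratically with the number of chunks.
    static Element appendEachChunk(Document doc, String[] chunks) {
        Element element = doc.createElement("p");
        Text text = doc.createTextNode("");
        element.appendChild(text);
        for (String chunk : chunks) {
            text.appendData(chunk);
        }
        return element;
    }

    // Buffers all chunks first, then creates the single Text node once.
    static Element bufferThenCreate(Document doc, String[] chunks) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
        }
        Element element = doc.createElement("p");
        element.appendChild(doc.createTextNode(buffer.toString()));
        return element;
    }

    public static void main(String[] args) {
        // Text between references, with "&" and "<" standing in for expanded &amp; and &lt;.
        String[] chunks = {"Tom ", "&", " Jerry ", "<", " 3"};
        Document doc = newDocument();
        System.out.println(appendEachChunk(doc, chunks).getTextContent());
        System.out.println(bufferThenCreate(doc, chunks).getTextContent());
    }
}
```

Both methods leave the element with exactly one TEXT_NODE child and no ENTITY_REFERENCE_NODEs for the predefined entities, which matches the tree shape the comment says the older parsers produce.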
