DO NOT REPLY [Bug 5602] - Parsing gets very slow with lots of entity references

bugzilla Fri, 04 Jan 2002 08:27:51 -0800


http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5602

Parsing gets very slow with lots of entity references

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
            Product|Xerces-J                    |Xerces2-J
         Resolution|REMIND                      |
            Version|1.4.4                       |2.0.0 [beta 4]



------- Additional Comments From [EMAIL PROTECTED]  2002-01-04 08:30 -------
It looks like the performance improvement in Xerces 2.0.0B4 might actually be a 
bug.  Xerces 1.4.4 (and Xerces 2.0.0B3) spend most of their time appending 
chunks of text.  With a large number of entity references, those parsers (in 
the non-deferred DOM case) receive a chunk of text, then a predefined entity 
reference, then another chunk of text, and so on.  Each of those chunks is 
appended onto a single TEXT_NODE.
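
To make the appending pattern concrete, here is a minimal sketch using only 
the standard org.w3c.dom API.  It is not the Xerces code path: the class and 
method names are hypothetical, and the JAXP implementation bundled with the 
JDK stands in for the parser.  One method appends each chunk onto a single 
Text node as it arrives; the other collects the chunks in a buffer and 
creates the node once.  Both yield the same text; how expensive each append 
is depends on the DOM implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

// Hypothetical illustration class; not part of Xerces.
public class ChunkAppend {

    // Creates an empty DOM document with whatever JAXP implementation is available.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends every chunk directly onto one Text node, one call per chunk.
    static String appendPerChunk(String[] chunks) {
        Text text = newDocument().createTextNode("");
        for (String chunk : chunks) {
            text.appendData(chunk);
        }
        return text.getData();
    }

    // Collects the chunks in a StringBuilder and creates the Text node once.
    static String appendBuffered(String[] chunks) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
        }
        return newDocument().createTextNode(buffer.toString()).getData();
    }

    public static void main(String[] args) {
        // Text chunks interleaved with already-expanded predefined entities.
        String[] chunks = {"Tom ", "&", " Jerry ", "<", " Spike"};
        System.out.println(appendPerChunk(chunks));
        System.out.println(appendBuffered(chunks));
    }
}
```

If each appendData call copies the text accumulated so far, the total work 
grows roughly with the square of the number of chunks, which would match the 
slowdown described above.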

In Xerces 2.0.0B4, AbstractDOMParser creates an ENTITY_REFERENCE_NODE for each 
reference to a predefined entity and a TEXT_NODE for each chunk of text in 
between.  Nothing is appended, so that overhead disappears.  However, that 
behaviour is not permitted by the DOM API Recommendation[1].
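
A quick way to check a parser for this behaviour is to parse a small document 
full of predefined entity references and count the ENTITY_REFERENCE_NODE 
children.  The sketch below is an assumption-laden illustration: it uses the 
JAXP parser bundled with the JDK rather than Xerces 2.0.0B4, and the class 
and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of Xerces.
public class EntityRefCheck {

    // Parses the XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts the direct children of parent that have the given DOM node type.
    static int countChildrenOfType(Element parent, short nodeType) {
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Element root = parseRoot("<r>a&amp;b&lt;c&gt;d</r>");
        System.out.println("entity reference nodes: "
                + countChildrenOfType(root, Node.ENTITY_REFERENCE_NODE));
        System.out.println("text content: " + root.getTextContent());
    }
}
```

A parser that expands predefined entities into text should report zero entity 
reference nodes here; a non-zero count would indicate the behaviour described 
for 2.0.0B4.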

As soon as we fix that problem, we'll be able to reproduce the performance 
problem in Xerces-2, so I'll update the product and version, and reopen the 
bug.  I'm not sure whether we'll be able to do anything about it, though.

I was also able to reproduce the DOMException with j_caesar.xml that you 
mentioned; I'll open a separate bug for that.

[1]:  http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/introduction.html#ID-E7C30824

