Hello all,

Over at the PHP documentation project, we use libxml to parse and then process our documentation. [1]

Recently, some optimization work was done to make the loading and resolution of entities inside our XML documents faster. [2] The LIBXML_COMPACT flag was the primary change, and for some people it reduced the processing time of 24 MB worth of XML documents spread over thirteen thousand files to a mere five seconds.

However, the performance gains have not been uniform: other systems (with comparable or even better hardware specs) still take several minutes to parse and validate our document, with memory usage breaking into gigabytes (for comparison, the optimized run uses only 400 MB when it's working properly). These discrepancies don't appear to be tied to libxml version (2.6.26 is one of the versions used on a slow machine) or operating system (both Windows Vista and Ubuntu Linux have been shown to have this problem).

Any thoughts or ideas as to what may be the cause of these problems? Even if they're not "fixable", it would be nice to know why libxml is much faster on some systems than on others.

Thank you!

[1] You can view the XML parsing code here:
    http://cvs.php.net/viewcvs.cgi/phpdoc/configure.php?view=markup
    (scroll to the bottom of the page; the parts from "$dom = new DOMDocument();" onward are the most interesting.)

[2] Phpdoc is a giant DocBook manual split into files using XML entities. We use LIBXML_NOENT to expand the entities into XML. We also have a number of XIncludes used to do smart duplication of data.

-- 
Edward Z. Yang                        GnuPG: 0x869C48DA
HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
[[ 3FA8 E9A9 7385 B691 A6FC  B3CB A933 BE7D 869C 48DA ]]
