On Wednesday 12 January 2005 20:12, Fredrik Lundh wrote:
> several people have asked for libxml2 figures, since libxml2 is known as
> the fastest parser under the sun (with the possible exception of RXP, which
> is known as quite possibly the fastest parser anywhere).
>
> here's an updated table:

[...]

> libxml2                16000k      0.098s
> cElementTree 0.8        5700k      0.058s

cElementTree looks really impressive, but having run various tests comparing
libxml2 and cElementTree with some of the larger test documents in the libxml2
distribution, libxml2 still seems faster. I've used GNU time to report the
elapsed, system and user times, and I've also measured the elapsed time from
within Python, but I couldn't get the memory usage. How should one go about
getting these figures under Linux? Should I turn process accounting on or
something like that?

One thing that may explain the discrepancy between the above results and the
ones I've been getting is the unfortunate need to explicitly free each libxml2
document after finishing with it. I found that otherwise libxml2 does indeed
get slower after loading a few documents, and I'd imagine that the growing
memory requirements start to affect my resource-challenged laptop as a result.

Of course, this depends on how one does the tests, but in order to diminish
the effect of start-up times and to time a single process loading many
documents, I looped over a number of files, parsing each one, and repeated
this entire process many times.

Still, cElementTree looks like a very promising addition to the range of
Python XML tools, especially given the uncomplicated installation process
(compared to some of the other top performers, notably libxml2 and cDomlette).

Paul

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
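
A minimal sketch of the kind of test loop described in the message above, not
the script actually used for those measurements. The names time_parses and
peak_rss_kb are hypothetical. The standard library's xml.etree.ElementTree
stands in for the parser so that the sketch runs anywhere. For libxml2 one
would presumably pass libxml2.parseFile as the parser and a function that
calls freeDoc() on the document as the release hook. The memory figure comes
from resource.getrusage, which is one possible way to get at it on Linux: it
reports the peak resident set size of the whole process (in kilobytes there),
so it includes the interpreter itself.

```python
import resource
import time
import xml.etree.ElementTree as ET


def time_parses(paths, parse, release=None, rounds=10):
    """Parse every file in `paths`, `rounds` times over, in one process.

    `parse` takes a filename and returns a document object. `release`, if
    given, is called on each document once it is no longer needed, which is
    where an explicit free belongs for parsers that require one.
    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for _ in range(rounds):
        for path in paths:
            doc = parse(path)
            if release is not None:
                # For libxml2 this would be: lambda d: d.freeDoc()
                release(doc)
    return time.perf_counter() - start


def peak_rss_kb():
    """Peak resident set size of this process so far (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```

Calling time_parses(files, ET.parse, rounds=100) and then peak_rss_kb() gives
one elapsed time and one memory figure per run. Since the peak never goes
down within a process, each parser is best measured in a fresh process.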