Hello,
I'm a newbie to XML. I just wrote a program that stores my scientific
data objects in an XML file and restores them later (similar to marshaling).
However, I found it to be extremely slow. I changed the implementation from
minidom to SAX, which speeds things up somewhat (30% or so) for small files,
but not enough: if I go back to binary data, it is roughly 5 times faster or more.
Are there widely used ways to speed up parsing?
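
To give an idea of the kind of thing I'm doing (this is not my real code,
just a stripped-down handler in the same spirit, with made-up element
names), the SAX restore path is basically:

    import xml.sax

    class ObjectHandler(xml.sax.ContentHandler):
        # Rebuild data objects from <object>/<value> elements; the real
        # format has more structure, but the idea is the same.
        def __init__(self):
            xml.sax.ContentHandler.__init__(self)
            self.objects = []     # all restored objects end up in this list
            self._values = None   # values of the <object> currently being built
            self._text = []       # character data of the current <value>

        def startElement(self, name, attrs):
            if name == 'object':
                self._values = []
            elif name == 'value':
                self._text = []

        def characters(self, content):
            self._text.append(content)

        def endElement(self, name):
            if name == 'value' and self._values is not None:
                self._values.append(float(''.join(self._text)))
            elif name == 'object':
                self.objects.append(self._values)
                self._values = None

    handler = ObjectHandler()
    xml.sax.parse('data0.xml', handler)
    # handler.objects now holds every object from the file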
Another problem is the memory footprint. My XML data files can be large:
tens of megabytes with around a hundred thousand objects. If I use
xml.sax.parseString(), the whole string gets parsed into in-memory objects
at once, which inflates the footprint, even though I only need to loop over
the objects in the XML file once. Are there common ways to do
a delayed read? I'm looking for something like:
    xml.sax.parseFile('data0.xml', myContentHandler)
    objects = myContentHandler.getObjects()   # returns an iterator
    for obj in objects:
        # reading occurs here (delayed reading)
        # do something with obj...
But I haven't found anything like this. I'm not sure it is even possible with the current architecture of the parsers. Any advice is highly appreciated.
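
For what it's worth, the closest thing I could put together myself is a
pull-style loop with xml.dom.pulldom (an untested sketch; 'object' is just
a placeholder for my real element name):

    from xml.dom import pulldom

    events = pulldom.parse('data0.xml')
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == 'object':
            # materialize a DOM subtree for just this one element
            events.expandNode(node)
            # ...convert the small subtree into one data object here,
            # then move on to the next <object>...

Would something like this actually keep the memory down for files of this
size, or is there a better-known way to do it?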
Thanks, Ping