Hello,

I'm a newbie to XML and just wrote a program that stores my scientific
data objects as an XML file and restores them later (like marshaling).
However, I found it extremely slow... I changed the implementation from
minidom to sax, which speeds things up somewhat (30% or so) for small
files, but not enough. If I go back to using binary data, it is roughly
5 times faster or more. Are there widely used ways to speed up parsing?
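
For reference, the SAX version I have now is shaped roughly like the
sketch below (the element layout and the DataObject/ObjectHandler names
are simplified stand-ins, not my actual code):

import xml.sax

class DataObject:
    """Simplified stand-in for my real data objects."""
    def __init__(self, name, values):
        self.name = name
        self.values = values

class ObjectHandler(xml.sax.ContentHandler):
    """Rebuilds DataObjects from <object name="..."><value>...</value>...</object>."""
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.objects = []          # every restored object ends up in memory here
        self._name = None
        self._values = []
        self._text = []

    def startElement(self, tag, attrs):
        if tag == 'object':
            self._name = attrs.get('name')
            self._values = []
        elif tag == 'value':
            self._text = []

    def characters(self, content):
        self._text.append(content)

    def endElement(self, tag):
        if tag == 'value':
            self._values.append(float(''.join(self._text)))
        elif tag == 'object':
            self.objects.append(DataObject(self._name, self._values))

handler = ObjectHandler()
xml.sax.parse('data0.xml', handler)
print(len(handler.objects))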


Another problem is the memory footprint. My XML data file can be large:
tens of megabytes with hundreds of thousands of objects. If I use
xml.sax.parseString(), it parses the whole string into in-memory objects,
which inflates memory use. I only need to loop over the objects in the
XML file once. Are there common ways to do a delayed read? I'm looking
for something like:


xml.sax.parse('data0.xml', myContentHandler)
objects = myContentHandler.getObjects()   # returns an iterator
for obj in objects:    # reading occurs here (delayed reading)
    pass               # do something with obj...

But I haven't found anything like that. I'm not sure this is even
possible with the current architecture of the parsers. Any advice is
highly appreciated.
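
The closest thing I could come up with on my own is to wrap the parser's
incremental feed() interface in a generator, roughly like the sketch below
(iter_objects, handler_factory and the `objects` attribute are names I made
up; it assumes the handler appends each finished object to that list, like
the ObjectHandler sketch above). Would that be a reasonable approach, or is
there a better-supported way to get this kind of iteration?

import xml.sax

def iter_objects(filename, handler_factory, chunk_size=64 * 1024):
    """Yield finished objects lazily while feeding the file to a SAX parser.

    handler_factory must build a ContentHandler that appends completed
    objects to a list attribute named `objects` (my own convention here).
    """
    handler = handler_factory()
    parser = xml.sax.make_parser()       # expat reader; supports incremental feed()
    parser.setContentHandler(handler)
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parser.feed(chunk)           # parse only this chunk
            while handler.objects:       # hand over whatever got completed so far
                yield handler.objects.pop(0)
    parser.close()                       # flush the parser / detect truncated files
    while handler.objects:
        yield handler.objects.pop(0)

# usage (ObjectHandler as in the sketch above, or any handler with an
# `objects` list):
#
#     for obj in iter_objects('data0.xml', ObjectHandler):
#         ...   # objects are built a chunk at a time, not all at once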

Thanks,
Ping

