Anstey, Matthew wrote:

> Our question is this: when we finish porting our 300Mb "python" data
> into 3Gb of XML data, how can we continue to read it from disk in its
> xml format and manipulate it?
>
> We are looking at Berkeley XML with the Python API, but are concerned
> this is not the best solution. we have also dabbled with Amara and
> ElementTree, but the size of our XML is giving us problems.

if the Python version of the data fits in memory, you can use iterparse
and the "incremental decoding" approach outlined here:

    http://effbot.org/zone/element-iterparse.htm

to save the data, you can build subtrees (e.g. on a record level) and
write each tree out by itself.

    f = open("out.xml", "w")
    f.write("<data>")
    for record in data:
        tree = make_record_tree(record)
        tree.write(f)
    f.write("</data>")
    f.close()

</F>

_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
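
For reference, here is a minimal sketch of both halves of the suggestion above, assuming Python 3 and the standard library's xml.etree.ElementTree. The post does not show make_record_tree or the shape of a record, so the helper below and the "record" tag are hypothetical: each record is assumed to be a flat dict of string fields. On Python 3 the output file is assumed to be opened in binary mode, since ElementTree.write() emits bytes unless encoding="unicode" is given. The reading side follows the iterparse pattern of grabbing the root element first and clearing it after each record, so finished records do not pile up in memory.

```python
import xml.etree.ElementTree as ET


def make_record_tree(record):
    # Hypothetical helper (not shown in the post): assumes a record is a
    # flat dict whose keys are valid XML names and whose values are strings.
    elem = ET.Element("record")
    for key, value in record.items():
        ET.SubElement(elem, key).text = value
    return ET.ElementTree(elem)


def write_records(f, data):
    # f is assumed to be a binary file object, e.g. open("out.xml", "wb").
    f.write(b"<data>")
    for record in data:
        # Write each record subtree by itself, without an XML declaration.
        tree = make_record_tree(record)
        tree.write(f, encoding="utf-8", xml_declaration=False)
    f.write(b"</data>")


def iter_records(source, tag="record"):
    # Yield one record dict at a time from a file name or file object.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            root.clear()  # drop records that have already been processed
```

Usage would look something like this, with out.xml standing in for the real file:

    with open("out.xml", "wb") as f:
        write_records(f, data)
    for record in iter_records("out.xml"):
        ...
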