David Shi wrote:
> What I am trying to do is to have a generic script to turn xml to Python
> dataset. Then I can manipulate it as required. Then I can save
> processed data into a .dbf file.

I'd use iterparse() for the parsing, since that allows you to construct the
.dbf content on the fly.

http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk

Working with the data elements returned by the iterparse iterator is quite
easy. You'll be fine using the .tag and .text properties, as well as the
.find() method to find subelements.

http://codespeak.net/lxml/tutorial.html#the-element-class

If you can afford to load the entire XML tree into memory, you can also try
lxml.objectify, which will give you a Python-like interface to the data.

http://codespeak.net/lxml/objectify.html

Note that the lxml.objectify in-memory tree is most likely a lot more memory
friendly (and the parsing is definitely faster) than what the recipe gives
you.

Stefan
_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
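
For illustration, here is a minimal sketch of the iterparse() approach described above. The element names ("record", "name", "value"), the sample document, and the iter_rows() helper are all hypothetical; adapt them to the actual XML. The sketch falls back to the standard library's ElementTree, whose iterparse() is compatible for this usage, when lxml is not installed. Writing the rows to a .dbf file is left out, since that depends on whichever DBF library you pick.

```python
import io

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Standard-library fallback; iterparse() behaves the same for this usage.
    import xml.etree.ElementTree as etree

# Hypothetical sample input; the real element names depend on your XML.
SAMPLE = b"""<records>
  <record><name>alpha</name><value>1</value></record>
  <record><name>beta</name><value>2</value></record>
</records>"""


def iter_rows(source, row_tag, fields):
    """Yield one dict per <row_tag> element, reading the input incrementally."""
    for _event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag != row_tag:
            continue
        # .findtext() returns the text of the first matching child, or None.
        yield {name: elem.findtext(name) for name in fields}
        # Drop the element's content once its row has been handed out.
        elem.clear()


rows = list(iter_rows(io.BytesIO(SAMPLE), "record", ["name", "value"]))
print(rows)
# [{'name': 'alpha', 'value': '1'}, {'name': 'beta', 'value': '2'}]
```

Because iter_rows() is a generator, each row can be passed to a .dbf writer as soon as it is parsed, so the whole document never has to be held in memory at once.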