Hi,

James Sulak wrote:
> I'm attempting to use xml.sax.utils.XMLFilterBase and XMLGenerator to
> take an input XML document, filter out certain elements, and output
> the result to a second XML file. I have it mostly working, except
> that I lose the DTD declaration and anything (processing instructions
> or comments) before the root element. I believe I'm supposed to be
> using a LexicalHandler to get the information from the DTD, but I have
> not been able to figure out how to do this, or how to integrate it
> with the rest of the code.
>
> I'm pretty new at using Python (and SAX, for that matter) to work with
> XML

Try lxml's iterparse() instead of SAX. It builds an in-memory tree (including the DTD or a reference to it if you want; see the parser docs), but you can remove the unwanted elements from the tree while it parses. It's still pretty memory-friendly and definitely a lot easier to work with than SAX.

http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
http://codespeak.net/lxml/tutorial.html#parsing-from-strings-and-files

Stefan
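
For illustration, here is a minimal sketch of the iterparse() approach described above. It is only one way to do it, under a few assumptions: the function name filter_elements, the unwanted_tags parameter, the sample document and the "doc.dtd" reference are all made up for this example, and the DTD is only referenced, not loaded. One thing to watch for: in lxml, the text that follows an element (its tail) belongs to that element, so it can disappear together with the element when you remove it.

```python
import io
from lxml import etree


def filter_elements(source, unwanted_tags):
    """Parse `source` incrementally and drop elements whose tag is in
    `unwanted_tags` (a hypothetical parameter for this sketch).

    `source` is a filename or a binary file-like object.
    Returns the resulting ElementTree.
    """
    context = etree.iterparse(source, events=("end",))
    for _event, elem in context:
        if elem.tag in unwanted_tags:
            parent = elem.getparent()
            if parent is not None:  # never remove the root element itself
                parent.remove(elem)
    # After parsing, the root is available on the iterator; the tree it
    # belongs to also carries the DOCTYPE and anything before the root.
    return context.root.getroottree()


# Made-up input: a DOCTYPE reference, a comment and a processing
# instruction all appear before the root element.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<!-- leading comment -->
<?app instruction?>
<doc><keep>a</keep><drop>b</drop><keep>c</keep></doc>
"""

tree = filter_elements(io.BytesIO(SAMPLE), {"drop"})
result = etree.tostring(tree, xml_declaration=True, encoding="UTF-8")
print(result.decode("utf-8"))
```

To write the result to a second file instead of a string, something like tree.write("output.xml", xml_declaration=True, encoding="UTF-8") should do, where "output.xml" is again just a placeholder name.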