Hello there,

I ran into a minor problem using the xml.dom.minidom XML parser: an XML document with a comment before the DOCTYPE node seems to leave the DOM data structures in an inconsistent state.

Let's say I have a little test.xml file:

    <?xml version="1.0"?>
    <!-- comment -->
    <!DOCTYPE test SYSTEM "test.dtd">
    <test>
      <tag2> Hello world </tag2>
    </test>

and a little Python program to parse it:

    from xml.dom.minidom import parse

    dom = parse("test.xml")
    print "document node:", dom
    print len(dom.childNodes), "children"
    print "first child:", dom.firstChild
    print "next sibling:", dom.firstChild.nextSibling

The output of that program is:

    document node: <xml.dom.minidom.Document instance at 0xb7b82b6c>
    3 children
    first child: <DOM Comment node " comment ">
    next sibling: None

I.e. the document node does have three children (a comment node, a DocumentType instance and an element), but the first child's nextSibling pointer isn't set correctly. This breaks my algorithm, which is supposed to walk the entire DOM tree recursively but stops after the first node instead.

I'm not entirely sure whether this really is a bug in pyexpat or an error in my XML file. I haven't found any indication of whether an XML document is allowed to have a comment before the DOCTYPE declaration. xmllint doesn't seem to complain about it, though.

Cheers,
Ingo

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
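
P.S. A tree walk that iterates over each node's childNodes list, instead of following nextSibling, should not depend on the sibling pointers at all. The following is only a minimal sketch of that idea, not a fix for the underlying problem. It assumes Python 3 (it uses "yield from", unlike the Python 2 script above) and embeds the test document as a string so it runs without test.xml; the names XML and walk are made up for illustration.

```python
from xml.dom.minidom import parseString

# The same document as test.xml, embedded so the sketch is self-contained.
XML = (
    '<?xml version="1.0"?>\n'
    '<!-- comment -->\n'
    '<!DOCTYPE test SYSTEM "test.dtd">\n'
    '<test> <tag2> Hello world </tag2> </test>\n'
)


def walk(node, depth=0):
    """Yield (depth, node) for node and all of its descendants.

    Only childNodes is consulted, so a missing nextSibling link
    cannot cut the traversal short.
    """
    yield depth, node
    for child in node.childNodes:
        yield from walk(child, depth + 1)


dom = parseString(XML)
for depth, node in walk(dom):
    # Indent each node name by its depth in the tree.
    print("  " * depth + node.nodeName)
```

Since the document node reports three children in childNodes, a walk written this way should reach the DocumentType node and the root element even when firstChild.nextSibling is None.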