I am parsing an XML file with Python 2.6.5 minidom on Windows, and it is
mostly working, but minidom seems to have problems dealing with Windows
CR/LF characters. It creates an extra text node that needs to be ignored
instead of just returning the XML elements. I have tried different
methods of opening the file, but it doesn’t seem to make a difference. It
is happiest when reading a file in Unix format.

Wayne Peterson | Consultant
Sierra Systems

Wayne,

It sounds to me like you're doing everything correctly.

- XML files are text files, and should be read as text.

- In the absence of a DTD, all whitespace is regarded as significant. Typically this means that, yes, there will be a whitespace-only text node between consecutive element nodes.

- The XML processor is required to return each end-of-line as a single '\n', regardless of the OS or programming language.

If you are traversing every node, you'll need to explicitly ignore the text nodes. More usually you don't have to deal with them, because you know which nodes you're looking for and can pick them out with getElementsByTagName.
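
As a rough sketch of both approaches (the sample document and the names SAMPLE, elements and values are made up for illustration, not taken from your file), something like this should work with minidom:

```python
from xml.dom import minidom

# Hypothetical sample document; the CR/LF line endings mimic a Windows file.
SAMPLE = "<root>\r\n  <item>one</item>\r\n  <item>two</item>\r\n</root>"

doc = minidom.parseString(SAMPLE)
root = doc.documentElement

# Approach 1: walk every child, but keep only the element nodes.
# The whitespace between elements shows up as text nodes, which are skipped here.
elements = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# Approach 2: ask for the elements by name, so the text nodes never appear.
items = doc.getElementsByTagName("item")
values = [item.firstChild.data for item in items]
```

Here root.childNodes contains the whitespace text nodes as well as the two item elements, while elements and items contain only the item elements.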


_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
