I have a strange bug with an XML file. I am trying to parse it with the following script:

import cElementTree as ElementTree
##from elementtree import ElementTree

inputFile = "c:\Test0\\bug.xml"
tree = ElementTree.ElementTree()
tree.parse(inputFile)
root = tree.getroot()
iter = root.getiterator()
for element in iter:
    print element.tag

The file is here:

<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<DIV class="paragraph" style=" padding:14.88pt 36.00pt 0.00pt 69.84pt; text-align:justify;">
<SPAN class="font8" style=" line-height:12.00pt;">It is should be adapted on the fly and should be
<BR />
<SPAN style=" letter-spacing:0.50pt;">affected as minimal as possible during
<BR /></SPAN>adaptation. Adaptation approaches are different
<BR />in the adaptation granularity (procedure, module,
<BR />
<SPAN style=" letter-spacing:-0.45pt;">(simplicity,</SPAN>
<SPAN style=" letter-spacing:-0.45pt;">duration,</SPAN>
<SPAN class="font1"> </SPAN>
<SPAN style=" letter-spacing:-0.45pt;">automation,</SPAN></SPAN>
</DIV>

I get the following error:

Traceback (most recent call last):
  File "C:\Python23\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\Test0\FineReader\FixXml.py", line 7, in ?
    text += gettext(e)
  File "<string>", line 24, in parse
SyntaxError: not well-formed (invalid token): line 17, column 22

I can't understand why! I think my XML file is wrong (when I remove the corresponding line, everything is OK). Can anybody tell me where the mistake is?

Thank you,
Bruno Lienard

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig