Hi everybody! I have some trouble with SAX and encodings...
When I try to parse the following XML:

<?xml version="1.0" encoding="WINDOWS-1252" ?>
</TRANSACTION>
<TRANSACTION TIME="03.04.2003 01:52:15" TIME_CODED="37714.0779513889" DURATION="1001">
<QUESTION>K'R®</QUESTION>
^^^^^^^^^^^^^^
...

I get this error message:

<SNIPP>
    self._err_handler.fatalError(exc)
  File "C:\Python24\Lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: xml_temp.xml:3766:13: not well-formed (invalid token)
...

Here you can find the Python code I use:
http://knopaste.de/index.php?module=hilight&id=142

Maybe the encoding of the content between the XML elements does not match the encoding specified in the declaration.

As I have to parse quite a lot of log files (~1GB zipped), and there are only a handful of such errors, I would be very happy if I could find a way to tell SAX not to worry and to write the string anyway.

Parsing the XML with the MS-XML-DOM or a Java-based parser is not a problem, but I would prefer a solution in Python.

Thanks,
Daniel
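
One possible workaround, as a rough sketch only: as far as I know, the SAX parser has no switch for ignoring well-formedness errors, so the bytes would have to be cleaned before they reach it. The code below assumes the offending input is either a byte that is undefined in Windows-1252 or a control character that XML 1.0 forbids; that is a guess, since the failing bytes are not visible in the excerpt above. It is written for current Python 3 rather than the Python 2.4 in the traceback, and the names sanitize_xml_bytes, TextCollector and parse_tolerantly are made up for illustration.

```python
import re
import xml.sax

# Characters that XML 1.0 does not allow in a document: control
# characters other than tab, LF and CR, plus U+FFFE and U+FFFF.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Matches the encoding pseudo-attribute of the XML declaration
# (assumes the form encoding="..." with no spaces around '=').
_ENCODING_DECL = re.compile(r'(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def sanitize_xml_bytes(raw, declared_encoding="cp1252"):
    """Return UTF-8 bytes that are safe to hand to the parser.

    Bytes that cannot be decoded become U+FFFD; characters that are
    illegal in XML 1.0 become '?'. Both substitutions lose data.
    """
    text = raw.decode(declared_encoding, errors="replace")
    text = _INVALID_XML_CHARS.sub("?", text)
    # The bytes are UTF-8 from here on, so the declaration must say so.
    text = _ENCODING_DECL.sub(r'\1"UTF-8"', text, count=1)
    return text.encode("utf-8")


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that just collects character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_tolerantly(raw, handler):
    """Sanitize the raw bytes, then run the normal SAX parse."""
    xml.sax.parseString(sanitize_xml_bytes(raw), handler)
```

Usage would be along the lines of parse_tolerantly(open("xml_temp.xml", "rb").read(), handler). This reads each file fully into memory, which may or may not be acceptable for logs of this size. It also cannot help if the markup itself is broken, for example by an unescaped '<' or '&' inside the text.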