Hi folks,

elementtree is barfing (well, to be correct, expat is barfing) on some unicode strings I'm passing through to it ...
e.g.:

  self = <ElementTree.XMLTreeBuilder instance>
  self._parser = <pyexpat.xmlparser object>
  self._parser.Parse = <built-in method Parse of pyexpat.xmlparser object>
  data = u'<DIF><Entry_ID>badc.nerc.ac.uk:DIF:NM_HiGEM_yaao...on_Date>2005-02-03</Last_DIF_Revision_Date></DIF>'

  ExpatError: not well-formed (invalid token): line 1, column 11389
    args = ('not well-formed (invalid token): line 1, column 11389',)
    code = 4
    lineno = 1
    offset = 11389

For the record, we find [3 <= tau] in that block ... we also have problems with degree symbols and whatever ...

I suspect the problem is that I'm not actually passing an XML document (with a character encoding declaration) to ET ... I'm just passing an XML fragment (from a web service interface to a database). Do elementtree and/or expat need to know the encoding to get this right? (That may be a problem, coz this could be from anyone's document in any encoding ...)

(Sorry, I'm a bit unicode illiterate, and while I appreciate it's something I should know, there is other stuff filling my mind at the moment ...)

Bryan

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
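P.S. To make the question concrete, here is a minimal sketch of the workaround I'm guessing at. It is not a confirmed fix. It assumes the fragment has already been decoded to a text (unicode) string correctly, that it has a single root element and no XML declaration of its own, and it uses the standard-library xml.etree.ElementTree rather than the standalone elementtree package. The helper name parse_fragment and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment):
    """Parse an XML fragment held as a text (unicode) string.

    Hypothetical helper: assumes `fragment` has one root element and
    carries no XML declaration of its own.
    """
    # Encode explicitly as UTF-8 and say so in an XML declaration, so the
    # parser is told the encoding instead of having to assume one.
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>'
    return ET.fromstring(declaration + fragment.encode("utf-8"))


# Made-up sample containing the kinds of characters that cause trouble:
# a less-than-or-equal sign, a Greek tau, and a degree symbol.
sample = u"<DIF><Summary>3 \u2264 \u03c4, 25\u00b0C</Summary></DIF>"
root = parse_fragment(sample)
summary = root.find("Summary").text
```

If the text coming back from the web service is really bytes in some unknown encoding, this would not help on its own: the bytes would have to be decoded with the right codec first, which is exactly the part I don't know how to get right.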