A simple test case:

$ LANG=pl_PL.ISO-8859-2 python
Python 2.4 (#1, Dec 23 2004, 10:29:41)
[GCC 3.3.5 (PLD Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.marshal import generic
>>> generic.dumps("piątek")
'<?xml version="1.0"?><marshal><string>pi\xb1tek</string></marshal>'

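
For anyone who wants to check the bytes without xml.marshal installed, here is a minimal sketch that feeds the same output to expat directly. It assumes Python 3 rather than the 2.4 used above, and parse_text is a hypothetical helper written for this illustration, not part of any library. The two documents differ only in the encoding declaration.

```python
import xml.parsers.expat

# "piątek", written with an escape so this source file stays pure ASCII.
payload = "pi\u0105tek".encode("iso-8859-2")


def parse_text(document):
    """Hypothetical helper: return the character data expat finds in
    `document` (a bytes object), or None if expat rejects the input."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return None
    return "".join(chunks)


body = b"<marshal><string>" + payload + b"</string></marshal>"

# Same shape as the dumps() output above: no encoding declaration.
undeclared = b'<?xml version="1.0"?>' + body

# Identical bytes, but the encoding is declared.
declared = b'<?xml version="1.0" encoding="iso-8859-2"?>' + body

print(repr(payload))
print(parse_text(undeclared))
print(parse_text(declared))
```

Only the declared variant should come back as text; the undeclared one is read as UTF-8, where a lone 0xb1 byte is not a valid sequence.
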
"\xb1" is the ISO 8859-2 encoding of "ą". Still, the XML specification makes it clear that "In the absence of external character encoding information (such as MIME headers), parsed entities which are stored in an encoding other than UTF-8 or UTF-16 MUST begin with a text declaration (see 4.3.1 The Text Declaration) containing an encoding declaration". So, the XML obtained above is not well-formed: >>> generic.loads(generic.dumps("piątek")) Traceback (most recent call last): File "<stdin>", line 1, in ? File "xml/marshal/generic.py", line 321, in loads return m._load(file) File "xml/marshal/generic.py", line 331, in _load p.parseFile(file) File "xml/sax/drivers/drv_pyexpat.py", line 68, in parseFile if self.parser.Parse(buf, 0) != 1: xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 40 I'd also like to make a related feature request: >>> generic.dumps(u"czwartek") Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 59, in dumps File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 104, in m_root File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 92, in _marshal AttributeError: Marshaller instance has no attribute 'm_unicode' Given XML's well defined character encoding semantics, it would be useful (and IMO pretty straightforward) to support unicode strings by simply encoding them with the document's encoding. -- +----------------------------------------------------------------------+ | Paweł Sakowski <[EMAIL PROTECTED]> Never trust a man | | who can count up to 1023 on his fingers. | +----------------------------------------------------------------------+ _______________________________________________ XML-SIG maillist - XML-SIG@python.org http://mail.python.org/mailman/listinfo/xml-sig