--On 14 January 2007 18:14:45 +0000 Chris Withers <[EMAIL PROTECTED]> wrote:
Dieter Maurer wrote:

    A halfway intelligent parser would accept Unicode when it gets it and concentrate on the remaining part of its task: either reporting structural events or building a parse tree.

The trivial fix I use in Twiddler is as follows:

    if isinstance(source, unicode):
        source = source.encode('utf-8')

Of course, this assumes a heading of either <?xml version="1.0" encoding="utf-8"?> or a missing encoding attribute, in which case the xml spec states that the string must be utf-8 encoded.
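For readers on Python 3, where `str` plays the role of Python 2's `unicode`, the quoted fix might be sketched roughly as below. The helper name `as_utf8_bytes` and the sample document are illustrative assumptions, not Twiddler code, and the standard-library ElementTree parser merely stands in for whichever parser is actually used.

```python
import xml.etree.ElementTree as ET


def as_utf8_bytes(source):
    """Return `source` as bytes, encoding text input as UTF-8.

    Hypothetical helper: it assumes the document either declares
    encoding="utf-8" or carries no encoding declaration at all.
    """
    if isinstance(source, str):  # `unicode` in Python 2
        source = source.encode("utf-8")
    return source


# Sample document (an assumption for illustration only).
doc = '<?xml version="1.0" encoding="utf-8"?><greeting>Gr\u00fc\u00dfe</greeting>'

# The parser now only ever sees bytes that match the declared encoding.
root = ET.fromstring(as_utf8_bytes(doc))
print(root.text)
```

Input that is already a byte string passes through unchanged, so the helper is only safe when those bytes really are in the encoding the preamble declares.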
The encoding of the XML preamble should not matter when parsing an XML document stored as a unicode string. It becomes important as soon as you convert the document back to a byte stream, e.g. when we deliver the content back to a browser or an FTP client. The ZPublisher (for Zope 2) deals with that by changing the encoding parameter of the preamble for XML documents based on the desired output encoding. utf-8 is always a good choice; however, other encodings like iso-8859-15 might raise UnicodeEncodeErrors for characters they cannot represent. The Zope 2 publisher "avoids" this problem by converting the unicode result using errors='replace' (which is likely something we might discuss :-))
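To make the trade-off concrete, here is a minimal sketch in Python 3 of rewriting the preamble's encoding parameter and then encoding for output. It is not the ZPublisher implementation; the `serialize` helper, its regular expression, and the sample document are all assumptions made for illustration.

```python
import re

# Matches the encoding pseudo-attribute of a leading XML declaration.
_DECL_ENCODING = re.compile(r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def serialize(document, encoding="utf-8", errors="strict"):
    """Hypothetical helper: rewrite the declared encoding, then encode.

    `document` is a text string; the result is a byte string whose
    preamble names the encoding that was actually used.
    """
    document = _DECL_ENCODING.sub(
        lambda m: m.group(1) + m.group(2) + encoding + m.group(2),
        document,
        count=1,
    )
    return document.encode(encoding, errors)


# Sample document containing a character that iso-8859-15 cannot represent.
doc = '<?xml version="1.0" encoding="utf-8"?><p>\u4e2d</p>'

# utf-8 can represent every character, so this never fails.
print(serialize(doc, "utf-8"))

# A strict conversion to iso-8859-15 raises for the unrepresentable character.
try:
    serialize(doc, "iso-8859-15")
except UnicodeEncodeError as exc:
    print("strict encoding failed:", exc.reason)

# errors='replace' avoids the exception but silently substitutes '?'.
print(serialize(doc, "iso-8859-15", errors="replace"))
```

The last call shows why errors='replace' is debatable: the response is delivered without an error, but the original character is lost.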
_______________________________________________
Zope3-dev mailing list
Zope3firstname.lastname@example.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com