Re: [Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode

Chris Withers Sun, 14 Jan 2007 10:14:42 -0800

Dieter Maurer wrote:

A halfway intelligent parser would accept Unicode when it gets it
and concentrate on the remaining part of its task: either reporting
structural events or building a parse tree.


The trivial fix I use in Twiddler is as follows:

# Encode unicode input as UTF-8 before handing it to the parser
if isinstance(source, unicode):
    source = source.encode('utf-8')

Of course, this assumes an XML declaration of either <?xml version="1.0" encoding="utf-8"?> or one with no encoding attribute, in which case the XML spec states that the document must be UTF-8 encoded (or UTF-16 with a byte order mark).


The problem comes when someone sends you something like:

u'<?xml version="1.0" encoding="something-else"?><node />'
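To make the failure concrete, here is a small sketch (not taken from Twiddler) using the standard xml.parsers.expat module under Python 3. The helper name parse_text and the sample document are made up for illustration, and it assumes the declared encoding is one expat knows natively, such as iso-8859-1; an unknown name like "something-else" would most likely be rejected outright instead.

```python
import xml.parsers.expat


def parse_text(data):
    """Return the character data expat reports for a bytes document."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so collect and join.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return ''.join(chunks)


# A decoded document whose declaration names a non-UTF-8 encoding.
text = '<?xml version="1.0" encoding="iso-8859-1"?><node>\u00e9</node>'

# Blindly encoding to UTF-8 contradicts the declaration: expat decodes
# the two UTF-8 bytes of U+00E9 as two separate Latin-1 characters.
garbled = parse_text(text.encode('utf-8'))

# Encoding to the declared encoding keeps the character intact.
intact = parse_text(text.encode('iso-8859-1'))
```

The parse succeeds in both cases, which is what makes the mismatch easy to miss: the damage only shows up in the character data.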

What should be done then?
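One possible answer, offered as a sketch rather than as what Twiddler does: once the source is a unicode string it has already been decoded, so the declared encoding no longer describes anything and can be dropped before encoding to UTF-8. The version below is written for Python 3 (where the text type is str rather than unicode). The names to_utf8_bytes and _DECL_ENCODING are hypothetical, and the regular expression only handles a declaration at the very start of the string.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
# Group 1 captures everything before it so it can be kept.
_DECL_ENCODING = re.compile(
    r'''^(<\?xml\s[^>]*?)\s+encoding\s*=\s*(?:"[^"]*"|'[^']*')''')


def to_utf8_bytes(source):
    """Return source as UTF-8 bytes with a consistent XML declaration.

    Text input has any declared encoding removed and is then encoded as
    UTF-8. Bytes input is returned unchanged, on the assumption that its
    declaration already matches its actual encoding.
    """
    if isinstance(source, str):
        # With no encoding attribute the parser falls back to UTF-8.
        source = _DECL_ENCODING.sub(r'\1', source, count=1)
        source = source.encode('utf-8')
    return source


result = to_utf8_bytes(
    '<?xml version="1.0" encoding="something-else"?><node />')
```

Rewriting the attribute to encoding="utf-8" instead of removing it would work just as well; removing it is simply the shorter substitution.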

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
