> I have a lot of Chinese XML files stored in Zope. The internal encoding is
> UTF-8. Everything was fine in 214, 216 and 220a1. Now with 220b1, some
> characters are (apparently?) randomly turned into lt;, gt; and the like. Now
> this looks like some unwanted HTML escaping, but the leading '&' is missing
> and the characters are definitely all in the range greater than 127 (this is
> a property of UTF-8), so there is no direct relationship to the codepoints of
> >, < and co.
> Any ideas what could have gone wrong here?

Yes - during the alpha period we got a bug report concerning the fact
that Netscape browsers honor the Windows "extended Latin-1" characters
\213 and \233 (which they treat as < and >). That means that if you don't
filter those as part of html_quote'ing, then some Netscape versions are
open to the same sort of script-kiddie attacks that they would be if
the HTML was not quoted at all :(
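
To illustrate why UTF-8 data gets caught by such a filter: octal \213 and
\233 are the byte values 0x8B and 0x9B, and both also occur as continuation
bytes inside multi-byte UTF-8 sequences. The sketch below is not Zope's
actual html_quote code; the table and the function name naive_byte_quote
are hypothetical, and it only assumes a filter that replaces those bytes
one at a time without regard to the encoding.

```python
import html

# Hypothetical byte-wise quoting table (an assumption for illustration,
# not the real html_quote table). "&" is handled first so that the "&"
# in later replacements is not escaped again.
BYTE_REPLACEMENTS = [
    (b"&", b"&amp;"),
    (b"<", b"&lt;"),
    (b">", b"&gt;"),
    (b"\x8b", b"&lt;"),  # octal \213
    (b"\x9b", b"&gt;"),  # octal \233
]


def naive_byte_quote(data: bytes) -> bytes:
    """Replace each listed byte wherever it occurs, ignoring the encoding."""
    for old, new in BYTE_REPLACEMENTS:
        data = data.replace(old, new)
    return data


# U+56FD encodes to E5 9B BD in UTF-8, so its middle byte is 0x9B.
original = "中国".encode("utf-8")
quoted = naive_byte_quote(original)

# Quoting decoded text instead of raw bytes leaves the characters intact.
text_quoted = html.escape("中国 <b>")

print(original)     # bytes before quoting
print(quoted)       # the 0x9B byte has been replaced, breaking the sequence
print(text_quoted)  # only the real markup characters are escaped
```

The quoted byte string is no longer valid UTF-8, because a three-byte
sequence has been split by the inserted entity. Quoting after decoding to
text avoids this, at the cost of not catching raw 0x8B/0x9B bytes sent in a
single-byte encoding.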

I'm not quite sure what the right answer is here. How are you using
the html_quote format in your application?

Brian Lloyd        [EMAIL PROTECTED]
Software Engineer  540.371.6909              
Digital Creations  http://www.digicool.com 

Zope-Dev maillist  -  [EMAIL PROTECTED]
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope )