yuppie wrote:
Hi Yves!

Yves Bastide wrote:
GenericSetup has problems handling non-ASCII data.

1.) GenericSetup explicitly doesn't support non-UTF-8 XML in profiles. UTF-8 is the default encoding for XML and I can't see a need to support other XML encodings.

As output, right? Agreed.

2.) GenericSetup explicitly doesn't support non-UTF-8 site settings. If someone provides a good patch this feature can be added.

But with the problems you mention later ('default_charset', 'management_page_charset', and so on), how would you envision it?

3.) GenericSetup is not tested with non-ASCII UTF-8 site settings. AFAIK import works, but not export. I consider this a bug.

Neither: CMF trunk, change portal_types/Document's title to 'Dôcument', export:

Traceback (innermost last):
  Module ZPublisher.Publish, line 115, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 41, in call_object
  Module Products.GenericSetup.tool, line 471, in manage_exportAllSteps
  Module Products.GenericSetup.tool, line 272, in runAllExportSteps
  Module Products.GenericSetup.tool, line 736, in _doRunExportSteps
  Module Products.CMFCore.exportimport.typeinfo, line 198, in exportTypesTool
  Module Products.GenericSetup.utils, line 728, in exportObjects
  Module Products.GenericSetup.utils, line 722, in exportObjects
  Module Products.GenericSetup.utils, line 501, in _exportBody
  Module xml.dom.minidom, line 62, in toprettyxml
  Module StringIO, line 271, in getvalue
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 20: ordinal not in range(128)
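
A note on the byte in that traceback: 0xf4 is 'ô' in ISO-8859-1 and ISO-8859-15, whereas UTF-8 would encode it as 0xc3 0xb4. That suggests the title was stored as a Latin-1-family byte string and was presumably decoded implicitly as ASCII when mixed with unicode output. The following is only a Python 3 sketch of that diagnosis (the traceback itself comes from Python 2); the helper name decodes_as_ascii is made up for illustration and is not part of GenericSetup.

```python
# Python 3 sketch: compare the bytes produced by two encodings of the same
# title and check which of them would survive an ASCII decode.
title = "Dôcument"

latin1_bytes = title.encode("iso-8859-1")   # b'D\xf4cument'
utf8_bytes = title.encode("utf-8")          # b'D\xc3\xb4cument'


def decodes_as_ascii(raw):
    """Hypothetical helper: True if the byte string is pure ASCII."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Neither encoding of a non-ASCII title is valid ASCII, so an implicit
# ASCII decode fails in both cases; only the offending byte differs.
print(latin1_bytes, decodes_as_ascii(latin1_bytes))
print(utf8_bytes, decodes_as_ascii(utf8_bytes))
```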

It treats strings sometimes as ASCII, sometimes as UTF-8, yet it has access to two variables: its own ISetupContext.getEncoding() (whose use I didn't fully grok) and CMF's ISetupContext.getSite().getProperty('default_charset').

Sorry, but your assumptions are wrong:

- The default setup tool creates export contexts without specifying the encoding, so ISetupContext.getEncoding() always returns None. And even if it were set, it would represent the encoding of the exported files, not the site encoding.

- getSite().getProperty('default_charset') is CMF specific and should not be used in GenericSetup.

- The adapters adapt ISetupEnviron, not ISetupContext. getEncoding() and getSite() are not always available.

Thanks for setting me right. What's the usefulness of getEncoding()? As you say, exported files don't need to be other than utf-8 encoded.

First of all we need unit tests that make sure UTF-8 works and I think this should be the default used by GenericSetup. Code that needs to know how to find the site encoding can't be generic.
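
As a rough idea of what such a UTF-8 test could check, here is a minimal round-trip sketch using only xml.dom.minidom from the standard library. It is written for Python 3 and does not use GenericSetup's own export code; the function names export_title and import_title and the element name "object" are assumptions made for the example.

```python
from xml.dom import minidom


def export_title(title):
    """Hypothetical stand-in for an export step: serialize to UTF-8 bytes."""
    doc = minidom.Document()
    node = doc.createElement("object")
    node.setAttribute("title", title)
    doc.appendChild(node)
    # With an explicit encoding, toprettyxml() returns bytes in Python 3.
    return doc.toprettyxml(indent=" ", encoding="utf-8")


def import_title(xml_bytes):
    """Hypothetical stand-in for an import step: parse and read the title."""
    doc = minidom.parseString(xml_bytes)
    return doc.documentElement.getAttribute("title")


exported = export_title("Dôcument")
# The non-ASCII title should survive export followed by import.
assert import_title(exported) == "Dôcument"
```

A real test would run the actual export and import adapters against a non-ASCII setting and compare the result the same way.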


There is an additional problem: If tools use the default property edit page from OFS the properties might have a different encoding than 'default_charset' of the site. Since the default 'management_page_charset' is UTF-8 we have less trouble if we allow only UTF-8.

D'oh! /manage is 8859-15, /manage_menu is -1 and manage_propertiesForm UTF-8. No wonder Firefox sometimes gets confused :-)
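
To illustrate why mixed page charsets cause trouble, the sketch below (Python 3, with a made-up helper to_utf8) shows that the same title arrives as different bytes depending on the form's charset, and that normalizing to UTF-8 only works when the source charset is known.

```python
def to_utf8(raw, source_charset):
    """Hypothetical helper: re-encode bytes from a known charset to UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


title = "Dôcument"
from_latin9_form = title.encode("iso-8859-15")  # e.g. a page served as 8859-15
from_utf8_form = title.encode("utf-8")          # e.g. a page served as UTF-8

# Same text, different stored bytes.
assert from_latin9_form != from_utf8_form

# Knowing the real source charset makes normalization straightforward.
assert to_utf8(from_latin9_form, "iso-8859-15") == from_utf8_form

# Guessing wrong fails: the ISO-8859-15 bytes are not valid UTF-8.
try:
    to_utf8(from_latin9_form, "utf-8")
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True
```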

Well, I think I can wriggle out of most of my problems using translation. And I'll try and write UTF-8 unit tests if nobody beats me to it.





Zope-CMF maillist  -  Zope-CMF@lists.zope.org

See http://collector.zope.org/CMF for bug reports and feature requests
