
I haven't received much feedback on the ZPT mailing list, so I
thought I'd bring it over here to a wider audience (thread is at
http://mail.zope.org/pipermail/zpt/2004-March/005218.html ).

Begin forwarded message:

From: Stuart Bishop <[EMAIL PROTECTED]>
Date: 29 March 2004 6:13:06 PM
To: Dieter Maurer <[EMAIL PROTECTED]>
Subject: Re: [ZPT] Makeing PageTemplate's edit pages Unicode aware

On 27/03/2004, at 9:57 PM, Dieter Maurer wrote:

Stuart Bishop wrote at 2004-3-25 12:27 +1100:
Currently, if you enter non-ASCII text into the title or contents
fields on a PageTemplate's edit page, the data ends up stored as
an encoded string (using management_page_charset if it is set, or an
unknown encoding if it is not).

This should be easy to fix using the foo:charset:ustring notation
to have Zope convert the encoded strings to Unicode. However, the
file upload feature is more problematic. Should the file upload
try converting the file to Unicode from UTF-8 and raise an exception
if this is not possible? I personally feel this is preferable to
ending up with arbitrarily encoded document source, with no idea
of the character set used.

I do not think that Zope should convert when it does not know the
encoding. I am unaware that a missing "management_page_charset"
can be interpreted as "UTF-8". If this were the case, conversion
to unicode might be correct. By the way: the HTML specification
says that uploaded files should come with a "content-type" declaration.
In this case, the charset specified there (if any) should be used
to determine the encoding.
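
A minimal sketch of that suggestion, written for present-day Python 3
rather than the Python 2 that Zope ran on at the time. The function
names charset_from_content_type and decode_upload are hypothetical and
are not part of Zope. The sketch reads the charset from a Content-Type
value, decodes strictly, and refuses to guess when none is declared:

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


def decode_upload(data, content_type=None):
    """Decode uploaded bytes strictly; do not guess an undeclared charset."""
    charset = charset_from_content_type(content_type) if content_type else None
    if charset is None:
        raise ValueError("upload did not declare a charset")
    # Strict decoding: bytes that are invalid in the declared charset
    # raise UnicodeDecodeError instead of being silently mangled.
    return data.decode(charset)
```

Whether browsers declare the charset reliably is a separate question;
the sketch only shows what "use it if present, otherwise refuse" might
look like.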

Yes - A missing management_page_charset should probably be interpreted
as either US-ASCII or ISO-8859-1. US-ASCII is probably more correct,
but I would guess that most browsers will be configured to use
ISO-8859-1 as their default (and this might be specified in the HTML
spec?)

I guess using the charset type the browser tells us for file uploads
means we can blame the browser. I don't know how this could be reliable
(since text files themselves don't encode their character set unless
they happen to be UTF-16 or have a BOM). I am wondering if having a
file upload function is incompatible with a Unicode-aware page
templates product.
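
To illustrate the one case where a text file does announce its own
encoding, here is a rough Python 3 sketch of BOM sniffing. The helper
name sniff_bom is hypothetical. It returns a codec name only when a
byte order mark is present and None otherwise, which is exactly the
situation where the charset stays unknown:

```python
import codecs

# Longer BOMs are checked first, because the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            # These codecs consume the BOM themselves while decoding.
            return codec
    return None
```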

If management_page_charset is not set, it is unknown what charset
is being used. The only way of knowing the character set of data that
has been submitted is to know the character set of the form that it
was submitted from. All other mechanisms do not work due to
incompatibilities in how the browsers work.

Currently, if you create a page template that contains non-ASCII
characters, any tal:content or tal:replace expressions that return
Unicode will raise a Unicode error. This can be demonstrated with:

      <div>My 2¢</div>
      <div tal:content="python:u'My 2\N{CENT SIGN}'">My 2¢</div>

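
The message does not say where exactly the error is raised. One
plausible cause, and this is an assumption, is Python 2's implicit
ASCII decoding when an encoded byte string is joined with a Unicode
string. The Python 3 sketch below makes that implicit step explicit;
join_like_python2 is a hypothetical helper, not Zope code:

```python
def join_like_python2(encoded_source, unicode_value):
    """Mimic Python 2 joining a byte string with a unicode string.

    Python 2 implicitly decoded the byte string as ASCII first, which
    fails as soon as the bytes contain a non-ASCII character.
    """
    return encoded_source.decode("ascii") + unicode_value


# Template source stored as encoded bytes (UTF-8 is assumed here).
template_bytes = "<div>My 2\u00a2</div>".encode("utf-8")

try:
    join_like_python2(template_bytes, "My 2\N{CENT SIGN}")
    failed = False
except UnicodeDecodeError:
    # The encoded cent sign is not valid ASCII.
    failed = True
```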
These are the things I think need to be fixed in Zope's Page Templates
implementation to make them Unicode aware. There may be more (?):

        - It should be possible for the actual page template source to
          be stored as a Unicode string. Currently, there is an assert
          ensuring it is a traditional string.

        - The title property should be a Unicode string.

        - PageTemplateFile should grow an optional charset parameter,
          defaulting to US-ASCII.

        - PageTemplate.write(text) should raise an exception if text
          is not either a Unicode string or an ASCII string.

        - The ZopePageTemplate edit page should use Zope's
          :charset:ustring notation so Unicode strings get passed
          to its handler.

        - The file upload widget needs to either be removed, or grow
          a charset box. I don't think either of these solutions is
          ideal :-(
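
As a rough illustration of the PageTemplate.write(text) item, here is
the check sketched in Python 3 terms, where str plays the role of a
Unicode string and bytes the role of a traditional string. The function
name check_template_text is hypothetical; this is not the actual
PageTemplate code:

```python
def check_template_text(text):
    """Return text as a Unicode string, or raise if it is neither
    Unicode nor pure ASCII."""
    if isinstance(text, str):
        # Already Unicode: accept as-is.
        return text
    if isinstance(text, bytes):
        try:
            # Pure ASCII bytes are unambiguous in any ASCII-compatible
            # charset, so they can be accepted safely.
            return text.decode("ascii")
        except UnicodeDecodeError:
            raise ValueError(
                "template source must be Unicode or ASCII"
            ) from None
    raise TypeError("template source must be a string")
```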

Note that when I say 'Unicode string', we can still store ASCII
text using a traditional string to save space.

My application is currently using a ZopePageTemplate subclass that
has been modified to use Unicode strings for the document source
and title, and it seems to be functioning just fine. Does anyone
know if that "assert type(text) == type('')" in PageTemplate.write
is there for a reason?

-- Stuart Bishop <[EMAIL PROTECTED]>


_______________________________________________ Zope-Dev maillist - [EMAIL PROTECTED] http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
