--On 28 February 2007 01:21:41 -0700 Jeff Shell <[EMAIL PROTECTED]> wrote:

  - HOW do I know what a browser has sent me? There doesn't seem to be
a real way of handling this. Do I guess?

- Know without a doubt when to encode, and when to decode. I guess the
"proper" thing to do is to store everything as unicode, and to decode
to unicode as early as possible when input is coming in. But again,
how do I know when to decode from latin-1 and when to decode from
UTF-8? When or why should I encode to one or the other at response
time? Should I worry at all?

The general rules are:

- *always* use unicode strings (*NOT* UTF-8 encoded strings) internally for
  storage and processing

- convert incoming data to unicode, convert outgoing data (for presentation) from unicode to some output encoding

- if you run a website using UTF-8 encoding (make sure the Content-Type header of your responses includes the correct 'charset' parameter), you are on the safe side in assuming that the data passed back by the browser is UTF-8 as well. In addition you might set the 'accept-charset' attribute of the <form> tag.
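
As a rough illustration of these rules, here is a minimal sketch in plain Python 3 (where `str` plays the role of the old `unicode` type). It is not Zope's own request handling; the helper names `parse_charset`, `decode_input` and `encode_output` are made up for this example, and the latin-1 fallback is only an assumption about what a sensible last resort could look like.

```python
def parse_charset(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or the default."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"').lower()
    return default


def decode_input(raw, content_type=""):
    """Decode incoming bytes to text as early as possible."""
    charset = parse_charset(content_type)
    try:
        return raw.decode(charset)
    except (UnicodeDecodeError, LookupError):
        # Assumption: fall back to latin-1, which maps every byte to a
        # character and therefore never raises (but may guess wrong).
        return raw.decode("latin-1")


def encode_output(text, charset="utf-8"):
    """Encode text only at response time and declare the charset."""
    headers = {"Content-Type": "text/html; charset=%s" % charset}
    return headers, text.encode(charset)


# Incoming bytes become text once, at the boundary.
name = decode_input(b"Gr\xc3\xbc\xc3\x9fe", "text/plain; charset=UTF-8")
# All storage and processing happens on text, never on encoded bytes.
greeting = name.upper()
# Outgoing text is encoded once, when the response is built.
headers, body = encode_output(greeting)
```

Everything between `decode_input` and `encode_output` only ever sees text, so the question of "latin-1 or UTF-8?" is confined to the two edges of the application.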



Zope3-users mailing list
