Hi all,

As a newcomer, I've spent some time reading the wp-polyglots archives and found many interesting discussions about encoding .po files as UTF-8 and about the use of HTML character entities.

It seems to be common practice to use HTML character entities for all special characters in translated messages. On the other hand, the translation guidelines say that one should avoid using them:

   With a few exceptions (noted below), all translations should be
   written literally, rather than escaping accented and special
   characters with HTML character entities.

Source: http://codex.wordpress.org/Translating_WordPress#Guidelines_and_requirements

To sum up:

  1. .mo files without HTML entities do not work for blogs using
     character encodings other than UTF-8 (the latter being the default
     and the one recommended in WP).
  2. .mo files with HTML entities do not work for e-mail messages sent
     by WordPress.
  3. .po files with HTML entities are less translator-friendly and thus
     more error-prone.

As Kim Suominen pointed out on March 7th, 2005, the best solution would be for the WP core to convert UTF-8 into the blog's character encoding at runtime (both when generating HTML and when sending e-mails). See http://comox.textdrive.com/pipermail/wp-polyglots/2005-March/000449.html
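
To make the idea concrete, here is a minimal sketch of such a runtime conversion. It is written in Python purely for illustration (WordPress itself is PHP), and the function names for_html and for_email are hypothetical, not part of any WP API. It assumes the translated message is held as Unicode text and that the blog's character encoding is known by name.

```python
def for_html(message, blog_charset):
    """Encode a translated message for an HTML page in the blog's charset.

    Characters the charset cannot represent become numeric character
    references, which browsers render correctly in any encoding.
    """
    return message.encode(blog_charset, errors="xmlcharrefreplace")


def for_email(message, blog_charset):
    """Encode a translated message for a plain-text e-mail.

    Entities are meaningless in plain text, so characters the charset
    cannot represent are replaced with '?' instead (an assumption of
    this sketch; sending the mail as UTF-8 would avoid the loss).
    """
    return message.encode(blog_charset, errors="replace")
```

With this, a UTF-8 catalog could serve every blog: entities appear only where the target encoding really needs them, and never in e-mail.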

For the translation files I've worked on (Catalan for WP 2.2, 2.2.1 and 2.2.2) I've followed this approach:

   * translated strings in .po files contain no HTML character entities
     (original strings are left untouched, entities included)
   * a Perl script I wrote generates an equivalent .po file with HTML
     character entities in the translated strings
   * there are 2 deployed versions of the WP Catalan translation: the
     "normal version" (just for UTF-8 blogs, works fine with e-mail)
     and the "html version" (works with all blog character encodings,
     produces "ugly" error messages)
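
The conversion step can be sketched roughly as follows. This is not the Perl script mentioned above but a Python illustration of the same idea, with hypothetical function names (to_entities, entity_encode_po). It assumes the .po file has already been read as Unicode text, and it treats any line starting with msgstr, plus the quoted continuation lines that follow it, as translated text.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace each non-ASCII character with an HTML entity.

    Named entities (HTML 4 set) are used where one exists,
    numeric character references otherwise.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append("&#%d;" % cp)
    return "".join(out)


def entity_encode_po(po_text):
    """Return a copy of a .po file with entities in msgstr entries only."""
    out_lines = []
    in_msgstr = False
    for line in po_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if stripped.startswith("msgstr"):
            in_msgstr = True       # msgstr or msgstr[n] starts a translation
        elif not stripped.startswith('"'):
            in_msgstr = False      # msgid, comment or blank line ends it
        out_lines.append(to_entities(line) if in_msgstr else line)
    return "".join(out_lines)
```

Note that this sketch also converts non-ASCII characters in the header entry (for example a translator's name), since the header is stored as a msgstr; the resulting file is pure ASCII in its translated strings.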

Do you think this approach could be generalised for all WP localizations?

By the way, I think today's common practice does not follow the guidelines; we should change one or the other so that they agree.

Cheers,

Francesc Hervada-Sala





_______________________________________________
wp-polyglots mailing list
[email protected]
http://lists.automattic.com/mailman/listinfo/wp-polyglots
