As some of you may know, I've been scouring the Net for some way to display Unicode text accurately in webpages. The only application I had been able to find that would let me edit Unicode text on the server was SubEthaEdit. It's a wonderful app, but it lacks the HTML tools that usually save me a fair bit of time. I lobbied HTML developers and cursed a lot (non-productive but therapeutic at times). Strangely enough, even apps that said they supported Unicode would not display my Vietnamese characters properly. I was using the Vietnamese keyboard from the Unicode bundle in the default OSX install.

However, now I have found the answer to my hassles: single-code-point characters. This may well fix problems for users of other languages.

Apparently, the default Vietnamese Unicode keyboard that is installed with OSX uses two Unicode code points to describe an accented character: one for the vowel, another for the accent. Very few apps can tolerate this. For HTML, only SubEthaEdit could edit it accurately, and only Safari would display it properly.
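
To make the two-code-point idea concrete, here is a small Python sketch that lists the code points behind a string. It uses only the standard unicodedata module; the helper name describe and the sample strings are made up for illustration, and the "a with acute" example is just one plausible case of what such a keyboard emits.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' entry per code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# Assumed example: the same visible character, built two different ways.
two_points = "a\u0301"  # vowel followed by a combining accent
one_point = "\u00e1"    # single precomposed code point

for label, sample in (("two code points", two_points), ("one code point", one_point)):
    print(label)
    for entry in describe(sample):
        print("   ", entry)
```

Both strings should look identical in an application that handles combining marks, which is why the difference is so hard to spot by eye.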

A search under "Unicode" and "Vietnamese" brought up a package in Apple's OSX Downloads. It contained additional keyboards. The documentation stated:

All [these] keyboard layouts emit Vietnamese text in Unicode Normalization Form C (NFC), where the entire vowel including the intonation mark is represented by a single code point.
In comparison, the original Mac OS X Vietnamese keyboard layout emits Unicode text where vowels with an intonation mark are represented by a code point for the vowel itself (a, ă, â, e, ê, i, o, ô, ơ, u, ư, y) followed by a second code point representing the intonation mark as combining diacritic.
UnicodeChecker (http://www.earthlingsoft.net/), makes a service "Convert to Unicode Normalization Form C" available to TextEdit and other applications. VietPad (http://vietpad.sourceforge.net/) can do this, too.
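
For text that has already been typed with the original keyboard, the same conversion to NFC can be sketched in a few lines of Python. This is only an illustration using the standard unicodedata module, and it makes no claim about how UnicodeChecker or VietPad do it internally; the helper name to_nfc and the sample word are made up for the example.

```python
import unicodedata

def to_nfc(text):
    """Compose vowel + combining mark sequences into single code points (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)

# Assumed sample: "Viet" with the vowel ê (U+00EA) followed by a separate
# combining dot below (U+0323), as the documentation describes for the
# original keyboard layout.
decomposed = "Vi\u00ea\u0323t"
composed = to_nfc(decomposed)

print(len(decomposed), len(composed))   # code point counts before and after
print(composed == "Vi\u1ec7t")          # True if the vowel became one code point
print(to_nfc(composed) == composed)     # text already in NFC is left unchanged
```

Running a page's text through something like this before uploading should give the same result as typing it with one of the NFC keyboards.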

What happens when I use one of these "single-code-point" Unicode keyboards? Everything displays perfectly in Unicode-aware applications, _and_ in webpages.


I am really delighted, because this has been a long, hard struggle for me, which I resent just a little, because of the claims of OSX and many apps to "support Unicode fully". It does seem that full Unicode support is a fair way off yet.

Here is my current list.

Applications which support Unicode fully, including on webpages: Safari, SubEthaEdit, StyleMaster (CSS)

Applications which display Unicode perfectly in their documents:

Apple: Finder, Mail, iCal, AddressBook, TextEdit

Stairways group: Interarchy (ftp), Keyboard Maestro (macros), URL Manager Pro (sic), Web Confidential (sensitive data)

Other developers: iData (database), LaunchBar (finder utility), JNotes (note app.), NetNewsWire (RSS), UniLingua (vocab tester), Daktari (html), MoosePad (small db), VietPad

Applications which only display single-code-point Unicode (UNFC) properly: BBEdit v8, Linguist (translation), Psi (Jabber), OmniWeb v5, Mozilla, Netscape and probably many others

Apple apps which do not yet support Unicode at all: AppleWorks!

I will try to find time to upload a page which covers this more fully, since what info there is online is out of date, and certainly doesn't include UNFC. I would like to be able to include a fuller table of applications' support of Unicode, and information on how it affects other languages, so please email me if you have info on, for example, how iTunes handles Unicode, or how Arabic displays using the default OSX keyboard.

For now, it's good news: we have a work-around for incomplete Unicode support. (Now, I just have to adjust to a keyboard layout where some things are reversed... but it's worth it!)

Sorry about such a long post. I tried reducing it, but it didn't cover the subject adequately.

from Clytie, whose Google needs retreading after all the searching she's done on this topic

This message was posted to the following lists: OmniWeb-l, South Australian Apple Users' Group, Web Standards Group, CSS-d, Web Authoring List (BareBones), SubEthaEdit Users, Interarchy Users and NetNewsWire Users.

Clytie Siddall -- thành phố Renmark, tại miền Sông của Úc Nam



