On Wed, 28 Mar 2012 18:36:19 +0200, Shawn Steele
<[email protected]> wrote:
PUA == "Private Use Area", so people can show whatever glyphs they want
for whatever PUA code point they want. It's more like per-font or
something than per-locale. Different documents could use different
fonts to show different things.
We map those to the Unicode PUA; there's no better Unicode code point.
Per
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=12080
that seems untrue. "Legacy Unicode-encoded HKSCS documents and data
records must be converted to Unicode 4.1 in order to remove the PUA code
points. Secondly, most Big5-encoded HKSCS documents and data records will
need to be converted to Unicode 4.1 to work properly on Windows and to
take advantage of new characters provided in HKSCS-2004."
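As a rough illustration of what "remove the PUA code points" involves, here is a minimal Python sketch that lists the private-use characters remaining in already-decoded text. It relies only on the Unicode general category "Co" (private use); the function name find_private_use and the sample string are made up for the example, and it says nothing about which HKSCS character a given PUA code point was meant to represent.

```python
import unicodedata

def find_private_use(text):
    """Return (index, code point) pairs for each private-use character in text."""
    # General category "Co" covers U+E000..U+F8FF and the plane 15/16 private-use ranges.
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]

# Hypothetical sample: one BMP private-use character between two ASCII letters.
sample = "A\uE000B"
print(find_private_use(sample))
```

Anything this reports would still need a font or a shared convention to display consistently, which is the cross-platform problem discussed here.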
FWIW: We have a mechanism where we allow "EUDC" characters to be
mapped. The net result is that people can cause a specific font, of
their own creation, to be used as the fallback for the system for those
unknown PUA characters. For a web site, that'd mean that if they wanted
to use the PUA, they'd either have to use a common convention, or
provide a font. In either case I'd strongly recommend that the web site
developer use Unicode because, particularly in these edge cases, the
differences between implementations make it really hard to be
cross-platform.
Yes, but since big5-hkscs has code points in the same place as those PUA
code points and Microsoft has shipped custom glyph mapping before for
HKSCS (now claimed to be integrated in Windows), it does actually matter
for other players how the default setup works in Windows for Hong Kong and
Taiwan.
--
Anne van Kesteren
http://annevankesteren.nl/