Never having looked into ICU, would it be appropriate to add equivalent 
functions converting between UTF-16 and ISO-8859-1 in char* 
(throwing an exception or returning a failure code when the source contains 
characters outside the ISO-8859-1 range)?
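
Roughly what I have in mind, as a sketch only (the function name and the
use of an exception are illustrative, not ICU API):

    #include <cstddef>
    #include <stdexcept>
    #include <string>

    // Illustrative only: transcode UTF-16 code units to ISO-8859-1 bytes,
    // failing when a unit falls outside U+0000..U+00FF.
    std::string utf16ToLatin1(const unsigned short* src, std::size_t len)
    {
        std::string out;
        out.reserve(len);
        for (std::size_t i = 0; i < len; ++i) {
            if (src[i] > 0xFF)
                throw std::range_error("not representable in ISO-8859-1");
            out.push_back(static_cast<char>(src[i]));
        }
        return out;
    }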

In my personal DOM implementation, a DOMString 
that only contains code points <= 255 is kept in ISO-8859-1.  [When there
are code points > 255, UTF-16 is used instead, or UTF-8 when it is significantly
shorter, and nobody but another DOMString ever sees the internal representation.] 
However, the transcoding needs to be really fast.  
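
The decision of which internal form to use is just a scan; a simplified
sketch (names are made up for illustration):

    #include <cstddef>

    // Simplified sketch: a string can be kept as ISO-8859-1 only if every
    // UTF-16 code unit is <= 0xFF; otherwise fall back to UTF-16 (or UTF-8).
    bool fitsInLatin1(const unsigned short* utf16, std::size_t len)
    {
        for (std::size_t i = 0; i < len; ++i)
            if (utf16[i] > 0xFF)
                return false;
        return true;
    }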

Contrary to Dean, I would think the larger the document, the more 
beneficial it is to use a compressed format and expand to UTF-16 on demand.  

How about wchar* <-> UTF-8 and wchar* <-> ISO-8859-1, which I would also use between
the internal encoding of DOMString and external buffers?
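
One direction of the latter pair is trivial (widening ISO-8859-1 into
wchar_t); the narrowing and UTF-8 directions are where the range checks
matter. A rough sketch of the trivial direction, just to make the idea
concrete:

    #include <cstddef>
    #include <vector>

    // Sketch: widen ISO-8859-1 bytes into wchar_t code points (Latin-1 bytes
    // map 1:1 onto U+0000..U+00FF). The reverse needs a range check as above.
    std::vector<wchar_t> latin1ToWide(const char* src, std::size_t len)
    {
        std::vector<wchar_t> out(len);
        for (std::size_t i = 0; i < len; ++i)
            out[i] = static_cast<wchar_t>(static_cast<unsigned char>(src[i]));
        return out;
    }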
