It's safe to change the definition of XMLCh to wchar_t on Solaris, with
one caution: if you ever encounter XML data containing a code point
(Unicode character code value) above 0xFFFF (64K), there will be a
problem. In UTF-16, which is the encoding Xerces uses, such a character
is stored as a "surrogate pair" of two 16-bit values. With a 32-bit
wchar_t on Solaris, it should be stored as a single 32-bit value.
Storing the two halves of a UTF-16 surrogate pair in two adjacent 32-bit
wchar_t elements is illegal, but this is what will happen.
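To make the surrogate pair issue concrete, here is a minimal sketch of how
a pair would have to be folded into one 32-bit value. This is not Xerces
code: Utf16Unit, CodePoint, and all of the function names are hypothetical,
chosen only for illustration. The arithmetic is the standard UTF-16 rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not Xerces types.
typedef unsigned short Utf16Unit;  // a 16-bit UTF-16 code unit
typedef unsigned long  CodePoint;  // at least 32 bits, holds any code point

inline bool isHighSurrogate(Utf16Unit u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(Utf16Unit u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Fold a high/low surrogate pair into the single code point it encodes.
inline CodePoint combineSurrogates(Utf16Unit high, Utf16Unit low)
{
    assert(isHighSurrogate(high) && isLowSurrogate(low));
    return 0x10000UL
         + ((CodePoint(high) - 0xD800UL) << 10)
         + (CodePoint(low) - 0xDC00UL);
}

// Convert UTF-16 code units to one code point per element.
// Unpaired surrogates are copied through unchanged (an assumption of this
// sketch; a real converter would probably report an error instead).
std::vector<CodePoint> utf16ToCodePoints(const Utf16Unit* src, std::size_t len)
{
    std::vector<CodePoint> out;
    for (std::size_t i = 0; i < len; ++i) {
        if (isHighSurrogate(src[i]) && i + 1 < len && isLowSurrogate(src[i + 1])) {
            out.push_back(combineSurrogates(src[i], src[i + 1]));
            ++i;  // the low surrogate has been consumed as well
        } else {
            out.push_back(src[i]);
        }
    }
    return out;
}
```

With XMLCh simply redefined as wchar_t, no such folding takes place, so
the two surrogate values end up in separate wchar_t elements.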
This concern is mostly theoretical at this point, but the latest Unicode
spec is introducing a bunch of new characters up in this range.
The other possible problem is portability: not all systems use a
Unicode-based encoding for their wchar_t (HP-UX, for example, does not).
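One rough way to probe for this at build time is sketched below. It
assumes the compiler follows the C99 convention of defining
__STDC_ISO_10646__ when wchar_t values are ISO 10646 (Unicode) code
points; many compilers do not define it even when wchar_t is Unicode, so
a "false" result only means "unknown". The function names are
hypothetical.

```cpp
#include <climits>
#include <cstddef>

// Width of wchar_t in bits on this platform.
inline unsigned wcharBits()
{
    return static_cast<unsigned>(sizeof(wchar_t) * CHAR_BIT);
}

// True only if the implementation advertises that wchar_t holds
// ISO 10646 (Unicode) code points. False means "not advertised",
// not "definitely not Unicode".
inline bool wcharIsDeclaredUnicode()
{
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}
```

A build could refuse the XMLCh-as-wchar_t typedef unless
wcharIsDeclaredUnicode() is true and wcharBits() is at least 16.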
Andy Heninger
IBM XML Technology Group, Cupertino, CA
[EMAIL PROTECTED]
----- Original Message -----
From: "Majeed, Zartaj" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, March 22, 2001 9:12 AM
Subject: Can I typedef XMLCh for wchar_t on Solaris?
> I would like to build libxerces-c1_4.so with XMLCh typedef'd for wchar_t
> instead of unsigned short. I'd like to know if that is a safe change to
> make in util/Compilers/SunCCDefs.hpp?
>
> I need to do this to avoid conversions between wide strings and
> DOM_Strings in my Solaris CC application. I expect my typedef to be okay
> as long as xerces does not internally make any assumptions about the
> type of XMLCh besides it being large enough to store UTF-16 values.
>