I believe that on Windows XMLCh is the same as wchar_t (i.e. both are 16-bit
UCS-2, though XMLCh might actually be UTF-16, which is identical to UCS-2
for the Basic Multilingual Plane; see Appendix C of the Unicode Standard,
version 3).  If you don't have to support any other platforms, you can
probably get away with not transcoding at all, but the Xerces API doesn't
make any guarantees about the encoding used for XMLCh.
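
To illustrate the UCS-2 / UTF-16 point, here is a minimal standalone sketch
(not Xerces code; encodeUtf16 is a name I made up for the example).  A code
point in the Basic Multilingual Plane becomes a single 16-bit unit, exactly
as in UCS-2; anything above U+FFFF becomes a surrogate pair.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Assumes cp is a valid scalar value (not a surrogate, <= 0x10FFFF).
std::vector<std::uint16_t> encodeUtf16(std::uint32_t cp)
{
    std::vector<std::uint16_t> units;
    if (cp < 0x10000) {
        // BMP: one unit, identical to the UCS-2 representation
        units.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Supplementary planes: high surrogate, then low surrogate
        cp -= 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return units;
}
```

For the 0x65E5 character in your example this yields the single unit 0x65E5,
so for that data the two encodings are indistinguishable.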

XMLString::transcode converts to the code page currently in effect, so any
character that code page can't represent is lost.  That's probably not what
you want; what you want to do is transcode XMLCh to UCS-2, I think, so that
you can use the "W" versions of Windows file APIs and so forth, i.e. create
wchar_t strings from XMLCh strings.  (Most of the time, this transcoding
will just be the identity function on Win32, but IMO it's best not to make
assumptions about XMLCh encoding.)
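
As a rough illustration of why the conversion is the identity when wchar_t is
16 bits but not otherwise, here is a standalone sketch that does not use
Xerces at all.  Utf16Unit is a hypothetical stand-in for XMLCh, toWide is a
name invented for the example, and it assumes the input really is UTF-16,
which is exactly the assumption the Xerces API doesn't promise.

```cpp
#include <cstddef>
#include <string>

typedef char16_t Utf16Unit;   // hypothetical stand-in for XMLCh (assumed UTF-16)

// Copy UTF-16 code units into a wide string.  With a 16-bit wchar_t each
// unit is copied unchanged; with a 32-bit wchar_t valid surrogate pairs are
// combined into a single code point.
std::wstring toWide(const Utf16Unit *src, std::size_t len)
{
    std::wstring out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        unsigned long u = src[i];
        if (sizeof(wchar_t) >= 4 && u >= 0xD800 && u <= 0xDBFF &&
            i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            unsigned long lo = src[i + 1];
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
            ++i;   // the low surrogate has been consumed
        }
        out.push_back(static_cast<wchar_t>(u));
    }
    return out;
}
```

Note that it takes an explicit length rather than relying on a terminator.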

To do that you'll need to create a UCS-2 transcoder, and then feed it the
string in blocks (the transcoders created by the factory have fixed-size
buffers).  I'm not a Xerces expert by any means, but something like the
following seems to work:

XMLTransService::Codes Result;
XMLTranscoder *Transcoder =
   XMLPlatformUtils::fgTransService->makeNewTranscoderFor(
       "UCS2"      // encoding name - not sure this is right
      ,Result      // error return
      ,4096        // block size
      );

[Check Result - see docs.  In the following, "pChars" and 
"pwcValue" are defined as in your code.]

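// characters() data need not be null-terminated, so use the length
// passed to the handler rather than XMLString::stringLen(pChars)
unsigned int NumChars = uiLength;
unsigned int NumTrans = 0, TotTrans = 0;
unsigned int BufBytes, BytePos = 0;

// Allocate buffer: one wchar_t per XMLCh, plus room for the terminator
pwcValue = new wchar_t[NumChars + 1];
BufBytes = NumChars * sizeof(wchar_t);   // transcodeTo counts bytes

while (TotTrans < NumChars && BytePos < BufBytes)
{
   // Returns bytes written; NumTrans receives the XMLChs consumed
   BytePos += Transcoder->transcodeTo(
       pChars + TotTrans
      ,NumChars - TotTrans
      ,reinterpret_cast<unsigned char *>(pwcValue) + BytePos
      ,BufBytes - BytePos
      ,NumTrans
      ,XMLTranscoder::UnRep_RepChar
      );

   if (NumTrans == 0) break;   // no progress; don't loop forever
   TotTrans += NumTrans;
}

pwcValue[BytePos / sizeof(wchar_t)] = 0;   // terminate the result
delete Transcoder;                         // caller owns the transcoder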


Warning: that's completely untested; it may not even compile.  It should
suffice to point you at the right bits of the documentation, though.
Unless, of course, I'm completely wrong and this isn't the right way to go
about it.


-- 
Michael Wojcik
Principal Software Systems Developer, Micro Focus


> -----Original Message-----
> From: Fred Grafe [mailto:[EMAIL PROTECTED] 
> Sent: Friday, June 06, 2003 10:38 AM
> To: [EMAIL PROTECTED]
> Subject: Xerces and Domino DSAPI : Internation characters, 
> Unicode and UTF-8 issue
> Importance: High
> 
> 
> Hi there,
> 
> I am sending XML data that contains international characters 
> to a Domino server (on Win 2k) using the HTTP server and DSAPI.
> For example, the following would contain the Japanese 
> character (say 0x65E5).
> My XML data would look something like this:
> 
> <?xml version="1.0" encoding="utf-8"?>
> <method name="somMethodName">
> <prop name="0Firstname" type="string">
> Hi?
> </prop>
> </method>
> 
> (the question mark is just a placeholder for the Japanese character)
> 
> I need to extract the prop value and place it in a WCHAR 
> because the method I'm calling is expecting wide
> characters. Do I need to do any conversions from XMLCh to 
> WCHAR?  I figure I would use the XMLString::transcode method, 
> then call the mbstowcs function to convert to wchar.  I've read in 
> some post that XMLCh has the same type def as.
> 
> Below is my characters method from my SAX2 handler
> 
> void DsapiSax2Handler::characters
> (
> const XMLCh* const pChars,
> const unsigned int uiLength
> )
>       {
>       switch (m_uiParserState)
>               {
>               case STATE_EXPECT_METHOD:
>                       {
>                       printf("Expect method");
>                       }
> 
>                       break;  //      STATE_EXPECT_METHOD
>               case STATE_EXPECT_PROP:
>                       {
> 
>                       char* pValue = XMLString::transcode(pChars);
> 
>                       printf("Looking in pValue\n");
> 
>                       int i = 0;
>                       while (true)
>                               {
>                               if (pValue[i] == 0)
>                                       {
>                                       printf("Breaking out of loop 1\n");
>                                       break;
>                                       }
> 
>                               int ch = pValue[i];
>                               printf("pValue=%d\n", ch);
> 
>                               i++;
>                               }
>                               
>                       unsigned int uiReqSize = 0;
>                       unsigned int uiConvertedBytes = 0;
> 
>                       uiReqSize = mbstowcs(0, pValue, MB_CUR_MAX);
>                       printf("Required size of wchar = %d\n", uiReqSize);
> 
>                       WCHAR* pwcValue = (WCHAR*) malloc((sizeof(WCHAR) * uiReqSize) + 1);
>                       memset(pwcValue, '\0', (sizeof(WCHAR) * uiReqSize) + 1);
> 
>                       uiConvertedBytes = mbstowcs(pwcValue, pValue, uiReqSize);
>                       printf("Number of converted bytes = %d\n", uiConvertedBytes);
>                       printf("wide character: %lS\n\n", pwcValue);
> 
>                       printf("Looking in pwcValue\n");
> 
>                       i = 0;
>                       while (true)
>                               {
>                               if (pwcValue[i] == 0)
>                                       {
>                                       printf("Breaking out of loop 2\n");
>                                       break;
>                                       }
> 
>                               int ch = pwcValue[i];
>                               printf("pwcValue=%d", ch);
> 
>                               i++;
>                               }
> 
>                       printf("Done\n");
>                       free(pwcValue);
>                       XMLString::release(&pValue);
>                       }
> 
>                       break;  //      STATE_EXPECT_PROP
>               default:
>                       break;
>               }
>       }
> 
> thanks
> fred
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 