Working from limited knowledge of BSTRs: you can treat a BSTR as identical
to a wchar_t* and assume that it is null terminated. In that case you can
easily create a DOMString from it, because (given our assumption) a BSTR is
identical to an XMLCh* on Windows platforms. Since you are talking about a
BSTR, I am assuming that you are working on Windows. Note that on other
platforms the native wchar_t and XMLCh may differ (for example on Solaris
and HP-UX).

The null termination and length issues show up only if an application
chooses to have embedded nulls in the data. A BSTR is capable of containing
embedded nulls but this is application dependent. COM does not throw
embedded nulls into the string. It just gives callers a uniform
allocation/de-allocation mechanism.
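
To make the embedded-null point concrete, here is a small portable sketch. It
does not use the real BSTR type or any COM call: FakeBstr, kPrefixUnits and
nullTerminatedLength are hypothetical names, and the layout (a four-byte
byte count, then 16-bit characters, then a terminating null) is my model of
what the documentation describes, not the actual allocator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Number of 16-bit units occupied by the assumed 4-byte length prefix.
constexpr std::size_t kPrefixUnits = sizeof(std::uint32_t) / sizeof(char16_t);

// Hypothetical stand-in for a BSTR (NOT the real Windows type):
// [4-byte length in bytes][characters...][terminating null]
class FakeBstr {
public:
    FakeBstr(const char16_t* src, std::size_t count)
        : storage_(kPrefixUnits + count + 1, u'\0') {
        assert(src != nullptr || count == 0);
        // Store the length in bytes, excluding the terminator.
        const std::uint32_t byteLength =
            static_cast<std::uint32_t>(count * sizeof(char16_t));
        std::memcpy(storage_.data(), &byteLength, sizeof byteLength);
        std::copy(src, src + count, storage_.begin() + kPrefixUnits);
    }

    // What the caller sees: a pointer to the first character.
    const char16_t* chars() const { return storage_.data() + kPrefixUnits; }

    // Length in characters, recovered from the prefix before the characters.
    std::size_t prefixedLength() const {
        std::uint32_t byteLength = 0;
        std::memcpy(&byteLength, storage_.data(), sizeof byteLength);
        return byteLength / sizeof(char16_t);
    }

private:
    std::vector<char16_t> storage_;
};

// What code that only understands null-terminated strings would compute.
std::size_t nullTerminatedLength(const char16_t* s) {
    return std::char_traits<char16_t>::length(s);
}
```

With a string such as u"ab\0cd" (five characters), prefixedLength() reports 5
while nullTerminatedLength() stops at 2. For a string without embedded nulls
the two agree, which is the case where treating a BSTR as a plain
null-terminated pointer is harmless.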

Here's a link to BSTR documentation on MSDN.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/automat/htm_hh2/chap7_2xgz.asp

IMHO, for most applications, you can probably assume that a BSTR is
identical to a wchar_t* or XMLCh* on Windows. Hence no transcoding is
necessary. Note that this is not portable, but as I said, since you are
working with BSTRs, portability is probably not a concern?
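
If portability ever does matter, the gap between wchar_t and a 16-bit
character type can be bridged with a small copy. The following is only a
sketch under stated assumptions: toUtf16 is a hypothetical helper (not part
of Xerces), it uses char16_t as a stand-in for a 16-bit XMLCh, and it assumes
wchar_t holds UTF-16 when it is 16 bits wide and UTF-32 code points
otherwise. That assumption may not hold on every platform or locale.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copy a wchar_t string into 16-bit code units.
// Assumes wchar_t is UTF-16 (16-bit) or UTF-32 (wider); this is an
// assumption for illustration, not a guarantee on every platform.
std::u16string toUtf16(const std::wstring& in) {
    std::u16string out;
    out.reserve(in.size());
    for (wchar_t wc : in) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (sizeof(wchar_t) == sizeof(char16_t) || cp < 0x10000UL) {
            // Already a single 16-bit unit: copy it unchanged.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Wider wchar_t above the BMP: split into a surrogate pair.
            assert(cp <= 0x10FFFFUL);
            const unsigned long v = cp - 0x10000UL;
            out.push_back(static_cast<char16_t>(0xD800UL + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00UL + (v & 0x3FFUL)));
        }
    }
    return out;
}
```

Where wchar_t is already 16 bits wide, as on Windows, the loop is just an
element-by-element copy, which is why no transcoding is needed there.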

Samar
-----Original Message-----
From: Bavishi, Pankij [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 12, 2002 14:51
To: [EMAIL PROTECTED]
Subject: RE: BSTR


Hello Ellis,
Thanks a lot for a detailed explanation.
Yeah it is very helpful.
But right now on practical grounds, I need to know how I can convert the
BSTR data into something that Xerces-C++ can understand. 
Right now I am only aware of something like DOMString.
Do you have any suggestions, such as using transcode()?
If so, could you please explain in detail?
Thank you so much
pankaj
 
-----Original Message-----
From: Mr Ellis Birt [mailto:[EMAIL PROTECTED]] 
Sent: Monday, February 11, 2002 9:10 AM
To: [EMAIL PROTECTED]
Subject: Re: BSTR
 
Pankij, 
The previous replies to your question did not really explain why BSTR and
DomString are not compatible.  I hope this helps you to understand why:
The only difference between a Unicode BSTR and an ordinary LPWSTR is that
the memory allocated to the BSTR includes a length prefix immediately before
the character that is pointed to (four bytes holding the string's length in
bytes - don't manipulate it directly!).  The buffer that holds the characters
can contain nulls, but they do not determine the end of the string.  In my
experience, however, almost all BSTR strings do have a null in the first
unused character.
DomString is a class (much like the MFC CString and C++ standard library's
string) that makes the handling of variable length string data much simpler
for the programmer.  Its underlying data type is XMLString, which has a
memory image similar to that of BSTR.  Indeed you can get away with using a
BSTR with a null following the last character as a parameter into Xerces
functions that do not modify the value (Xerces does not know about the
length information and cannot update it).  
This is the same problem with trying to use a BSTR instead of a DomString
(since DomString's methods do not know about the BSTR length information).
It would not make sense for the Xerces developers to add support because
BSTR does not exist in the other platforms supported by Xerces.
When using BSTR in place of other (null-terminated) Unicode string types,
remember that as Microsoft says in the documentation for SysAllocStringLen:
"The pch string can contain embedded null characters and does not need to
end with a NULL" 
This could be your undoing if you are not careful! 
NB: if you try the reverse (passing a DomString cast to a BSTR), the function
you are calling is going to look for the non-existent length information
just before the string.  You are getting into serious problems here and
asking for different behaviour between debug and release builds.
A little bit of trivia: BSTR comes from Visual Basic, which has stored its
variable length strings like this (but using bytes for characters) since the
early days (useful to know when you are manipulating a string passed from VB
to a C++ DLL).  When 16-bit OLE automation came out, BSTR was the obvious
format because it already existed in VB.  This was then migrated to 32 bit
with the change to wide characters.
I hope this helps. 
Ellis 
Mr Ellis J Birt Dip. Comp.(Open) 
Applications Developer 
StruMap, Geodesys Ltd, Astwood House, 1296 Evesham Road, Astwood Bank,
Redditch, Worcestershire B96 6AD, UK 
Tel: +44 (0) 1527 893758  Fax: +44 (0)1527 893833 

