RE: Creating an XSLTInputSource for Unicode byte stream

Chatterjee, Urmi Fri, 19 Apr 2002 06:04:21 -0700

Okay. Thanks Dave!

Urmi

-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 5:48 PM
To: [EMAIL PROTECTED]
Subject: RE: Creating an XSLTInputSource for Unicode byte stream

Hi Urmi,

If you have the choice, always go with the byte stream, because the parser
only consumes bytes.

Xalan has no feature to break up the bytes, because that would require
knowing about the "endian-ness" of the stream -- something we'd rather not
have to worry about.  You probably don't want to worry about that either,
so I'd recommend you avoid std::wistream.

If you decide to go with std::wistream, you'll need to write an adapter
between std::wistream and std::istream that knows about the endian-ness of
the stream, and breaks each wide character into two bytes, feeding the
bytes back in the correct order.  Remember you can have both big-endian and
little-endian streams being parsed on big-endian or little-endian hardware.
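
As a rough illustration of the adapter described above, here is a minimal sketch that drains a std::wistream into a UTF-16 byte buffer in a chosen byte order. The names ByteOrder, appendUnit and wideToUtf16Bytes are hypothetical and not part of Xalan. The sketch assumes each wchar_t holds either a UTF-16 code unit (16-bit wchar_t) or a full code point (32-bit wchar_t), and it buffers the whole input rather than adapting it lazily.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type: the byte order wanted in the output.
enum class ByteOrder { Big, Little };

// Append one 16-bit code unit as two bytes in the requested order.
// Shifts work on the value, so the host's own byte order is irrelevant.
inline void appendUnit(std::string& out, std::uint16_t unit, ByteOrder order)
{
    const char hi = static_cast<char>((unit >> 8) & 0xFF);
    const char lo = static_cast<char>(unit & 0xFF);
    if (order == ByteOrder::Big) {
        out += hi;
        out += lo;
    } else {
        out += lo;
        out += hi;
    }
}

// Read every wide character from 'in' and return UTF-16 bytes that
// begin with a byte-order mark matching 'order'.
inline std::string wideToUtf16Bytes(std::wistream& in, ByteOrder order)
{
    std::string out;
    appendUnit(out, 0xFEFF, order);  // byte-order mark

    wchar_t wc;
    while (in.get(wc)) {
        std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (cp > 0xFFFF) {
            // Only reachable with a 32-bit wchar_t: split the code
            // point into a UTF-16 surrogate pair.
            cp -= 0x10000;
            appendUnit(out, static_cast<std::uint16_t>(0xD800 + (cp >> 10)), order);
            appendUnit(out, static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)), order);
        } else {
            appendUnit(out, static_cast<std::uint16_t>(cp), order);
        }
    }
    return out;
}
```

The returned string can then be wrapped in a std::istringstream to obtain the byte-oriented std::istream that the parser consumes. The document's encoding declaration still has to agree with the bytes produced.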

Lastly, if you're really developing a "Unicode-compliant" application (I
don't know what you really mean by this, but I'm making a few guesses),
you'll probably want to support UTF-8 as well, which would preclude using
wide character streams.

Dave

"Chatterjee, Urmi" <URMI.CHATTERJEE@ca.com>
04/18/2002 02:22 PM
Please respond to xalan-dev

To:      [EMAIL PROTECTED]
cc:      (bcc: David N Bertoni/Cambridge/IBM)
Subject: RE: Creating an XSLTInputSource for Unicode byte stream

Hi Dave,

I am sorry, I did mean an instance of std::wistream. And yes, I meant
Unicode characters encoded in UTF-16.

Actually, the application that I have to develop is to be Unicode
compliant. At this point, I do have the option of choosing between using a
byte stream or using a wide character stream as input. Based on what you
said, it would look like using a byte stream (beginning with the correct
byte order mark, and with the correct encoding declaration) would seem like
a quicker option.

However, if I decide to go with an std::wistream instance, is there some
feature in Xalan which would aid me in breaking it up into bytes?

Thanks,
Urmi

-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2002 4:32 PM
To: [EMAIL PROTECTED]
Subject: Re: Creating an XSLTInputSource for Unicode byte stream

Hi Urmi,

I have no idea what a _wstream* is.  Did you mean an instance of
std::wistream?

A byte stream cannot be an instance of a wstream, since the underlying unit
is a wide character.  First, make sure your stream starts with the correct
byte-order mark, the document has the correct encoding declaration, and
initialize a strstream instance with a pointer to the beginning of the
stream and the length _in bytes_ of the stream.  If you're being provided
with an instance of std::wistream, you'll need to provide something which
breaks down wide characters into bytes.  The parser consumes only bytes.
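
To make the "length in bytes" point concrete, here is a minimal sketch that builds a byte-oriented stream over an in-memory UTF-16 buffer. It swaps in std::istringstream for the strstream mentioned above, since strstream is deprecated in standard C++. The name makeByteStream is hypothetical and not part of Xalan.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Build a byte-oriented input stream over a UTF-16 buffer in memory.
// 'lengthInBytes' is the size in bytes, i.e. twice the number of
// 16-bit code units. The explicit length matters because UTF-16 text
// is full of zero bytes, which a NUL-terminated copy would cut short.
inline std::istringstream makeByteStream(const char* data,
                                         std::size_t lengthInBytes)
{
    return std::istringstream(std::string(data, lengthInBytes),
                              std::ios::in | std::ios::binary);
}
```

The address of the resulting stream is what would be handed to the XSLTInputSource constructor that accepts a std::istream pointer. Check the headers of your Xalan version for the exact signature.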

By the way, all this assumes that when you say "Unicode byte stream" you
mean Unicode characters encoded in UTF-16.  Is that correct?

Dave

"Chatterjee, Urmi" <URMI.CHATTERJEE@ca.com>
04/18/2002 11:12 AM
Please respond to xalan-dev

To:      [EMAIL PROTECTED]
cc:      (bcc: David N Bertoni/Cambridge/IBM)
Subject: Creating an XSLTInputSource for Unicode byte stream

Hi,

I have just recently started using Xalan C++ (ver 1.3).

My problem is this. I need to transform XML data using
XalanTransformer::transform(), where my input is a Unicode byte stream in
memory of type _wstream *, and the XSL byte stream is also of type
_wstream *. However, I do not see an XSLTInputSource constructor for the
same.

Can you please help me out on the best way to achieve the transformation?
Do I need to use a transcoder of some sort? I have already built an
ICU-enabled build of Xalan.

Thanks in advance,
Urmi
