As I see it, there are two reasons why you might need to transcode.

First, you might need it to reach an existing algorithm that only works on
char *.  You have some nice tokenizer class, or a regular expression class,
that takes char *.  Instead of rewriting the class to take XMLCh*, you
transcode, process, and perhaps convert the result back.
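
A minimal sketch of that round trip, assuming XMLCh is a 16-bit code unit
(modelled here as char16_t) and the text is pure ASCII.  The names
XMLChString, to_ascii, from_ascii, legacy_trim and trim_via_transcoding are
made up for illustration and belong to no library; a real transcoder would
have to convert non-ASCII input rather than reject it.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

typedef char16_t XMLCh;                        // assumption: 16-bit code unit
typedef std::basic_string<XMLCh> XMLChString;  // hypothetical alias

// Narrow to char.  Only ASCII survives this losslessly, so reject the rest.
inline std::string to_ascii(const XMLChString& in) {
    std::string out;
    out.reserve(in.size());
    for (XMLCh ch : in) {
        if (ch > 0x7F) throw std::range_error("non-ASCII code unit");
        out.push_back(static_cast<char>(ch));
    }
    return out;
}

// Widen back to XMLCh after processing.
inline XMLChString from_ascii(const std::string& in) {
    XMLChString out;
    out.reserve(in.size());
    for (char ch : in) {
        out.push_back(static_cast<XMLCh>(static_cast<unsigned char>(ch)));
    }
    return out;
}

// Stand-in for an existing routine that only understands char strings.
inline std::string legacy_trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Transcode, process, convert the result back.
inline XMLChString trim_via_transcoding(const XMLChString& s) {
    return from_ascii(legacy_trim(to_ascii(s)));
}

// Usage check: the caller only ever sees XMLCh strings.
inline void transcode_demo() {
    assert(trim_via_transcoding(XMLChString(u"  name \n")) == XMLChString(u"name"));
}
```

Every call pays for two copies of the string, which is the cost being
weighed against rewriting the char * class.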

The second scenario is when you are transcoding for output: to display a
string to the user or write it to a file, you need to call a platform
function that assumes ASCII or some other encoding.

As I see it, the C++ standard library deals with the first issue well.  The
char traits classes and the templated std::basic_string class make it
possible to deal with strings abstractly.  Searching, sorting, etc. work
the same whether your XMLCh is an 8-bit signed char or a 64-bit unsigned
long.  Writing good algorithms that are independent of character size is
possible and simple.
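
To illustrate, here is a small sketch of a tokenizer written once as a
template over the character type.  The function names split and split_demo
are made up for this example.  It relies on std::char_traits being available
for the character type; the standard supplies it for char, wchar_t, char16_t
and char32_t, and any other type would need its own traits class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split text on a single delimiter.  Works for any CharT that
// std::basic_string can hold; empty tokens are kept.
template <typename CharT>
std::vector<std::basic_string<CharT> >
split(const std::basic_string<CharT>& text, CharT delim) {
    typedef std::basic_string<CharT> String;
    std::vector<String> tokens;
    typename String::size_type start = 0;
    for (;;) {
        typename String::size_type pos = text.find(delim, start);
        if (pos == String::npos) {
            tokens.push_back(text.substr(start));  // last token
            break;
        }
        tokens.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    return tokens;
}

// Usage check: the same template serves 8-bit and 16-bit code units.
inline void split_demo() {
    assert(split(std::string("a,b,c"), ',').size() == 3);
    assert(split(std::u16string(u"a,b,c"), u',')[1] == u"b");
}
```

No transcoding happens anywhere in this code; the character width is just a
template parameter.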

The second issue is more complex.  When it comes time to deal with
encodings, you just have to bite the bullet and do the conversion.
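
As a sketch of what biting the bullet looks like, here is a hand-written
conversion for output.  It assumes the in-memory strings are UTF-16 and that
the output side wants UTF-8; both are assumptions made for this example, and
the function name utf16_to_utf8 is made up.  A platform that expects some
other encoding needs a different converter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Encode a UTF-16 string as UTF-8 bytes.  Throws on unpaired surrogates.
inline std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::range_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::range_error("unpaired low surrogate");
        }
        if (cp < 0x80) {                         // 1 byte
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {                 // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {               // 3 bytes
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                                 // 4 bytes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Usage check: ASCII passes through unchanged, the euro sign takes 3 bytes.
inline void utf8_demo() {
    assert(utf16_to_utf8(u"abc") == "abc");
    assert(utf16_to_utf8(u"\u20AC") == "\xE2\x82\xAC");
}
```

Unlike a generic algorithm, this code has to know both the source and the
target encoding; a template parameter cannot abstract that away.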

So, while an algorithm can be designed to be independent of a particular
character representation, a program can't escape it for I/O.  My proposal
was to replace DOMString with std::basic_string<XMLCh>, with a possibly
conditional definition of XMLCh.  But I'd be happy if we just used
std::basic_string<XMLCh> where XMLCh was always a 16-bit unsigned integer,
like it is today.  This would allow the use of generic string algorithms,
in the style of the standard library.
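
Roughly what I have in mind, as a sketch only.  The macro
SKETCH_XMLCH_IS_CHAR, the alias XMLChString and the helper functions are
names invented for this example, and char16_t stands in for the 16-bit
unsigned type so that std::char_traits is available for it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumption: a build flag could select the code unit type.  The macro name
// is made up; by default XMLCh is a 16-bit unsigned type.
#if defined(SKETCH_XMLCH_IS_CHAR)
typedef char XMLCh;
#else
typedef char16_t XMLCh;
#endif

typedef std::basic_string<XMLCh> XMLChString;  // hypothetical DOMString replacement

// Standard algorithms apply directly; nothing here depends on the width.
inline bool has_prefix(const XMLChString& s, const XMLChString& prefix) {
    return s.size() >= prefix.size() &&
           std::equal(prefix.begin(), prefix.end(), s.begin());
}

inline void sort_names(std::vector<XMLChString>& names) {
    std::sort(names.begin(), names.end());  // code unit order, not collation
}

// Usage check, with strings spelled as XMLCh arrays so that the code
// compiles under either definition of XMLCh.
inline void alias_demo() {
    const XMLCh xmlns[] = {'x', 'm', 'l', 'n', 's', 0};
    const XMLCh attr[]  = {'x', 'm', 'l', 'n', 's', ':', 'a', 0};
    const XMLCh body[]  = {'b', 'o', 'd', 'y', 0};
    assert(has_prefix(XMLChString(attr), XMLChString(xmlns)));
    assert(!has_prefix(XMLChString(body), XMLChString(xmlns)));
}
```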


-Rob


Julian Pardoe wrote:

>Having XMLChs as something other than chars makes life a major pain.
>Suddenly all the regular facilities one's used to using aren't there any
>more.  Suddenly you're having to convert strings before you can pass them
>to any part of your existing system.  The answer is of course transcoding:
>one can wrap every string access in a call to a transcoder.  But this is
>clumsy and inefficient -- it would be nice if the transcoding were done
>long before the input ever reached you!

