"Fredrick Paul Eisele" <[EMAIL PROTECTED]> writes:
> I would like some advice, and possibly a change to xerces-perl.
> I found a bug in perl-5.6.1 which is related to unicode (actually I found
> several).
[snip]
> Given that the fixes to these bugs will not be generally available for
> a while what can be done in the meantime (I would rather not use
> a perl snapshot).
>
> I am thinking that the perl strings returned by the xerces functions
> should be stripped of their UTF-8 nature. This could be done by
> supplying a function which does this, much like transcode already
> does. Or maybe a global option which controls the behavior of
> transcode? What do you think?
Hey Fredrick Paul,
Yes, as Andreas pointed out, Unicode support works but has bugs in
5.6.1. As far as I can tell, it works flawlessly in 5.7.2.
First, if anyone wants to use Unicode seriously (including
ISO-8859-1), I would suggest that you upgrade to Perl-5.7.2.
Second, I'm happy to add some kind of support to Xerces to control
the global behavior of transcoding. Xerces-P already has an ISO-8859-1
transcoder built in, but I just don't use it. So there could easily
be a global variable that any user can set to control whether
Unicode is used or not.
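
To make the idea concrete, here is a minimal pure-Perl sketch of a flag-controlled conversion. It is only an illustration: the variable $UNICODE_OUT and the function latin1_or_unicode are hypothetical names, not part of XML::Xerces, and the real change would live in the typemaps rather than in Perl code.

```perl
use strict;
use warnings;

# Hypothetical flag, NOT part of XML::Xerces.
# True: leave strings as Unicode. False: hand back ISO-8859-1 bytes.
our $UNICODE_OUT = 1;

# Hypothetical helper. Returns $str unchanged when $UNICODE_OUT is true;
# otherwise returns a byte string with one byte per character.
# Dies if a character has no ISO-8859-1 encoding (code point above 0xFF).
sub latin1_or_unicode {
    my ($str) = @_;
    return $str if $UNICODE_OUT;

    my @code_points = map { ord } split //, $str;
    for my $cp (@code_points) {
        die sprintf("U+%04X has no ISO-8859-1 encoding\n", $cp)
            if $cp > 0xFF;
    }
    return pack('C*', @code_points);
}

# Usage: turn Unicode off, then convert a string.
$UNICODE_OUT = 0;
my $bytes = latin1_or_unicode("caf\x{e9}");
```

The sketch dies on characters above U+00FF rather than substituting a placeholder; a real implementation would have to pick a policy for those.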
Understand however, that this is rather low on my priority list. If
either you or Harwin would like to modify the code, I'd be happy to
test it and include it in the next XML::Xerces snapshot. The code that
you need is all in typemaps.i. You can find an example of how to get
SWIG to wrap a C variable for Perl in Xerces.i:
    bool DEBUG_UTF8_OUT;
    bool DEBUG_UTF8_IN;
Any variable declared outside of a %{ ... %} block gets wrapped as a
global Perl variable:
    package XML::Xerces;
    *DEBUG_UTF8_OUT = *XML::Xercesc::DEBUG_UTF8_OUT;
    *DEBUG_UTF8_IN = *XML::Xercesc::DEBUG_UTF8_IN;
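
The typeglob assignments above make both package names refer to the same variable. A small self-contained sketch of that aliasing follows; the packages Demo and Demo::Lowlevel are made-up stand-ins for XML::Xerces and XML::Xercesc, and as far as I understand a real SWIG-wrapped variable also carries get/set magic that this sketch does not reproduce.

```perl
use strict;
use warnings;

# Stand-in for the SWIG-generated low-level package (hypothetical name).
package Demo::Lowlevel;
our $DEBUG_UTF8_OUT = 0;

# Stand-in for the user-facing package (hypothetical name).
package Demo;
# Alias the whole glob, so $Demo::DEBUG_UTF8_OUT and
# $Demo::Lowlevel::DEBUG_UTF8_OUT are the same scalar.
*DEBUG_UTF8_OUT = *Demo::Lowlevel::DEBUG_UTF8_OUT;

package main;
# Setting the variable through the user-facing name...
$Demo::DEBUG_UTF8_OUT = 1;
# ...is visible through the low-level name as well.
print "low-level value: $Demo::Lowlevel::DEBUG_UTF8_OUT\n";
```

A new global controlling the transcoder could be exported to users the same way.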
Please add some tests in the t/ directory.
jas.