Hi

On 5 August 2013 18:50, Fred <fred.fre...@gmail.com> wrote:

> I have an app that emits XML as 8859-1 (or other encoding as needed), and
> the XML is sent to an Oracle database where the XML is unpacked and the
> contents used to update an existing schema.
>
> I apparently fail to understand something about how char encodings work at
> the intersection of XML and Oracle.
>
> If I send:
>
> <?xml version="1.0" encoding="WINDOWS-1252"?>
> <MSG>
> ...
> <LAST_NAME>BOLAÃÉOS</LAST_NAME>
> ...
> </MSG>
>
> the two accented characters are each transformed into 0xBF (with exactly
> the same result if the encoding is ISO-8859-1 instead of WINDOWS-1252).
>
> However, if I send:
>
> <LAST_NAME>BOLA&#x00c3;&#x00c9;OS</LAST_NAME>
>
> I get the desired result.
>
> While I'm working on figuring out what I'm doing wrong regarding Oracle,
> is there some way I can force libxml2 to emit the second form rather than
> the first?
>
> the tree is output using:
> xmlDocDumpFormatMemoryEnc (doc, xmlbufptr, &xmlbufptr_size,
> "WINDOWS-1252", 1);
>

What happens if you use ASCII instead of WINDOWS-1252?  Windows-1252 and
ISO-8859-1 can represent those characters directly, whereas if the document
is encoded as ASCII they will have to be escaped, so in theory libxml2 will
emit them as numeric character references.  I haven't tried it, though.

Incidentally, 0xBF is '¿' in both Windows-1252 and ISO-8859-1, which is
what Oracle typically substitutes when a character cannot be converted to
the database character set, so the mangling is probably happening during
character set conversion on the Oracle side.
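Untested sketch of that idea (the MSG/LAST_NAME construction below is made
up for illustration; xmlDocDumpFormatMemoryEnc and the "ASCII" output
encoding are real libxml2 API):

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(void) {
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    xmlNodePtr root = xmlNewNode(NULL, BAD_CAST "MSG");
    xmlDocSetRootElement(doc, root);

    /* libxml2 stores node content as UTF-8 internally;
     * these are the UTF-8 bytes for "BOLAÃÉOS" */
    xmlNewChild(root, NULL, BAD_CAST "LAST_NAME",
                BAD_CAST "BOLA\xC3\x83\xC3\x89OS");

    xmlChar *buf = NULL;
    int len = 0;

    /* Requesting ASCII output forces anything outside ASCII to be
     * escaped, so Ã and É should come out as character references
     * (e.g. &#xC3;&#xC9;) rather than as raw 8-bit bytes. */
    xmlDocDumpFormatMemoryEnc(doc, &buf, &len, "ASCII", 1);
    if (buf != NULL) {
        fwrite(buf, 1, len, stdout);
        xmlFree(buf);
    }

    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}

(Compile with something like: gcc test.c $(xml2-config --cflags --libs).)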


> thanks!
>
> Fred
>

-- 
Michael Wood <esiot...@gmail.com>
