Hi

On 5 August 2013 18:50, Fred <fred.fre...@gmail.com> wrote:
> I have an app that emits XML as 8859-1 (or another encoding as needed),
> and the XML is sent to an Oracle database, where the XML is unpacked and
> the contents are used to update an existing schema.
>
> I apparently fail to understand something about how character encodings
> work at the intersection of XML and Oracle.
>
> If I send:
>
> <?xml version="1.0" encoding="WINDOWS-1252"?>
> <MSG>
> ...
> <LAST_NAME>BOLA<C3><C9>OS</LAST_NAME>
> ...
> </MSG>
>
> the two accented characters are each transformed into 0xBF (with exactly
> the same result if it's 8859-1 instead of WINDOWS-1252).
>
> However, if I send:
>
> <LAST_NAME>BOLAÃ ÉOS</LAST_NAME>
>
> I get the desired result.
>
> While I'm working on figuring out what I'm doing wrong regarding Oracle,
> is there some way I can force libxml2 to emit the second form rather than
> the first?
>
> The tree is output using:
>
>     xmlDocDumpFormatMemoryEnc (doc, xmlbufptr, &xmlbufptr_size,
>                                "WINDOWS-1252", 1);

What happens if you use "ascii" instead of "WINDOWS-1252"? Windows-1252
and ISO-8859-1 can include those characters as is, whereas if the document
is encoded as ASCII they will need to be escaped, so in theory libxml2
will escape them. I haven't tried it, though.

> thanks!
>
> Fred

-- 
Michael Wood <esiot...@gmail.com>
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml