On Mon, Feb 13, 2006 at 02:07:32PM -0800, Eric Seidel wrote:
> I'm reading in data off the network, converting it to utf16, and then
> passing it off to libxml2.  In the parser4 adapted example, I'm
> reading ascii from a local file, expanding it to integers
> (effectively utf16) and then passing it to libxml2:
[...]
>     const unsigned BOM = 0xFEFF;
>     const unsigned char BOMHighByte = *(const unsigned char *)&BOM;
>     xmlSwitchEncoding(ctxt, BOMHighByte == 0xFF ?
>         XML_CHAR_ENCODING_UTF16LE : XML_CHAR_ENCODING_UTF16BE);

  What did you expect to achieve that way?!?

  UTF-16 is one of the encodings that an XML parser must autodetect and use:

    http://www.w3.org/TR/REC-xml/#sec-guessing

What you are doing may perfectly well break the parser's internal
detection. You must not use xmlSwitchEncoding() unless you're an expert
in the way libxml2 internals work. So don't do this, at least at this
stage!

  Actually, even converting to UTF-16 from the external source is just
plain broken: the XML declaration may state that the document is in some
other encoding, and then the actual bytes and the declared encoding will
conflict. Really not a good idea. Again, unless you really, really know
what you're doing, you should never attempt to work around the parser
autodetection code. You're playing with the conformance of the parser to
the spec, so this is on the edge of what is acceptable from client code.

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
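
To make the autodetection point concrete, here is a minimal sketch of the kind of first-bytes check described in the "sec-guessing" appendix of the XML spec linked above. It is not libxml2's code: the enum and the function name guess_xml_encoding() are hypothetical names made up for this illustration, and UTF-32, EBCDIC and the other cases from the appendix are left out. The point is only that the byte order mark, or the byte pattern of "<?", is enough for a parser to pick UTF-16LE or UTF-16BE by itself, so client code has no reason to force it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch (not libxml2's xmlCharEncoding). */
typedef enum {
    GUESS_UNKNOWN,      /* nothing recognizable; a parser would default to UTF-8 */
    GUESS_UTF8_BOM,     /* EF BB BF */
    GUESS_UTF16LE,      /* FF FE, or "<?" encoded as 3C 00 3F 00 */
    GUESS_UTF16BE,      /* FE FF, or "<?" encoded as 00 3C 00 3F */
    GUESS_ASCII_COMPAT  /* "<?xm": read the encoding declaration to decide */
} guessed_encoding;

/* Simplified guess from the first bytes of a document.
 * Limitation: UTF-32 is ignored, so a UTF-32LE BOM (FF FE 00 00)
 * would be reported as UTF-16LE here. */
guessed_encoding guess_xml_encoding(const unsigned char *buf, size_t len)
{
    static const unsigned char utf8_bom[3]     = { 0xEF, 0xBB, 0xBF };
    static const unsigned char utf16be_decl[4] = { 0x00, 0x3C, 0x00, 0x3F };
    static const unsigned char utf16le_decl[4] = { 0x3C, 0x00, 0x3F, 0x00 };
    static const unsigned char ascii_decl[4]   = { 0x3C, 0x3F, 0x78, 0x6D };

    assert(buf != NULL || len == 0);

    /* Byte order marks come first. */
    if (len >= 3 && memcmp(buf, utf8_bom, 3) == 0)
        return GUESS_UTF8_BOM;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return GUESS_UTF16BE;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return GUESS_UTF16LE;

    /* No BOM: look at how the leading "<?" of the XML declaration is laid out. */
    if (len >= 4) {
        if (memcmp(buf, utf16be_decl, 4) == 0)
            return GUESS_UTF16BE;
        if (memcmp(buf, utf16le_decl, 4) == 0)
            return GUESS_UTF16LE;
        if (memcmp(buf, ascii_decl, 4) == 0)
            return GUESS_ASCII_COMPAT;
    }
    return GUESS_UNKNOWN;
}
```

With libxml2 itself none of this is needed in client code: handing the raw, unconverted bytes to the parser (for example xmlReadMemory() with a NULL encoding argument) leaves the detection to the library, and the encoding stated in the XML declaration then stays consistent with the bytes actually parsed.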