On Tue, Dec 19, 2006 at 09:01:24AM +0100, Andreas Tscharner wrote:
> xmlDocDumpMemory(xmlProg, &xmlStr, &xmlStrLen, "UTF-8");
>
> My problem: I have node values containing umlauts (for example: "Früchte").
> Although I specify "UTF-8" as encoding and although I use
> xmlEncodeSpecialChars(xmlProg, "Früchte"), at the time I use it, the encoding
> is not yet specified and if I write the buffer to a file, the BOM is written,
> but the actual encoding is cp1252 (I'm working on Windows). And if I try to
> read the document again, libxml2 complains that the document is not UTF-8,
> which is correct (the "ü" in "Früchte" has a value with bit 8 set)

When you build the tree you *must* pass UTF-8 strings each time there is
an xmlChar * argument. libxml2 will not try to guess what you passed to it,
nor will it do on-the-fly checking or conversion. If you pass a cp1252
encoded string where an xmlChar * argument is required, you break the API,
and this may lead to this kind of problem. You must convert the strings
passed through the tree building APIs yourself.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
                     | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
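
As an illustration of the "convert the strings yourself" step, here is a minimal sketch in plain C. The helper name cp1252_to_utf8 is hypothetical (it is not part of libxml2), and the sketch assumes the input really is cp1252; bytes that cp1252 leaves undefined are replaced with U+FFFD, which is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Unicode code points for cp1252 bytes 0x80..0x9F; 0 marks bytes that
 * cp1252 leaves undefined. Bytes 0xA0..0xFF map to U+00A0..U+00FF. */
static const unsigned short cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Hypothetical helper (not a libxml2 function): returns a malloc'ed,
 * NUL-terminated UTF-8 copy of a cp1252 string, or NULL if allocation
 * fails. The caller frees the result with free(). */
unsigned char *cp1252_to_utf8(const char *in)
{
    size_t len, i;
    unsigned char *out, *p;

    assert(in != NULL);
    len = strlen(in);

    /* Each cp1252 byte expands to at most 3 UTF-8 bytes. */
    out = malloc(len * 3 + 1);
    if (out == NULL)
        return NULL;
    p = out;

    for (i = 0; i < len; i++) {
        unsigned char b = (unsigned char)in[i];
        unsigned int cp = b;

        if (b >= 0x80 && b <= 0x9F) {
            cp = cp1252_high[b - 0x80];
            if (cp == 0)
                cp = 0xFFFD; /* assumption: substitute undefined bytes */
        }

        if (cp < 0x80) {                 /* 1-byte sequence (ASCII) */
            *p++ = (unsigned char)cp;
        } else if (cp < 0x800) {         /* 2-byte sequence */
            *p++ = (unsigned char)(0xC0 | (cp >> 6));
            *p++ = (unsigned char)(0x80 | (cp & 0x3F));
        } else {                         /* 3-byte sequence */
            *p++ = (unsigned char)(0xE0 | (cp >> 12));
            *p++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            *p++ = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    *p = '\0';
    return out;
}
```

The returned buffer can be cast to const xmlChar * and handed to tree building calls such as xmlNewTextChild(), then released with free() once the node has been created, since libxml2 copies the content. Note that xmlEncodeSpecialChars() only escapes markup characters such as & and <; it does not change the character encoding, so it cannot stand in for this conversion. If the text is plain ISO-8859-1 rather than cp1252, libxml2's own isolat1ToUTF8() from encoding.h is probably the simpler route.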
