On Thu, Feb 4, 2010 at 12:53 AM, Daniel Veillard <[email protected]> wrote:
> On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
>> I can't seem to pass an encoding to xmlParseInNodeContext. This is
>> problematic when dealing with UTF-8 HTML documents. I can tell
>> libxml2 what encoding to use when originally parsing the document, but
>> it looks like that is completely ignored when using
>> xmlParseInNodeContext. Reference nodes in HTML documents completely
>> ignore the original document encoding and use ISO-8859-1.
>>
>> Here is a sample program to illustrate the problem:
>>
>>   http://pastie.org/808860
>>
>> I tried putting together a patch, and it didn't seem to work:
>>
>>   http://pastie.org/808862
>>
>> Ideally, I would like a function similar to xmlParseInNodeContext, but
>> one that takes an encoding as a parameter. Thanks!
>
> Rather than add Yet Another Entry Point, I think the most logical
> is to parse using the encoding from the document, since it's an "in
> context" parsing, i.e. parsing as if the fragment was coming from that
> document. The encoding switch is a bit harder than what you hoped for,
> but it's not that hard, the patch enclosed seems to do it for me, please
> have a try.

Perfect. It works great for me! Thank you very much!

Any suggestions for workarounds for older versions of libxml2? I'm
tempted to copy this function into my C code, but I'd rather not if
possible.

-- 
Aaron Patterson
http://tenderlovemaking.com/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
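On the workaround question, one possible stopgap for unpatched releases, offered as a sketch only and not tested against libxml2 itself: since the fragment is reported to be read as ISO-8859-1, and ASCII bytes mean the same thing in ISO-8859-1 and UTF-8, the fragment could be made pure ASCII before it is handed to xmlParseInNodeContext by turning every non-ASCII code point into a numeric character reference. The helper name escape_non_ascii below is hypothetical (it is not a libxml2 function), and it does only minimal UTF-8 validation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of libxml2.
 * Returns a newly allocated ASCII-only copy of the UTF-8 string `in`,
 * with each non-ASCII code point replaced by a decimal character
 * reference such as &#233;.  Returns NULL on allocation failure or on
 * a malformed UTF-8 sequence.  The caller frees the result.
 * Overlong forms and surrogates are not rejected (minimal validation).
 */
char *escape_non_ascii(const char *in)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t len = strlen(in);
    size_t o = 0;
    /* Worst case is a 2-byte sequence growing to 7 bytes ("&#2047;"),
     * so 4 output bytes per input byte is always enough. */
    char *out = malloc(len * 4 + 1);

    if (out == NULL)
        return NULL;

    while (*p) {
        unsigned long cp;
        int extra;

        if (*p < 0x80) {                 /* plain ASCII: copy through */
            out[o++] = (char)*p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) { /* 2-byte lead */
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) { /* 3-byte lead */
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) { /* 4-byte lead */
            cp = *p & 0x07; extra = 3;
        } else {                          /* stray continuation or bad lead */
            free(out);
            return NULL;
        }
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80) {    /* also catches a premature NUL */
                free(out);
                return NULL;
            }
            cp = (cp << 6) | (unsigned long)(*p & 0x3F);
            p++;
        }
        o += (size_t)sprintf(out + o, "&#%lu;", cp);
    }
    out[o] = '\0';
    return out;
}
```

The escaped string, with its strlen as datalen, would then go to xmlParseInNodeContext in place of the raw fragment. This assumes the HTML parser expands the references back to the intended characters in text and attribute values; it would presumably not help for content inside script or style elements, where character references are not expanded.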
