On Thu, Feb 4, 2010 at 12:53 AM, Daniel Veillard <[email protected]> wrote:
> On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
>> I can't seem to pass an encoding to xmlParseInNodeContext.  This is
>> problematic when dealing with UTF-8 HTML documents.  I can tell
>> libxml2 what encoding to use when originally parsing the document, but
>> it looks like that is completely ignored when using
>> xmlParseInNodeContext.  Reference nodes in HTML documents completely
>> ignore the original document encoding and use ISO-8859-1.
>>
>> Here is a sample program to illustrate the problem:
>>
>> http://pastie.org/808860
>>
>> I tried putting together a patch, and it didn't seem to work:
>>
>> http://pastie.org/808862
>>
>> Ideally, I would like a function similar to xmlParseInNodeContext, but
>> one that takes an encoding as a parameter.  Thanks!
>
>  Rather than add Yet Another Entry Point, I think the most logical
> approach is to parse using the encoding from the document, since it's
> an "in context" parsing, i.e. parsing as if the fragment were coming
> from that document. The encoding switch is a bit harder than what you
> hoped for, but it's not that hard. The enclosed patch seems to do it
> for me; please give it a try.

Perfect.  It works great for me!  Thank you very much!

Any suggestions for workarounds for older versions of libxml2?  I'm
tempted to copy the patched function into my C code, but I'd rather not
if possible.
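
One workaround that might avoid copying the function: make the fragment
pure ASCII before handing it to xmlParseInNodeContext, by replacing every
non-ASCII character with a numeric character reference.  An ASCII-only
fragment reads the same whether the parser assumes ISO-8859-1 or UTF-8.
This is only a sketch.  escape_non_ascii is a hypothetical helper of my
own, not a libxml2 function, and I have not verified it against older
libxml2 releases.  It also assumes the input is UTF-8 and does only
minimal validation.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libxml2).  Returns a newly malloc'd,
 * NUL-terminated ASCII copy of the UTF-8 string `in`, with each
 * non-ASCII code point replaced by a numeric character reference such
 * as "&#233;".  Returns NULL on allocation failure or on a malformed or
 * truncated sequence.  Overlong encodings and surrogates are not
 * rejected.  The caller frees the result. */
char *escape_non_ascii(const char *in)
{
    const unsigned char *p = (const unsigned char *)in;
    /* Worst case is a 2-byte sequence expanding to 7 chars ("&#2047;"),
     * so 4 output bytes per input byte is always enough. */
    char *out = malloc(strlen(in) * 4 + 1);
    char *q = out;

    if (out == NULL)
        return NULL;

    while (*p) {
        unsigned long cp;
        int extra;

        if (*p < 0x80) {                    /* plain ASCII: copy as-is */
            *q++ = (char)*p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 2-byte sequence */
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {   /* 3-byte sequence */
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {   /* 4-byte sequence */
            cp = *p & 0x07; extra = 3;
        } else {                            /* invalid lead byte */
            free(out);
            return NULL;
        }
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80) {      /* missing continuation byte */
                free(out);
                return NULL;
            }
            cp = (cp << 6) | (unsigned long)(*p++ & 0x3F);
        }
        q += sprintf(q, "&#%lu;", cp);
    }
    *q = '\0';
    return out;
}
```

The escaped string would then be passed to xmlParseInNodeContext in place
of the original buffer, with its length taken from strlen.  A caveat:
character references are not expanded inside elements such as script or
style in HTML, so this would not suit fragments that carry non-ASCII text
there.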

-- 
Aaron Patterson
http://tenderlovemaking.com/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
