On Tue, Sep 11, 2007 at 01:26:30PM +0200, Stefan Behnel wrote:
> 
> Daniel Veillard wrote:
> > On Mon, Sep 10, 2007 at 09:45:10AM +0200, Stefan Behnel wrote:
> >> Hi,
> >>
> >> there isn't currently an API function for resetting a push parser context for
> >> the HTML parser. However, resetting it for reuse doesn't seem to be trivial.
> >> It looks like I have to run htmlCtxtReset() and then create and set up an
> >> input stream (in a pretty ugly way, according to the Create code...). This
> >> could well motivate an official function.
> >>
> >> I also thought about using the xmlCtxtResetPush function, but then I stumble
> >> over things like the spaceTab setup (which is currently a sure crasher for me).
> >>
> >> Is there anything else I have to do to implement this functionality by hand?
> >> And: is there an easier way?
> > 
> >   Honestly, I don't know. I don't see why xmlCtxtResetPush() would not
> > work for an html parser context.
> 
> In case others are interested, the code below works for me (Pyrex code, but
> should be readable).
> 
> Stefan
> 
> 
> cdef int _htmlCtxtResetPush(xmlparser.xmlParserCtxt* c_ctxt,
>                             char* c_data, int buffer_len,
>                             char* c_encoding, int parse_options) except -1:
>     # libxml2 crashes if spaceTab is not initialised
>     if _LIBXML_VERSION_INT < 20629 and c_ctxt.spaceTab is NULL:
>         c_ctxt.spaceTab = <int*>tree.xmlMalloc(10 * sizeof(int))
>         if c_ctxt.spaceTab is NULL:
>             python.PyErr_NoMemory()
>         c_ctxt.spaceMax = 10

  xmlCtxtResetPush should instead be fixed to cope with that condition.
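
  To make the idea concrete, here is a minimal sketch of the kind of guard
meant above. It uses a stand-in struct (demo_ctxt) and hypothetical helper
names (demo_ensure_space_tab, demo_reset), not libxml2's real xmlParserCtxt
or its actual reset code, and the values written are only illustrative.

```c
#include <stdlib.h>

/* Stand-in for the parser context fields discussed above.
 * This is NOT libxml2's real struct, only an illustration. */
typedef struct {
    int *spaceTab;  /* stack of xml:space values */
    int  spaceNr;   /* entries currently in use */
    int  spaceMax;  /* allocated capacity of spaceTab */
} demo_ctxt;

/* Hypothetical helper: make sure spaceTab exists before a reset uses it.
 * Returns 0 on success, -1 on a NULL context or allocation failure. */
int demo_ensure_space_tab(demo_ctxt *ctxt)
{
    if (ctxt == NULL)
        return -1;
    if (ctxt->spaceTab == NULL) {
        ctxt->spaceTab = malloc(10 * sizeof(int));
        if (ctxt->spaceTab == NULL)
            return -1;
        ctxt->spaceMax = 10;
    }
    return 0;
}

/* Hypothetical reset: run the guard first, then restore an initial
 * stack state instead of dereferencing a possibly NULL pointer. */
int demo_reset(demo_ctxt *ctxt)
{
    if (demo_ensure_space_tab(ctxt) != 0)
        return -1;
    ctxt->spaceNr = 1;
    ctxt->spaceTab[0] = -1;
    return 0;
}
```

  With a guard like this inside the reset path, a caller would not need the
version check and manual xmlMalloc() from the quoted code; an existing
spaceTab is reused as-is and only a missing one gets allocated.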

>     # libxml2 lacks an HTML push parser setup function
>     error = xmlparser.xmlCtxtResetPush(c_ctxt, NULL, 0, NULL, c_encoding)
>     if error:
>         return error
> 
>     # fix libxml2 setup for HTML
>     c_ctxt.progressive = 1
>     c_ctxt.html = 1
>     htmlparser.htmlCtxtUseOptions(c_ctxt, parse_options)
> 
>     if c_data is not NULL and buffer_len > 0:
>         return htmlparser.htmlParseChunk(c_ctxt, c_data, buffer_len, 0)

  If you think a real C function htmlCtxtResetPush might be useful, then
as usual I take patches! :-)

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/