Daniel Veillard wrote:
> On Mon, Sep 10, 2007 at 09:45:10AM +0200, Stefan Behnel wrote:
>> Hi,
>>
>> there isn't currently an API function for resetting a push parser context for
>> the HTML parser. However, resetting it for reuse doesn't seem to be trivial.
>> It looks like I have to run htmlCtxtReset() and then create and set up an
>> input stream (in a pretty ugly way, according to the Create code...). This
>> could well motivate an official function.
>>
>> I also thought about using the xmlCtxtResetPush function, but then I stumble
>> over things like the spaceTab setup (which is currently a sure crasher for
>> me).
>>
>> Is there anything else I have to do to implement this functionality by hand?
>> And: is there an easier way?
>
> Honestly I don't know. I don't see why xmlCtxtResetPush() would not
> work for an html parser context.

In case others are interested, the code below works for me (Pyrex code, but
should be readable).

Stefan

cdef int _htmlCtxtResetPush(xmlparser.xmlParserCtxt* c_ctxt,
                            char* c_data, int buffer_len,
                            char* c_encoding, int parse_options) except -1:
    # libxml2 < 2.6.29 crashes if spaceTab is not initialised
    if _LIBXML_VERSION_INT < 20629 and c_ctxt.spaceTab is NULL:
        c_ctxt.spaceTab = <int*>tree.xmlMalloc(10 * sizeof(int))
        if c_ctxt.spaceTab is NULL:
            python.PyErr_NoMemory()
        c_ctxt.spaceMax = 10

    # libxml2 lacks an HTML push parser setup function
    error = xmlparser.xmlCtxtResetPush(c_ctxt, NULL, 0, NULL, c_encoding)
    if error:
        return error

    # fix libxml2 setup for HTML
    c_ctxt.progressive = 1
    c_ctxt.html = 1
    htmlparser.htmlCtxtUseOptions(c_ctxt, parse_options)

    # optionally push the first chunk of data right away
    if c_data is not NULL and buffer_len > 0:
        return htmlparser.htmlParseChunk(c_ctxt, c_data, buffer_len, 0)
    return 0
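
For anyone who wants to play with the reset-then-push calling pattern without
building against libxml2, here is a rough analogy in plain Python using the
standard library's html.parser module. To be clear, this is not libxml2 and
not the Pyrex code above: LinkCollector and collect_links are names made up
for this illustration, and the sketch only mirrors the sequence "reset the
context, push chunks, finish", assuming that reusing one parser object across
documents is the goal.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical push-style parser that records href values of <a> tags."""

    def reset(self):
        # HTMLParser.__init__ calls reset(), and we call it again before
        # every reuse, so all per-document state is initialised here.
        super().reset()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(parser, chunks):
    """Reset the parser, push the chunks through it, return the links found."""
    parser.reset()            # rough counterpart of resetting the context
    for chunk in chunks:
        parser.feed(chunk)    # rough counterpart of pushing one chunk
    parser.close()            # flush whatever is still buffered
    return list(parser.links)


parser = LinkCollector()

# The start tag is split across two chunks; the parser buffers the
# incomplete part until the rest arrives.
first = collect_links(parser, ['<p><a hre', 'f="http://xmlsoft.org/">libxml2</a></p>'])

# Same parser object, second document: nothing leaks over from the first run.
second = collect_links(parser, ['<a href="a.html">a</a><a href="b.html">b</a>'])
```

The point of keeping the state setup in reset() is the same as in the function
above: everything a fresh context needs is re-established in one place, so the
caller can reuse the object instead of creating a new one per document.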
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
http://mail.gnome.org/mailman/listinfo/xml