Daniel Veillard <[EMAIL PROTECTED]> writes:
>> shows this for every document I get back that parses:
>>
>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
>> "http://www.w3.org/TR/REC-html40/loose.dtd">
>>
>> Here's the relevant bit of the loader again:
>>
>> # The parserContext and resulting document
>> parserContext = libxml2.parserCtxt(_obj=pctx)
>
> what is pctx ??? i find suspicious the fact you could provide a C parser
> context here.
This is inside a document loader implementation. The parser context is
passed in to the loader; here's the function again:
def loader(url, pctx, ctx, type):
    doc = None
    context_object = None
    if type:
        context_object = libxslt.stylesheet(_obj=ctx)
    else:
        context_object = libxslt.transformCtxt(_obj=ctx)
    # The parserContext and resulting document
    parserContext = libxml2.parserCtxt(_obj=pctx)
    doc = None
    if url == "/one":
        doc = parserContext.htmlCtxtReadFile("file2.html", "UTF8", 1)
    else:
        doc = parserContext.ctxtReadDoc("""<document>
<h1>this is xml</h1>
</document>""", url, "UTF8", 0)
    return doc
And here's where the loader gets set:
try:
    libxslt.setLoaderFunc(loader)
except Exception, e:
    # Whoops! serious error
Note the pctx in the loader arg list.
>
>> doc = None
>> if url == "/one":
>> doc = parserContext.htmlCtxtReadFile("file2.html", "UTF8", 1)
>> else:
>> doc = parserContext.ctxtReadDoc("""<document>
>
> just use htmlReadFile and forget about trying to address directly the
> parser context. With python overhead you won't gain anything to create
> a separately accessible object. The less you touch things though Python
> the better it will be, really. That said HTML parsing works for me when
> using htmlReadFile.
From a loader?
I thought it was not possible. I'll try it and see!
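
For reference, here is a rough sketch of what that suggestion would look
like in the loader. It is untested and rests on an assumption: that the
module-level libxml2.htmlReadFile and libxml2.readDoc can be called from
inside a loader callback, which is exactly the open question above. The
pctx and ctx arguments are simply ignored, and the INLINE_XML name and the
import guard are only there for illustration.

```python
# Guard the imports so the sketch can be loaded without the bindings installed.
try:
    import libxml2
    import libxslt
except ImportError:
    libxml2 = libxslt = None

# Same inline document as in the original loader (name is illustrative).
INLINE_XML = """<document>
<h1>this is xml</h1>
</document>"""


def loader(url, pctx, ctx, type):
    # pctx and ctx are deliberately left untouched here; the documents are
    # parsed with the module-level readers instead of the passed-in context.
    if url == "/one":
        # Same file, encoding and options as the htmlCtxtReadFile call above.
        return libxml2.htmlReadFile("file2.html", "UTF8", 1)
    # Everything else gets the inline XML document, as before.
    return libxml2.readDoc(INLINE_XML, url, "UTF8", 0)


# Register the loader only when the bindings are actually available.
if libxslt is not None:
    libxslt.setLoaderFunc(loader)
```

If this does work from a loader, the parserCtxt wrapper around pctx is not
needed at all.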
--
Nic Ferrier
http://www.tapsellferrier.co.uk for all your tapsell ferrier needs
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
http://mail.gnome.org/mailman/listinfo/xml