On Fri, Jul 27, 2007 at 11:06:36AM +0200, Stefan Behnel wrote:
> Hi,
> 
> one of the lxml users noticed that libxml2 changes behaviour when you set the
> NONET option for xmlCtxtReadFile() and then call it twice on a network URL.
> The first time, it parses the external document. The second time, it refuses
> to parse it.
> 
> The problem lies in the handling of the parser options, which are only set
> *after* the first call to xmlLoadExternalEntity(), in the following call to
> xmlDoRead(). I think this is ok in general as it allows users to parse from a
> URL by passing it in but to avoid additional network access when loading
> external entities transitively (DTDs etc.) - is this the intended semantics of
> the NONET option?

  Hum, no. The NONET semantic is that any access outside the local filesystem
should generate an error. Note that if you have a catalog remapping external
resources to local ones, then they should proceed without failure.

> Now, the thing is, when you reuse the parser context, then the options *stay*
> in the context when you use it the second time, so they will be picked up by
> the xmlLoadExternalEntity() call when running xmlCtxtReadFile() a second time.

  That's weird.

> Depending on how contexts are reused in an application, this can lead to
> unpredictable behaviour. In lxml, we can work around this by resetting the
> context options after parsing, but I would like to see the intended semantics
> of the NONET option cleared up and see reliable behaviour here.

  In general you should always reset the parsing context, as the xmlCtxtRead*
functions do.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml