On Sun, Nov 26, 2006 at 09:33:09AM -0500, Elliotte Harold wrote:
> What happens when libxml, invoked via xmlreader (itself invoked via
> PHP's XmlReader) detects a well-formedness error? How is the error
> reported to the client application?
Either through the global default error handler or through one registered with
http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderSetErrorHandler
> In my experiments it seems that the read method merely returns false.
No, libxml2 always raises an error.
> If
> that's true, is there a way to distinguish between this case and the
> simple end of the document?
The reader should signal end of document by returning 0, not -1; a -1 return
indicates an error.
> A related question: Theoretically, the parser could report data up to
> the first error it finds. In my experiments with small documents,
> however, it actually errors out immediately.
I think a lot of what you are seeing is specific to PHP, on which
unfortunately I can't comment.
> I suspect the underlying
> parser is preparsing a large chunk of the document, caching it, and then
> doling it out a piece at a time. Thus it tends to detect errors
> prematurely. Is this accurate?
That's how libxml2 operates underneath.
> If so, is there a limit to how much it will preparse? I assume it's not
> loading the whole document into a DOM first, and then iterating through
> that.
No, unless you ask for it. The amount buffered depends on a number of factors,
mostly the document itself, and potentially other things like RNG validation.
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
[EMAIL PROTECTED] | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/