On Sun, Nov 26, 2006 at 09:33:09AM -0500, Elliotte Harold wrote:
> What happens when libxml, invoked via xmlreader (itself invoked via 
> PHP's XmlReader) detects a well-formedness error? How is the error 
> reported to the client application?

  Either through the global default error handler, or through a handler
registered with
    http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderSetErrorHandler

> In my experiments it seems that the read method merely returns false.

  No, libxml2 always raises an error.

> If 
> that's true, is there a way to distinguish between this case and the 
> simple end of the document?

  The reader should return 0 at the end of the document, not -1;
a -1 return signals an error.
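
  A minimal sketch in C of how a caller might tell the two cases apart,
assuming the xmlTextReaderRead return convention (1 = node read, 0 = end of
document, -1 = error). The names drain, read_step_fn and scripted_reader are
hypothetical, made up for illustration; the step callback stands in for a
call to xmlTextReaderRead on a real reader, and the scripted reader only
replays fixed return codes so the loop can be exercised without libxml2.

```c
#include <assert.h>
#include <stddef.h>

/* Why the read loop stopped: clean end of document, or an error. */
enum read_status { READ_END = 0, READ_ERROR = -1 };

struct drain_result {
    enum read_status status; /* reason the loop stopped */
    size_t nodes;            /* nodes delivered before it stopped */
};

/* Hypothetical stand-in for one read step. With libxml2 this would wrap
 * xmlTextReaderRead(reader): 1 = node read, 0 = no more nodes, -1 = error. */
typedef int (*read_step_fn)(void *ctx);

struct drain_result drain(read_step_fn step, void *ctx)
{
    struct drain_result res = { READ_END, 0 };
    int ret;

    assert(step != NULL);
    while ((ret = step(ctx)) == 1)
        res.nodes++; /* a node is available; process it here */

    /* 0 is a clean end of document; anything else is treated as an error. */
    res.status = (ret == 0) ? READ_END : READ_ERROR;
    return res;
}

/* Scripted fake reader, for demonstration only: replays fixed return codes. */
struct scripted_reader {
    const int *codes;
    size_t len;
    size_t pos;
};

int scripted_read(void *ctx)
{
    struct scripted_reader *r = ctx;
    return (r->pos < r->len) ? r->codes[r->pos++] : 0;
}
```

  With a script of 1, 1, 0 the loop stops with READ_END after two nodes; with
1, 1, 1, -1 it stops with READ_ERROR after three nodes, so whatever was
delivered before the error has already been seen by the caller.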

> A related question: Theoretically, the parser could report data up to 
> the first error it finds. In my experiments with small documents, 
> however, it actually errors out immediately.

  I think a lot of what you are seeing is specific to PHP, on which
unfortunately I can't comment.

> I suspect the underlying 
> parser is preparsing a large chunk of the document, caching it, and then 
> doling it out a piece at a time. Thus it tends to detect errors 
> prematurely. Is this accurate?

  That's how libxml2 operates underneath.

> If so, is there a limit to how much it will preparse? I assume it's not 
> loading the whole document into a DOM first, and then iterating through 
> that.

  No, unless you ask for it. The amount buffered depends on a number of
factors, mostly the document, and potentially other things like RNG
(Relax NG) validation.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
