On Sun, Jul 10, 2011 at 06:26:58PM -0400, Noam Postavsky wrote:
> Jon <jon.for...@gmail.com> writes:
> 
> >> In many cases you don't even need that. Write a shell XML file,
> >> 
> >> <!DOCTYPE wrapper SYSTEM "the-dtd-file.dtd" [
> >>   <!ELEMENT wrapper (the-real-root-element)>
> >>   <!ENTITY the-real-document SYSTEM "bigfile.xml">
> >> ]>
> >> <wrapper>&the-real-document;</wrapper>
> >
> > Will the libxml2 implementation try to bring the entire &the-real-document; 
> > entity into memory, or will it stream it if I use the SAX2 or Reader API?  
> > My gut tells me both the dtd and the bigfile.xml will be completely parsed 
> > into memory. This is fine for the dtd but not for the bigfile.xml.
> 
> A reading of xmlParseReference suggests your gut is wrong. :)
> 
> http://git.gnome.org/browse/libxml2/tree/parser.c#n6823

  Yeah, I would think that for an external parsed entity we create a
new input stream and feed it to the parser, hence progressively.
This may work in constant memory for SAX, but unfortunately I'm afraid
that for the reader we still build a tree for the entity content
(stored in ent->children). So yes, we parse it progressively, but no,
unfortunately we accumulate the tree in memory :-\

  The real solution would be to allow DTD validation from a preparsed
DTD at the xmlreader level directly. In my defense, validating against
a DTD not referenced from the document is not a scenario actually
described by XML-1.0, and the way it's implemented will diverge slightly
from what happens when the DTD is referenced with a DOCTYPE. Which is
why I think the cleanest approach is to use a custom I/O layer which
automatically adds the DOCTYPE at the beginning of the document; that's
the safest and fastest option at this point, in my opinion.
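
  To make the custom I/O idea concrete, here is a minimal sketch of a
read callback that serves a DOCTYPE string first and then the bytes of
the real file. It is only an illustration: the names doctype_io_ctx,
doctype_io_init, doctype_io_read and doctype_io_close are made up for
this example. The callbacks have the same shape as xmlInputReadCallback
and xmlInputCloseCallback, so they should be usable with xmlReaderForIO,
but the sketch itself uses only stdio. It assumes the document does not
start with an XML declaration or a BOM, since the DOCTYPE would then
land in front of it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical context: a DOCTYPE prefix plus the real document stream. */
typedef struct {
    const char *prefix;   /* DOCTYPE text to inject before the document */
    size_t prefix_len;    /* length of the prefix in bytes */
    size_t prefix_pos;    /* number of prefix bytes already served */
    FILE *doc;            /* the real document, opened by the caller */
} doctype_io_ctx;

/* Prepare the context; the prefix string must outlive the context. */
void doctype_io_init(doctype_io_ctx *ctx, const char *prefix, FILE *doc)
{
    ctx->prefix = prefix;
    ctx->prefix_len = strlen(prefix);
    ctx->prefix_pos = 0;
    ctx->doc = doc;
}

/* Read callback: returns the number of bytes stored in buffer,
 * 0 at end of input, -1 on error. */
int doctype_io_read(void *context, char *buffer, int len)
{
    doctype_io_ctx *ctx = context;
    size_t n;

    if (ctx == NULL || buffer == NULL || len < 0)
        return -1;

    /* Serve the remaining part of the DOCTYPE prefix first. */
    if (ctx->prefix_pos < ctx->prefix_len) {
        n = ctx->prefix_len - ctx->prefix_pos;
        if (n > (size_t)len)
            n = (size_t)len;
        memcpy(buffer, ctx->prefix + ctx->prefix_pos, n);
        ctx->prefix_pos += n;
        return (int)n;
    }

    /* Then stream the document itself, one chunk per call. */
    n = fread(buffer, 1, (size_t)len, ctx->doc);
    if (n == 0 && ferror(ctx->doc))
        return -1;
    return (int)n;
}

/* Close callback: closes the underlying file, 0 on success. */
int doctype_io_close(void *context)
{
    doctype_io_ctx *ctx = context;
    int rc;

    if (ctx == NULL || ctx->doc == NULL)
        return 0;
    rc = fclose(ctx->doc);
    ctx->doc = NULL;
    return rc == 0 ? 0 : -1;
}
```

  With libxml2 one would presumably pass doctype_io_read,
doctype_io_close and the context to xmlReaderForIO together with the
validation option; the document is then never held in memory by the
callback itself, only one chunk at a time.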

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
