On Thu, Mar 13, 2008 at 10:56:38PM +0100, Frantisek ZACEK wrote:
> Hi,
> 
> I need to handle XPath in an application to select some specific nodes
> from an XML tree.
> 
> The problem is, that the files involved might be quite large (>1GB) so
> DOM parsing is not an option.

  Then it was a mistake to allow XPath to be used to query those files.
Basically someone made a design decision without understanding what it
meant. Yes, that's painful, and there is only one way to avoid this kind
of problem: learn before designing and implementing.

> Now, libxml2 handles XPath with DOM, and from what I read in the
> archives of this very mailing list, XPath support requires to know the
> whole tree.
> Why is that? I mean, I did read the W3C recommendation about XPath,
> and I still don't get it. It should be possible to support XPath with
> a SAX parser... I mean, why not? Or at least an XPath support
> that only needs a schema definition...


  Asking the question seems to indicate that you either:
    - are thinking only of the XPath queries *you* are interested in
    - don't yet understand the full expressiveness of XPath
In the first case, you are asking the wrong question; in the second case I
suggest you re-read the XPath spec. BTW XPath 1.0 is unrelated to schemas,
but if you are interested in the topic I suggest you read the PLDI'07 paper
by Layaïda and co-authors:
   http://wam.inrialpes.fr/people/layaida/research/

> Still, it is true that all uses I have found of XPath seemed to use
> DOM, which is inconceivable for very large files.

  Again, I think you are missing much of the expressive power of XPath when
making this assertion. Consider e.g.:
  //foo[last()]/preceding-sibling::bar
Evaluating an XPath expression is in general not possible in a single pass.
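
  As a rough illustration of why: whether a given foo is the last one
under its parent is only known once that parent closes, so every earlier
bar sibling has to be kept around until then. The sketch below emulates
the expression in Python on top of the standard xml.etree.ElementTree
module. It is not libxml2 code, and the function name and sample document
are made up for the example.

```python
import xml.etree.ElementTree as ET

def last_foo_preceding_bars(root):
    """Emulate //foo[last()]/preceding-sibling::bar on an in-memory tree.

    Hypothetical helper for illustration; results are grouped per parent,
    not guaranteed to be in strict document order.
    """
    selected = []
    for parent in root.iter():                # every element may be a parent
        children = list(parent)
        foo_positions = [i for i, child in enumerate(children)
                         if child.tag == "foo"]
        if not foo_positions:
            continue
        last_foo = foo_positions[-1]          # only known after ALL children
        selected.extend(child for child in children[:last_foo]
                        if child.tag == "bar")
    return selected

SAMPLE = "<r><bar id='1'/><foo/><bar id='2'/><foo/><bar id='3'/></r>"
bars = last_foo_preceding_bars(ET.fromstring(SAMPLE))
print([bar.get("id") for bar in bars])
```

In the sample, bar 1 and bar 2 are selected only because of the second
foo, which arrives after them in the stream. When the parent is the root
element, a streaming evaluator may have to hold on to most of the document.
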
For single-pass, reduced lookups with a small subset of XPath, libxml2
provides the pattern module: http://xmlsoft.org/html/libxml-pattern.html
The xmlReader also makes it possible to use XPath on a subset of
the tree: http://xmlsoft.org/xmlreader.html#Mixing
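
  The general idea of mixing a streaming pass with path queries on small
subtrees can be sketched as follows. This is only an analogy written with
Python's standard xml.etree.ElementTree.iterparse (which supports just a
limited path syntax), not the libxml2 xmlReader API. The function name,
the tag names and the assumption that records are direct children of the
root element are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def stream_records(source, record_tag, subtree_path):
    """Yield the text of nodes matching subtree_path inside each record.

    Assumes (for this sketch) that record_tag elements are direct
    children of the root, so only one record is held in memory at a time.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)                   # first event: start of the root
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # The record subtree is complete: query it like a tiny document.
            for node in elem.findall(subtree_path):
                yield node.text
            root.clear()                      # drop records already processed

DATA = (b"<log>"
        b"<entry><level>warn</level><msg>disk full</msg></entry>"
        b"<entry><level>info</level><msg>started</msg></entry>"
        b"</log>")
for message in stream_records(io.BytesIO(DATA), "entry", "msg"):
    print(message)
```

This only works for queries that can be answered from one record at a
time; an expression that looks across records runs into the same problem
as above.
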

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml