RE: XPathAPI performance problems

Walsh, Eric 18 Jan 2002 16:30:27 -0000

Thanks for the info. Reading the explanation of DTM on
apache.org, it seems odd that this was done to speed up
XPath querying. I would expect querying an object model you
already have to be much more efficient than building a new one.

Your CachedXPathAPI would help in the example I sent, but there
are many cases where we run only one query, or query a document
that is constantly changing, which makes CachedXPathAPI
less effective. As much as I hate to say this, we will not
be using any version past 2.0 until this is resolved. It would
crush our performance in several apps (and yes, we use XPath A LOT).

Thanks again for the explanation,
EricW

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 18, 2002 11:10 AM
To: [EMAIL PROTECTED]
Subject: RE: XPathAPI performance problems

> takes about 20ms. It seems like the XPath routine is starting
> from the top of document, no matter if you give it a parent or
> not.

That may in fact be true in the current code.

Your XPath may want to search upward/backward from the starting node
(ancestor or preceding axes). That means those nodes need to be in our data
model.

DOM2DTM's tables are currently constructed linearly, in document order. To
construct a model which includes those ancestors, we need to construct
them before we get to your starting node.

DOM2DTM is incremental, so when you look for a node early in the document tree
there isn't a lot of preconstruction done. When your starting point is late
in the document, we do have to build everything up to that point. This
matches the response-time curve you're seeing.
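
To make the cost pattern concrete, here is a toy model of a table that is
filled lazily in document order. It is only a sketch of the idea described
above, not Xalan's DOM2DTM code; the class IncrementalTableSketch and all of
its methods are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Toy model only: a node table built lazily in document order. */
public class IncrementalTableSketch {
    private final List<Node> table = new ArrayList<>(); // nodes indexed so far
    private final Node root;
    private Node next;                                   // next node to index

    public IncrementalTableSketch(Node root) {
        this.root = root;
        this.next = root;
    }

    /** Returns the table index of target, indexing more nodes only if needed. */
    public int handleFor(Node target) {
        // Linear scan of what is already built: with no direct mapping from
        // a DOM node to its handle, even this lookup is not cheap.
        for (int i = 0; i < table.size(); i++) {
            if (table.get(i).isSameNode(target)) return i;
        }
        // Otherwise keep building in document order until we reach target.
        while (next != null) {
            Node current = next;
            table.add(current);
            next = following(current);
            if (current.isSameNode(target)) return table.size() - 1;
        }
        return -1; // target is not under root
    }

    public int nodesBuilt() {
        return table.size();
    }

    /** Next node in document order within the subtree of root, or null. */
    private Node following(Node n) {
        if (n.getFirstChild() != null) return n.getFirstChild();
        while (n != null && !n.isSameNode(root)) {
            if (n.getNextSibling() != null) return n.getNextSibling();
            n = n.getParentNode();
        }
        return null;
    }

    /** Builds a flat sample document: one root element with `items` children. */
    public static Document sampleDocument(int items) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element rootElement = doc.createElement("root");
            doc.appendChild(rootElement);
            for (int i = 0; i < items; i++) {
                rootElement.appendChild(doc.createElement("item"));
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node root = sampleDocument(1000).getDocumentElement();

        IncrementalTableSketch early = new IncrementalTableSketch(root);
        early.handleFor(root.getFirstChild());
        IncrementalTableSketch late = new IncrementalTableSketch(root);
        late.handleFor(root.getLastChild());

        System.out.println("early start: " + early.nodesBuilt() + " nodes built");
        System.out.println("late start:  " + late.nodesBuilt() + " nodes built");
    }
}
```

In this model a start node near the top costs almost nothing, while a start
node at the end forces the whole table to be built. A second lookup on the
same instance reuses whatever has already been indexed.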

A version of DOM2DTM which allows constructing nodes in other than document
order is theoretically possible. The unsolved problem is that there is no
good portable way to go from a DOM node object reference to a DTM Node
Handle, and without that getting any reasonable performance out of this
beast is challenging.

(DOM Level 3 was considering the introduction of a node identity number,
which would have been perfect for this application. Now they're back to
looking at hooks for UserData, which could be made to work but might be
less efficient. Xerces currently has a solution which they use internally,
but it isn't part of their official API so I've been reluctant to try to
leverage it... and it'd still work only for Xerces DOMs.)
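
For illustration, the UserData route could look roughly like the sketch
below. It assumes the Node.setUserData/getUserData methods as they appear in
the org.w3c.dom interfaces bundled with current JDKs; a DOM implementation
that predates them will not support this. The class name, the key string, and
the use of a plain int as the "handle" are all hypothetical, and none of this
is Xalan code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Sketch: remember a handle on a DOM node via DOM Level 3 user data. */
public class NodeHandleUserDataSketch {
    // Hypothetical key; any string unique to the application would do.
    static final String HANDLE_KEY = "sketch.node-handle";

    /** Stores the handle on the node itself (no UserDataHandler callback). */
    public static void attachHandle(Node node, int handle) {
        node.setUserData(HANDLE_KEY, Integer.valueOf(handle), null);
    }

    /** Returns the stored handle, or -1 if none was attached to this node. */
    public static int handleOf(Node node) {
        Object stored = node.getUserData(HANDLE_KEY);
        return (stored instanceof Integer) ? (Integer) stored : -1;
    }

    /** Creates an empty document with the JDK's default parser. */
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element item = doc.createElement("item");
        doc.appendChild(item);

        System.out.println("before: " + handleOf(item));
        attachHandle(item, 7);
        System.out.println("after:  " + handleOf(item));
    }
}
```

The lookup goes straight from the node object to its handle without scanning
a table, which is the property a node identity number would have provided.
How fast it is in practice depends on how the DOM implementation stores user
data.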

Meanwhile, the best suggestion I can offer is still CachedXPathAPI.
Starting from a node toward the end of the document will still require
going through the whole document first, but at least you will be able to
take advantage of any model construction work already done by earlier
queries.
