Hi Jeff,

Thanks for your detailed response on this. I'm a bit late in replying
because I've just had a few days away from the office ;-)

I think the approach I will take is to restrict the use of wildcards and
common words in our application to reduce the hit count. If the user cannot
run searches such as 'all companies with the character "A" anywhere in
their name', then we should be able to avoid this problem. I don't fancy
rewriting the XPathQueryResolver in any way ;-)
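
Roughly, the application-side check could look like the sketch below. This
is only an illustration under assumptions: the class name, the '*' wildcard
character, the three-character minimum and the common-word list are all
hypothetical and would need tuning against the real registry data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical application-side filter; not part of Xindice.
final class SearchTermPolicy {
    // Assumed threshold: literal characters required once wildcards are removed.
    private static final int MIN_LITERAL_CHARS = 3;

    // Assumed list of words that match too many documents to be useful.
    private static final Set<String> COMMON_WORDS =
            new HashSet<>(Arrays.asList("ltd", "inc", "the", "and", "company"));

    private SearchTermPolicy() {}

    /** Returns true if the term is specific enough to be sent to the database. */
    static boolean isAllowed(String term) {
        if (term == null) {
            return false;
        }
        // Strip the assumed '*' wildcard and normalise case before checking.
        String literal = term.replace("*", "").trim().toLowerCase(Locale.ROOT);
        if (literal.length() < MIN_LITERAL_CHARS) {
            return false; // e.g. "*A*" would match nearly every company name
        }
        return !COMMON_WORDS.contains(literal);
    }
}
```

With these assumed settings, "*A*" and "the" would be rejected before any
XPath query is built, while "Acme*" would be let through.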

--Peter

-----Original Message-----
From: Jeff Greif [mailto:[EMAIL PROTECTED]
Sent: 02 May 2002 20:03
To: [email protected]
Subject: Re: XPath - Limiting the number of results returned


I'm not an Xindice developer, so am more or less guessing about this.

Your first request is probably hard to satisfy without modifying the query
engine (including breaking the standardized xmldb api).  Most of the mods
would be simple -- to pass in a result-cardinality-bound parameter.  The
XPathQueryResolver code would have the substantial mods.  The execute method
would take the upper-bound.  The case where there are no indexes is easy,
since you could just use the upper-bound as a loop test in the collection
scan (if you were willing to get only the first 50 records found, not the
first 50 in key order).  The harder case is with indexes.  Note that I'm not
suggesting that you hack the XPathQueryResolver and all the code that
invokes it -- some kind of design of a parallel, more specialized set of
classes might be better.
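
To make the no-index case concrete, here is a minimal sketch of a scan that
uses the upper bound as a loop test. It is not Xindice code: the class and
method names are hypothetical, and a plain Iterator and Predicate stand in
for the collection scan and the XPath match.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names are Xindice classes.
final class BoundedScan {
    private BoundedScan() {}

    /**
     * Walks the records in whatever order the iterator yields them and stops
     * as soon as maxResults matches have been collected. The result is the
     * first maxResults matches found, not the first maxResults in key order.
     */
    static <T> List<T> firstMatches(Iterator<T> records,
                                    Predicate<T> matches,
                                    int maxResults) {
        if (maxResults < 0) {
            throw new IllegalArgumentException("maxResults must not be negative");
        }
        List<T> hits = new ArrayList<>();
        // The upper bound is part of the loop test, so the scan ends early.
        while (hits.size() < maxResults && records.hasNext()) {
            T record = records.next();
            if (matches.test(record)) {
                hits.add(record);
            }
        }
        return hits;
    }
}
```

Because the bound is checked before each record is fetched, the rest of the
collection is never read once the limit is reached.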

Your second request is for a browsing capability.  RDBMS systems implement
the browsing capability using a thing called a cursor, which is essentially
a hook into the results of a query, which must be fully executed to
guarantee result consistency.  Typically the results are stored in some
temporary location in the DB (possibly even in memory) and the cursor acts
like a bookmark on the list.  For different purposes, there are cursors
which live client side and others that live server side (f'rinstance, a
server-side cursor may be used internally to carry out a complex join in an
RDBMS, or inside a stored-procedure where there is server-side code that
iterates through a result set).  There are cursors of varying complexity
(which can be optimized to a greater or lesser degree).  A forward-only
cursor can be optimized by discarding results already seen.  A browsing
cursor might allow you to skip over pages in the results.  There is some
documentation of this kind of thing in the JDBC documentation.  Probably
cursors could be implemented as XMLObjects.
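
As a rough illustration of the bookmark idea, a paging cursor over an
already executed query might look like the following. Everything here is
hypothetical (class name, methods, in-memory storage); it only shows the
shape of the thing, not how Xindice or any RDBMS does it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a paging cursor over a fully executed result set.
final class PagingCursor<T> {
    private final List<T> results; // private snapshot of the query results
    private final int pageSize;
    private int position = 0;      // the "bookmark": index of the next unread result

    PagingCursor(List<T> results, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Copy the list so later changes to the source do not affect paging.
        this.results = new ArrayList<>(results);
        this.pageSize = pageSize;
    }

    boolean hasNext() {
        return position < results.size();
    }

    /** Returns the next page (possibly short or empty) and advances the bookmark. */
    List<T> nextPage() {
        int end = Math.min(position + pageSize, results.size());
        List<T> page = new ArrayList<>(results.subList(position, end));
        position = end;
        return page;
    }

    /** Moves the bookmark forward by whole pages without returning them. */
    void skipPages(int pages) {
        if (pages < 0) {
            throw new IllegalArgumentException("pages must not be negative");
        }
        long target = (long) position + (long) pages * pageSize;
        position = (int) Math.min(target, results.size());
    }
}
```

A forward-only variant could drop each page from the stored list once it
has been returned, which is the optimization mentioned above.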

Jeff
----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, May 02, 2002 1:23 AM
Subject: XPath - Limiting the number of results returned


> Hi,
>
> I'm currently using Xindice to store metadata for an ebXML registry
> implementation, and have come across an issue when using XPath to query
> the contents of a collection. In some cases (using wildcard searches or
> common keys), I can get back very large sets of matching results (say
> 1000 + fragments). For performance reasons, I would like to limit the
> number of XML fragments returned to say no more than 50 - is there any
> way this support can be added to Xindice (I could make the code change
> myself if needed)? What I'm looking for is a way to interrupt an XPath
> query operation once the specified maximum number of matches (i.e. 50)
> has been reached, and then return that set of matched fragments.
>
> Also, is it possible to return 50 results and then cache all other
> matches in the background so that the user can quickly access other
> results if needed, once they've reviewed the first 50 results? I'm
> looking for the kind of functionality provided by most search engines,
> where the user can specify how many results they want to see, and can
> also move through the result set page by page, without the extra
> overhead of waiting for all 1000 + results to be found (which can take
> ages with large documents)...
>
> Any help would be greatly appreciated.
>
> -- Peter
>
>
