On Wednesday, May 8, 2002, at 01:04  PM, [EMAIL PROTECTED] wrote:

We have a collection of about 800 documents, each about 5 KB in size.
After indexing with the wildcard index, Xindice answers queries
using the Contains function in about 45-60 seconds, which is
unacceptable by Internet standards.  Is there a way to improve on
this performance?  We have tried indexing all elements, all
attributes as well as specific elements only, with no success.  Also,
is there a way to delete an index (other than to delete and then
recreate the collection)?  We know from experience that Oracle
retrieved similar queries using the Contains clause with 100 times
more XML documents stored in CLOBs in less than 15 seconds using
Oracle Intermedia.

Even if Xindice supported full-text indexing, which it doesn't yet, contains() queries would *always* result in a collection scan, because contains() is not a full-text function but a substring function, and a substring match requires all of the text to be scanned. Xindice is *not* a search engine. At some point we'll support searches on semi-structured data, but for now you're better off just using grep if all you're doing is contains searches.
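
To illustrate the difference, here is a minimal Python sketch. It is not Xindice code: the `docs` dictionary and the helper names `contains_scan`, `build_word_index` and `word_lookup` are made up for this example, and the word index is only a toy stand-in for what a full-text index might do. It shows that a substring match has to look at every document, while a word index can only answer whole-word lookups and so cannot serve a contains() query.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a collection: document id -> text.
docs = {
    "doc1": "Indexing XML databases",
    "doc2": "Wildcard index tuning",
    "doc3": "Full text search engines",
}


def contains_scan(docs, needle):
    """Substring match: every document's text has to be examined."""
    return sorted(doc_id for doc_id, text in docs.items() if needle in text)


def build_word_index(docs):
    """Toy full-text index: lowercase word -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def word_lookup(index, word):
    """Whole-word lookup: one dictionary probe, no scan of the documents."""
    return sorted(index.get(word.lower(), set()))


index = build_word_index(docs)

# "dex" occurs inside "Indexing" and "index", so only the scan finds it.
scan_hits = contains_scan(docs, "dex")
index_hits = word_lookup(index, "dex")

# A whole word, by contrast, is found without touching the documents.
word_hits = word_lookup(index, "index")
```

The scan cost grows with the total amount of text in the collection, which is why adding more element or attribute indexes does not change the timing of a contains() query.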


--
Tom Bradford - http://www.tbradford.org
Architect - XQRL (XQuery Engine) - http://www.xqrl.com
Apache Xindice (XML Database) - http://xml.apache.org/xindice
Labrador (Web Services Hub) - http://www.notdotnet.org/labrador


