Jeff,

Thanks for the suggestion, but it made no difference to the timings. To be honest, this is what I would expect based on experience with other databases. Building an index with the data in place should result in the data being indexed. Building an index with no data and then adding the data should result in the index being amended during each insert.
What kind of query performance are you experiencing with Xindice? By way of comparison, a Perl script that opens each of the 1000 XML files and does a regular expression search on <size>709</size> finds the matching document in just under one second. No indexes, nothing flash, but 10 times quicker than Xindice with its index, and 25 times faster than Xindice without its index. :-(

Cheers
Paul

----- Original Message -----
From: "Jeff Greif" <[EMAIL PROTECTED]>
To: <[email protected]>; "Paul Gowers" <[EMAIL PROTECTED]>
Sent: Monday, November 25, 2002 11:05 PM
Subject: Re: Xindice performance

> Try adding the indexer *before* you add the documents to the collection, and
> repeat the test. (You can delete the collection and rebuild it).
> Jeff
>
> ----- Original Message -----
> From: Paul Gowers
> To: [email protected]
> Sent: Monday, November 25, 2002 2:48 PM
> Subject: Xindice performance
>
>
> I have today been playing with Xindice to see what it could do. I have to
> say, even with limited hardware (200MHz Pentium Pro running NT4) I was a
> little disappointed. I used the command line tools to create 1000 XML
> documents based on files on my computer. The documents were very small and
> looked like this:
>
> <?xml version="1.0"?>
> <direntry>
> <name>f://ffastun.ffl</name>
> <size>229376</size>
> </direntry>
>
> Query using command line tools by exact element match (no indexes) took
> around 25 seconds. With an index on size, an exact match search took over 10
> seconds. Using -t long made no discernable difference - with or without was
> same speed.
>
> E:\>xindiceadmin add_indexer -c /db/files -n sizeindex -p size -t long
> CREATED : sizeindex
>
> E:\>xindice xpath -c /db/files -q "/direntry[size=709]"
> <?xml version="1.0"?>
> <direntry xmlns:src="http://xml.apache.org/xindice/Query"
> src:col="/db/files" src:key="file500.xml">
> <name>f://user_data/HELEN/children/action man_files/aman_link.gif</name>
> <size>709</size>
> </direntry>
>
> I would be interested to hear what other users' experiences are. It is
> entirely possible I have screwed something in the config. I have been trying
> it "out of the box". That said, 10 seconds to find by exact match on an
> indexed value which is pretty selective was disappointing. I am used to
> RDBMS performance, which would be sub-second even on my woeful hardware.
>
> Cheers
> Paul
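
For reference, the brute-force comparison mentioned at the top of this message (open every file, run a regular expression against its contents) could be sketched roughly as below. This is Python rather than the Perl that was actually used, and that script is not reproduced here. The function name `find_by_size` and the assumption that the documents sit in one directory as `*.xml` files are illustrative only.

```python
import re
from pathlib import Path


def find_by_size(directory, size):
    """Return the names of XML files containing <size>SIZE</size>.

    Plain linear scan: every file is opened and searched, no index is used.
    """
    # Match the exact element text, e.g. <size>709</size>.
    pattern = re.compile(re.escape(f"<size>{size}</size>"))
    matches = []
    for path in sorted(Path(directory).glob("*.xml")):
        # The documents are tiny, so reading each one whole is fine.
        text = path.read_text(encoding="utf-8", errors="replace")
        if pattern.search(text):
            matches.append(path.name)
    return matches
```

Calling `find_by_size("files", 709)` on a hypothetical `files` directory holding the exported documents would return the names of the matching files. Timings will of course depend on the hardware and file system, so this is only a baseline to compare an indexed query against, not a benchmark.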
