-- Mark J Stang System Architect Cybershop Systems
--- Begin Message ---
Yes, I am replying to my own e-mail. Woke up this morning and realized
that the sample code listed below doesn't have a loop. The overhead of
connecting to the database, creating a collection object, etc. is in
every execution of this code. Depending on the OS/hardware, I have seen
this code, executed one time, take anywhere from 2 to 6 seconds. I
tracked it down to the creation of the Database Manager and the
Collection. That is where the overhead is. The Collection Pool class I
wrote initializes that code one time. Then the code that actually does
the queries reuses these objects. Executing this sample program n times
will have the same overhead as the command-line version.

Put a loop in and do searches n times, either the same one or different
ones, and time the queries. Use the System class to get the current
time, System.currentTimeMillis().

hth,

Mark

"Mark J. Stang" wrote:

> Only register the database once and reuse your collections.
>
> hth,
>
> Mark
>
> Matthew Van Horn wrote:
>
> > On Tuesday, November 12, 2002, at 11:47 AM, Jeff Greif wrote:
> >
> > > I think I see the problem. I believe (if I've read the source code
> > > correctly) that in xindice, your index on <w> is essentially a map from
> > > values of <w> to document keys.
> > > ...snippage...
> > > Clearly, if I'm not confused, xindice is optimized for small documents
> > > with not much repeating structure, and its indexing mechanism is not
> > > optimal for the type of query you're doing.
> >
> > While this explanation may help for Beni's issue, my documents _are_
> > fairly small with a minimum of duplicate elements, yet my queries are
> > taking 3-4 *minutes*. I finally broke down and tried this
> > programmatically instead of from command line. The following program
> > runs the query in about 3:30. However, thinking that the first query
> > might be untypically slow, I tried again and had it run a few slightly
> > different queries in one run, and they all take that long. Does anyone
> > know where I can start to look for possible causes of this, and ways to
> > improve?
> >
> > package foo;
> >
> > import org.dom4j.io.*;
> > import org.xmldb.api.*;
> > import org.xmldb.api.base.*;
> > import org.xmldb.api.modules.*;
> >
> > public class QueryRunner {
> >     private static DOMReader xmlReader = new DOMReader();
> >     private static Database database = null;
> >
> >     public static void main(String[] args) {
> >         String xpath = "/candidate[biographic_data/id='ANON2021']";
> >         org.w3c.dom.Node node;
> >         org.dom4j.Node newnode;
> >         try {
> >             Class xindiceDriver =
> >                 Class.forName("org.apache.xindice.client.xmldb.DatabaseImpl");
> >             database = (Database) xindiceDriver.newInstance();
> >             DatabaseManager.registerDatabase(database);
> >             Collection col =
> >                 DatabaseManager.getCollection("xmldb:xindice:///db/resumes");
> >             XPathQueryService service = (XPathQueryService)
> >                 col.getService("XPathQueryService", "1.0");
> >             System.out.println(String.valueOf(new java.util.Date()));
> >             ResourceSet rs = service.query(xpath);
> >             System.out.println(String.valueOf(new java.util.Date()));
> >             ResourceIterator ri = rs.getIterator();
> >             while (ri.hasMoreResources()) {
> >                 node = ((XMLResource) ri.nextResource()).getContentAsDOM();
> >                 System.out.println(xmlReader.read((org.w3c.dom.Document)
> >                     node).asXML());
> >             }
> >         }
> >         catch (Exception e) { e.printStackTrace(); }
> >     }
> > }
> >
> --
> Mark J Stang
> System Architect
> Cybershop Systems

--
Mark J Stang
System Architect
Cybershop Systems

begin:vcard
n:Stang;Mark
x-mozilla-html:TRUE
adr:;;;;;;
version:2.1
email;internet:[EMAIL PROTECTED]
fn:Mark Stang
end:vcard
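
The suggestion at the top of this message (do the setup once, then run the
searches n times in a loop and time each one) could be sketched roughly as
follows. This is only an illustration, not the Collection Pool class
mentioned above: the QueryTimer class, the timeQueries method, and the
second XPath are made up for the example, and the query itself is replaced
by a stand-in function so the sketch runs without a database. In a real
test the stand-in would be a call to service.query(xpath) on a service that
was obtained once, before the loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class QueryTimer {

    // Runs every XPath `repeats` times through `runQuery` and returns the
    // elapsed wall-clock milliseconds of each individual call, in order.
    static List<Long> timeQueries(Function<String, Integer> runQuery,
                                  List<String> xpaths, int repeats) {
        List<Long> elapsedMillis = new ArrayList<>();
        for (int i = 0; i < repeats; i++) {
            for (String xpath : xpaths) {
                long start = System.currentTimeMillis();
                runQuery.apply(xpath);
                elapsedMillis.add(System.currentTimeMillis() - start);
            }
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // One-time setup belongs here, outside the loop. In the quoted
        // program that is registerDatabase(), getCollection() and
        // getService(); the resulting service would then be reused.

        // Hypothetical stand-in for service.query(xpath): it only returns
        // the length of the XPath string, so no database is needed.
        Function<String, Integer> runQuery = xpath -> xpath.length();

        List<String> xpaths = Arrays.asList(
            "/candidate[biographic_data/id='ANON2021']",
            "/candidate[biographic_data/id='ANON2022']"); // second id is made up

        List<Long> times = timeQueries(runQuery, xpaths, 3);
        for (int i = 0; i < times.size(); i++) {
            System.out.println("query " + i + ": " + times.get(i) + " ms");
        }
    }
}
```

With the setup kept outside timeQueries, the numbers printed reflect only
the queries themselves, which separates the 2 to 6 second startup cost from
the per-query time.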
--- End Message ---
