On 1/3/13 1:50 PM, Claudiu Saftoiu wrote:
Hello all,
Am I doing something wrong with my queries, or is repoze.catalog.query
very slow?
I have a `Catalog` with ~320,000 objects and 17 `CatalogFieldIndex`es.
All the objects are indexed and up to date. This is the query I ran
(field names renamed):
    And(InRange('float_field', 0.01, 0.04),
        InRange('datetime_field', seven_days_ago, today),
        Eq('str1', str1),
        Eq('str2', str2),
        Eq('str3', str3),
        Eq('str4', str4))
It returned 15 results, so it's not a large result set by any means.
The string fields are like labels: each one has fewer than 20 possible
values.
This query took a few minutes to run the first time. Re-running it in
the same session took <1 second each time. After I restarted the
session, the first run took only 30 seconds, and each subsequent run
again took about 1 second.
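
One way to put numbers on the cold-versus-warm difference described
above is to time several consecutive runs in a fresh session. This is
only a minimal sketch using the standard library: `time_runs` and
`run_query` are hypothetical names, and `run_query` stands in for
whatever callable executes the catalog query.

```python
import time


def time_runs(run_query, repeats=3):
    """Call run_query() `repeats` times; return each run's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical callable that executes the query
        durations.append(time.perf_counter() - start)
    return durations
```

In a fresh session, the first entry would reflect the cost of loading
objects from storage, while later entries should mostly show work done
against objects that are already cached.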
What makes it run so slow? Is it that the catalog isn't fully in
memory? If so, is there any way I can guarantee the catalog will be in
memory given that my entire database doesn't fit in memory all at once?
I can't speak from experience, but using
http://pypi.python.org/pypi/zc.zlibstorage could help initial loads:
it compresses database records, so less data has to come off disk and
over the network.
To make sure the catalog stays in memory, you could keep it in a
separate ZODB database with its own cache size.
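
As a rough sketch of the separate-database idea, assuming the catalog
lives in its own FileStorage file: `open_catalog_db`, the path, and the
default of 500000 are all made up for illustration, and a suitable
value would depend on how many objects the catalog's indexes actually
contain.

```python
def open_catalog_db(path, cache_size=500000):
    """Open a dedicated ZODB database for the catalog with its own cache size.

    Both the path and the default cache_size are placeholders.
    """
    # Imported inside the function so the sketch can be loaded even
    # where ZODB is not installed.
    import ZODB
    import ZODB.FileStorage

    storage = ZODB.FileStorage.FileStorage(path)
    # cache_size is the target number of objects each connection keeps
    # in its in-memory cache; it is an object count, not a byte count.
    return ZODB.DB(storage, cache_size=cache_size)
```

The application would then open a connection with
`open_catalog_db("catalog.fs").open()` and keep the catalog under that
connection's root, leaving the main database's cache settings
unchanged.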
David
_______________________________________________
For more information about ZODB, see http://zodb.org/
ZODB-Dev mailing list - ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev