On 1/3/13 1:50 PM, Claudiu Saftoiu wrote:
Hello all,

Am I doing something wrong with my queries, or is repoze.catalog.query very slow?

I have a `Catalog` with ~320,000 objects and 17 `CatalogFieldIndex`es. All the objects are indexed and up to date. This is the query I ran (field names renamed):

And(InRange('float_field', 0.01, 0.04),
    InRange('datetime_field', seven_days_ago, today),
    Eq('str1', str1),
    Eq('str2', str2),
    Eq('str3', str3),
    Eq('str4', str4))

It returned 15 results, so it's not a large result set by any means. The strings are like labels: each string field can take fewer than 20 distinct values.

This query took a few minutes to run the first time. Re-running it in the same session took <1 second each time. When I restarted the session, the first run took only 30 seconds, and again 1 second each subsequent time.

What makes it run so slow? Is it that the catalog isn't fully in memory? If so, is there any way I can guarantee the catalog will be in memory given that my entire database doesn't fit in memory all at once?

I can't speak from experience, but using http://pypi.python.org/pypi/zc.zlibstorage could help initial loads: it compresses database records, so less data has to be read from disk and sent over the network.

To make sure the catalog stays in memory, you could keep it in a separate ZODB with its own, larger cache size.

David
_______________________________________________
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev