On Fri, 2007-02-23 at 21:25 +0100, Lennart Regebro wrote:
> On 2/23/07, Philipp von Weitershausen <[EMAIL PROTECTED]> wrote:
> > It may require a bit of hacking the catalog, of course. Perhaps it's
> > time to start thinking about componentizing the Zope 2 catalog to make
> > such things easier in the future?
> Yup. It would also be interesting to look into making it faster which
> huge datasets, something that is a problem now. I think it's because
> you search each index separately and intersect the results, and only
> then pick out the 20 first results.
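The intersect-then-slice behaviour described above can be sketched roughly like this (a deliberately simplified Python model for illustration, not the actual ZCatalog code; the index structure and names are made up):

```python
# Simplified model of the "search each index, intersect, then slice"
# strategy: the cost is paid for the *full* result sets, not for the
# 20 hits we actually want.

def search(indexes, query, limit=20):
    """Intersect the complete match set of every queried index,
    then sort and take the first `limit` hits."""
    result = None
    for name, value in query.items():
        # Each index returns *all* matching document ids.
        matches = indexes[name].get(value, set())
        result = matches if result is None else result & matches
    # Sorting and slicing happen only after the full intersection.
    return sorted(result or set())[:limit]

# Toy data: two "indexes" mapping a value to a set of document ids.
indexes = {
    "portal_type": {"Document": set(range(0, 1000))},
    "review_state": {"published": set(range(500, 1500))},
}

hits = search(indexes, {"portal_type": "Document",
                        "review_state": "published"})
```

Even though only 20 hits are returned, the intersection touched all 1000 and 1000 matches first, which is why large result sets hurt.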
It is a "making it faster" urge that led me to thinking about caching
results. I'm curious about your use case, the size of your dataset, and
how you think Lucene might help you.
We have an application that has about a million objects catalogued.
With only a few objects in the catalog, a search takes about 1
millisecond. This grows logarithmically to 20 milliseconds for 500
000 objects and about 21 milliseconds for 1 million objects. 20
milliseconds is fast enough for most of our use cases, except for one
use case where we add about 100 objects in a single transaction. These
objects have Archetype references that lead to a massive amount of
catalog searches.
To be fair this is an Archetypes problem and not a catalog one, but it
did prove to be an interesting optimisation exercise that led me to
thinking about caching ZCatalog results. In this particular case,
creating 100 objects led to about 1000 catalog searches taking 20
milliseconds each. That is 20 seconds in total.
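A crude sketch of the kind of caching this suggests (the class and method names here are invented for illustration, not the real ZCatalog API): memoise results per normalised query, and throw the whole cache away on any catalog write, which is coarse but safe:

```python
# Hypothetical result cache in front of an expensive search function.
# In the 100-objects-per-transaction case above, repeated identical
# queries would be answered from the cache instead of costing 20 ms each.

class CachingCatalog:
    def __init__(self, search_fn):
        self._search = search_fn   # the expensive underlying search
        self._cache = {}

    def search(self, **query):
        # Normalise the query into a hashable, order-independent key.
        key = tuple(sorted(query.items()))
        if key not in self._cache:
            self._cache[key] = self._search(**query)
        return self._cache[key]

    def invalidate(self):
        # Called on catalog/uncatalog: drop everything.
        self._cache.clear()

# Count how often the "slow" search actually runs.
calls = []
def slow_search(**query):
    calls.append(query)
    return ["some", "brains"]

catalog = CachingCatalog(slow_search)
catalog.search(portal_type="Document")
catalog.search(portal_type="Document")   # served from the cache
```

Invalidating on every write is the simplest correct policy; anything finer-grained would have to know which queries a given object change can affect.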
So given the above, an application with a million objects using the
ZCatalog can basically do 50 catalog searches in a second, if it wants
to remain responsive to the user. Maybe this is more than enough, I
don't know, but with apps like Plone that rely heavily on the catalog,
optimisation of catalog operations can surely help improve
responsiveness.
Upfront Systems http://www.upfrontsystems.co.za
Zope-Dev maillist - Zope-Dev@zope.org