Have you looked at whether you are overflowing the zeocache and having to fetch your catalog from disk too often? The timing mentioned in this thread seems about right for that to be the case.

I have a client with about 300k small documents, for whom I have separated the catalog from the rest of Data.fs. This means I can have different cache settings for small and large objects, resulting in markedly faster catalog queries.
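
A minimal sketch of what such a split could look like in zope.conf, assuming a ZEO setup. The storage names, server address, mount path, and cache sizes are illustrative assumptions, not the values from the client site:

```
# Main database: larger content objects, modest object cache.
<zodb_db main>
    cache-size 5000            # objects per connection (assumed value)
    <zeoclient>
        server localhost:8100  # assumed ZEO server address
        storage 1
        name zeostorage
        var $INSTANCE/var
        cache-size 100MB       # ZEO client cache (assumed value)
    </zeoclient>
    mount-point /
</zodb_db>

# Catalog database: many small objects, much larger object cache.
<zodb_db catalog>
    cache-size 200000          # assumed value, sized to keep the catalog in memory
    <zeoclient>
        server localhost:8100
        storage catalog        # hypothetical second storage on the ZEO server
        name catalogstorage
        var $INSTANCE/var
        cache-size 400MB       # assumed value
    </zeoclient>
    mount-point /plone/portal_catalog   # hypothetical site path
</zodb_db>
```

The ZEO server would need a matching second storage, and the existing catalog has to be moved into the mounted database before this has any effect.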


--r.

On 27 Oct 2008, at 12:32, Roché Compaan wrote:

On Mon, 2008-10-27 at 13:23 +0100, Jens Vagelpohl wrote:

On Oct 27, 2008, at 13:08 , Roché Compaan wrote:

On Sun, 2008-10-26 at 14:07 -0400, Tres Seaver wrote:
- Plone uses too many indexes, and in particular, uses multiple text
 indexes.  Having extra indexes around "just in case" is a sure loss
 at write time, and may even be expensive at query time (depending on
 the query).

- Particular indexes have performance characteristics based on their
 designed purpose:  for instance, the stock FieldIndex implementation
 assumes that the number of documents indexed will be >> the number of
 discrete indexable values.  Using such an index in an application
 domain with a very large set of indexable values probably loses, and
 in ways which don't show up in early / small-scale testing.

- I'm pretty sure that we haven't yet found the best data structure for
 "hierarchy indexes" (e.g., the Plone EPI index, or the stock Zope2
 PathIndex, etc.).  Something like a 'trie' might be optimal for
 pure prefix searching of hierarchies.

- I am confident that the TopicIndex is underutilized:  it does *all*
 the work for a given query at write time, and can thus be blindingly
 fast at query time.

- Other special-purpose indexes (e.g., a "recent items" index) would
 be worth a look, especially for applications with large volumes of
 content.
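
To make the trie suggestion above concrete, here is a minimal sketch of prefix searching over a hierarchy. It is a toy in-memory structure, not how PathIndex or the Plone EPI index work, and the class and method names (PathTrie, insert, search) are hypothetical:

```python
class PathTrie:
    """Toy prefix index: one trie node per path segment."""

    def __init__(self):
        self.children = {}   # path segment -> child PathTrie
        self.docids = set()  # ids of documents stored exactly at this node

    @staticmethod
    def _segments(path):
        # "/site/news/" -> ["site", "news"]; "/" -> []
        return [segment for segment in path.split("/") if segment]

    def insert(self, path, docid):
        """Record docid under the given path, creating nodes as needed."""
        node = self
        for segment in self._segments(path):
            node = node.children.setdefault(segment, PathTrie())
        node.docids.add(docid)

    def search(self, prefix):
        """Return ids of all documents at or below the path prefix."""
        node = self
        for segment in self._segments(prefix):
            node = node.children.get(segment)
            if node is None:
                return set()
        # Walk the matched subtree iteratively and collect every id.
        found, stack = set(), [node]
        while stack:
            current = stack.pop()
            found |= current.docids
            stack.extend(current.children.values())
        return found


trie = PathTrie()
trie.insert("/site/news/a", 1)
trie.insert("/site/news/b", 2)
trie.insert("/site/docs/c", 3)
news_ids = trie.search("/site/news")
```

Locating the subtree costs one dictionary lookup per path segment, regardless of how many documents are indexed. A persistent version would need BTree-backed nodes rather than plain dicts and sets, which this sketch does not attempt.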

I agree that one should look at improving performance without caching
as well. But this is a lot harder and takes significantly more
development and debugging time than introducing some form of caching.
So I'm not convinced that it needs to happen in a certain order. If
caching gives you lots of performance with little effort now, then why
shouldn't you use it?

It's the typical trade-off. One course is expedient and fast for your
use case now. The other requires more resources, but benefits
everyone, including those who don't want to depend on yet another
package, like memcached, for performance.

I'm not tied to memcached. We started out using module level caches like
zope.cache.ram but that has obvious problems when using ZEO.

When it comes to integrating anything in Zope itself I'd choose the
latter.

Sure, we're not trying to get this into Zope, we're just sharing our
experience and exploring the territory so that one can produce a third
party package that really helps people with the same use case (which I
suspect is quite a common one).

--
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za

_______________________________________________
Zope-Dev maillist  -  Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )

Russ Ferriday - Topia Systems - Open Source content management with Plone and Zope [EMAIL PROTECTED] - office: +44 2076 1777588 - mobile: +44 7789 338868 - skype: ferriday
