Tres Seaver wrote at 2008-8-22 16:45 -0400:
>I recall a pre-Zope (for me, 10 years ago) rule of thumb that text
>indexing imposed an order of magnitude of overhead on the actual corpus,
>with improvements possible only via batching or post-processing /
>compression (incremental indexing is worst-case).
And this is especially true for indexes supporting a term frequency
based ranking and which use "IISet" at places where "IITreeSet" were
With "TextIndexNG3", one can get rid of the overhead of
term frequency based ranking (in case one does not need it).
Using "AdvancedQuery" (and parsing the text subqueries oneself),
one can use a "Managable SimpleTextIndex" which
tries very hard to be as efficient as possible for large data sets
(and does not support term frequency based ranking).
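
To make the "IISet" versus "IITreeSet" point concrete, here is a toy
model in plain Python. It does not use the real BTrees classes: "FlatSet"
and "BucketedSet" are hypothetical stand-ins, and the 4 bytes per id and
the bucket size of 100 are assumptions. The only thing it illustrates is
that a set stored as one record is rewritten completely on every
incremental insert, while a set split into bounded buckets rewrites only
the bucket it touches.

```python
import bisect

INT_SIZE = 4  # assumed bytes per stored document id


class FlatSet:
    """Toy stand-in for a set kept in ONE record (the "IISet"-like case)."""

    def __init__(self):
        self.keys = []
        self.bytes_written = 0

    def insert(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return  # already present, nothing to write
        self.keys.insert(i, key)
        # Any change rewrites the whole record.
        self.bytes_written += len(self.keys) * INT_SIZE


class BucketedSet:
    """Toy stand-in for a set split into bounded buckets ("IITreeSet"-like).

    Interior tree nodes are ignored; only bucket rewrites are counted.
    """

    def __init__(self, max_bucket=100):
        self.max_bucket = max_bucket
        self.buckets = [[]]
        self.bytes_written = 0

    def insert(self, key):
        # Pick the last bucket whose first key is <= key.
        firsts = [b[0] for b in self.buckets[1:]]
        idx = bisect.bisect_right(firsts, key)
        bucket = self.buckets[idx]
        i = bisect.bisect_left(bucket, key)
        if i < len(bucket) and bucket[i] == key:
            return
        bucket.insert(i, key)
        if len(bucket) > self.max_bucket:
            # Split an overfull bucket; the new half is written as well.
            half = len(bucket) // 2
            new_bucket = bucket[half:]
            del bucket[half:]
            self.buckets.insert(idx + 1, new_bucket)
            self.bytes_written += len(new_bucket) * INT_SIZE
        # Only the touched bucket is rewritten.
        self.bytes_written += len(bucket) * INT_SIZE


def simulate(n):
    """Insert ids 0..n-1 one at a time; return bytes written by each model."""
    flat, bucketed = FlatSet(), BucketedSet()
    for doc_id in range(n):
        flat.insert(doc_id)
        bucketed.insert(doc_id)
    return flat.bytes_written, bucketed.bytes_written


if __name__ == "__main__":
    flat_bytes, bucketed_bytes = simulate(10000)
    print("flat record:", flat_bytes, "bytes written")
    print("bucketed:   ", bucketed_bytes, "bytes written")
```

In this model the flat record's write volume grows quadratically with
the number of ids, the bucketed one roughly linearly. Real numbers
depend on the storage, pickling, and transaction batching, so treat it
only as an illustration of why incremental indexing is the worst case.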
ZODB-Dev mailing list - ZODB-Dev@zope.org