On Mar 25, 2007, at 11:08 AM, Lennart Regebro wrote:
...
2. Use an N-best algorithm.  If N is the size of the batch and M is
the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)) which
is a significant improvement if N << M, but still quite expensive.
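
As a rough illustration of the N-best idea (a sketch only, not the catalog's actual implementation), Python's standard heapq.nsmallest scans all M items once while keeping at most N candidates in a heap. The names n_best, corpus, and batch are made up for this example.

```python
import heapq
import random


def n_best(items, n, key=None):
    """Return the n smallest items, in sorted order.

    Only a bounded heap of candidates is kept while scanning,
    so the full corpus is never sorted.
    """
    return heapq.nsmallest(n, items, key=key)


# Hypothetical corpus of M sort values; the batch size N is much smaller.
random.seed(0)
corpus = [random.random() for _ in range(100_000)]
batch = n_best(corpus, 20)
```

The result matches the first 20 entries of a full sort of the corpus, but every one of the M items still has to be visited, which is why this remains expensive for a large corpus.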

I don't think relational databases have any magic bullet to get
around sorting costs.  Sorting is expensive.  In many ways, I think
the sorting support in the catalog gave people a false sense of
security.

I don't know if relational databases typically do this internally (I
don't think so). However, some search engines do it, like Lucene. And
supposedly also Dieter's IncrementalSearch (haven't used it yet).

Our catalog framework also has N-best support.

Jim

--
Jim Fulton          mailto:[EMAIL PROTECTED]    Python Powered!
CTO                 (540) 361-1714              http://www.python.org
Zope Corporation    http://www.zope.com         http://www.zope.org



_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
