----- Original Message -----
From: "Chris Withers" <[EMAIL PROTECTED]>
To: "Matt Hamilton" <[EMAIL PROTECTED]>
Cc: "Casey Duncan" <[EMAIL PROTECTED]>; "Steve Alexander"
<[EMAIL PROTECTED]>; "Wolfram Kerber" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, November 28, 2001 09:27
Subject: Re: [Zope-dev] Catalog improvements
> Matt Hamilton wrote:
> >
> > I would like in on that too :) About a year or so ago I was working on a
> > full-text indexing system for indexing several gigabytes of text (mailing
> > list archives). Most of it was written in C and uses quite a lot of cool
> > algorithms from various information retrieval papers and books. I have
> > been hoping to have the time to take parts of it and work it into the new
> > PluginIndex architecture. The existing code uses BerkeleyDB files to hold
> > the index structures, but I would like to use ZODB instead to give it a
> > bit more modularity.
>
> Hi Matt,
>
> Are any of these algorithms publicly available? I'd be _very_ interested in
> them :-)
>
I think the "MG" software from the book "Managing Gigabytes" is GPLed; the
current release is mg-1.21. Going by its table of contents, the book looks
like a very detailed source on text processing and gives a lot of information
about different index types. What I miss is an explanation of more recent data
structures such as suffix arrays or suffix trees, which have several
advantages over B-Trees for text processing.
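
To illustrate the suffix array idea, here is a minimal Python sketch. It is
not taken from MG or from the book, and the function names are made up for
this example. The construction is the naive one (sort all suffixes), which is
fine for a demo but far too slow for gigabytes of text. What it shows is that
a pattern can be found at any offset in the text, not only at word
boundaries, which a word-keyed B-Tree index cannot do directly.

```python
from bisect import bisect_left


def build_suffix_array(text):
    """Return the start offsets of all suffixes of text, in sorted order."""
    # Naive construction: sort the offsets by the suffix that starts there.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, suffix_array, pattern):
    """Return the sorted start offsets of every occurrence of pattern."""
    # Truncating sorted suffixes to len(pattern) keeps them sorted, so all
    # matches sit next to each other and a binary search finds the first.
    # Building this list is linear in the text size; a real implementation
    # would compare against the text in place during the binary search.
    prefixes = [text[i:i + len(pattern)] for i in suffix_array]
    pos = bisect_left(prefixes, pattern)
    hits = []
    while pos < len(prefixes) and prefixes[pos] == pattern:
        hits.append(suffix_array[pos])
        pos += 1
    return sorted(hits)


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_all(text, sa, "ana"))
```

The same lookup works for any substring, so no tokenizer or word list is
needed up front. The price is one index entry per character of text.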
Andreas
---------------------------------------------------------------------
- Andreas Jung Zope Corporation -
- EMail: [EMAIL PROTECTED] http://www.zope.com -
- "Python Powered" http://www.python.org -
- "Makers of Zope" http://www.zope.org -
- "Life is a fulltime occupation" -
---------------------------------------------------------------------