On Tue, 27 Nov 2001, Andreas Jung wrote:
> Is this code available for public ?
Sort of :) It used to be around, but the server it lives on is currently offline and in need of a new disk controller, so it is not to hand. It is also poorly commented :( and written in very highly optimised (read: illegible) C.

The main bits needed from it are the routines to store and retrieve compressed lists of ascending integers (i.e. the kind used in indexes). I want to write a Python wrapper around them and release a list-like Python data structure that will allow efficient storage of indexes. The other bit is the code for the cosine similarity comparison, which ranks the documents in order of relevance to a query.

Most of the code is taken from the book/code 'Managing Gigabytes' by Witten, Moffat & Bell (http://www.cs.mu.OZ.AU/mg/). The code is quite old now (1999) and designed for quite large systems, or relatively static text (i.e. it doesn't do incremental indexing very well). I worked on developing a 'forward' index which could be easily updated, and then inverted quite quickly on a regular basis (since it didn't need to parse the source text again).

-Matt

--
Matt Hamilton [EMAIL PROTECTED]
Netsight Internet Solutions, Ltd. Business Vision on the Internet
http://www.netsight.co.uk +44 (0)117 9090901
Web Hosting | Web Design | Domain Names | Co-location | DB Integration
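
For readers unfamiliar with the idea of a compressed list of ascending integers, here is a minimal Python sketch of one common approach: store the gaps between successive values, then write each gap as a variable-length sequence of bytes. This is only an illustration under that assumption. It is not the C code discussed in the message, which is not available here and may well use a different coding scheme, and the names encode_ascending and decode_ascending are hypothetical.

```python
def encode_ascending(numbers):
    """Pack a list of ascending non-negative integers into bytes.

    Assumed scheme (not taken from the original C code): gap encoding
    followed by variable-byte codes, 7 data bits per byte, low bits first.
    """
    out = bytearray()
    previous = 0
    for n in numbers:
        gap = n - previous
        if gap < 0:
            raise ValueError("input must be ascending and non-negative")
        previous = n
        # A set high bit means "more bytes follow for this gap".
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_ascending(data):
    """Inverse of encode_ascending: rebuild the original list."""
    numbers = []
    previous = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            # Continuation byte: keep accumulating this gap.
            shift += 7
        else:
            # Last byte of the gap: add it to the running total.
            previous += gap
            numbers.append(previous)
            gap = 0
            shift = 0
    return numbers


# Example: document ids as they might appear in one index entry.
doc_ids = [3, 7, 8, 150, 100000]
packed = encode_ascending(doc_ids)
assert decode_ascending(packed) == doc_ids
```

Small gaps fit in a single byte, so a list of closely spaced ids takes far less room than one fixed-width integer per entry. A list-like wrapper of the kind described in the message could keep the packed bytes internally and decode on iteration.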