Hello, I tried to find a spatial index which integrates seamlessly into the ZODB. Unfortunately I did not find a satisfying solution anywhere. So I came up with three solutions how this could be implemented:
1) Write a native r-tree package, just like the current BTrees. Would likely have to be written in C for performance. 2) Make use of the existing B+ Trees by using a space filling curve such as the Z-curve or Hilbert curve to transform higher-dimensional data into 1D data which can then be stored in a BTree. Since B+ trees also provide range querying capabilities this should give good query performance. Unsure how much speed-up a C implementation of the insert/query functions would give. More info: http://www.scholarpedia.org/article/B-tree_and_UB-tree and http://www.drdobbs.com/184410998 . 3) Use the already existing Rtree package from http://pypi.python.org/pypi/Rtree . It's a thin wrapper of a C library, so it should be very fast. I can see two methods to make this work: 3a) - Create an rtree.RTree (stored in a separate file) and an OOTreeSet. - inserting: insert item into BTree. Then insert item's oid into Rtree. - querying: user supplies bounding box, rtree is queried, oids are returned. look up objects by oid in BTree. - zeo: does not work out-of-the-box with zeo since the Rtrees on different machines are not synchronized. 3b) - Create an rtree.RTree, a OOTreeSet and an IOTree. Difference to 3a): Create RTree with a custom storage manager (example: http://svn.gispython.org/svn/spatialindex/spatialindex/trunk/src/storagemanager/MemoryStorageManager.h and http://trac.gispython.org/spatialindex/browser/spatialindex/trunk/README#storage-manager ). This storage manager stores each page into the IOTree (key: pageId, value: pageData). - inserting: insert item into BTree. Then insert item's oid into Rtree. Causes storage manager to write out changed rtree pages to IOTree. - querying: user supplies bounding box, rtree is queried, pages for rtree returned from IOTree, oids finally returned from query. look up objects by oid in BTree. - zeo: works out-of-the-box with zeo since the rtree pulls its data from a btree (which is hooked up with zeo). Conclusion: 1) Native r-tree package: It is a lot of work which has already been done before. Bug-prone. Ruled out. 2) Spatial index on top of current BTrees: Looks interesting, could be done in python. Disadvantages: unclear UB tree patent situation, unclear how much work this really is. 3a) Does not work with zeo out-of-the-box. Ruled out. 3b) Requires writing a custom storage manager for the rtree package (likely in C). Provides different trees. Basic technology (rtrees + btrees) is tested. Would it make sense to add a default spatial index to ZODB? Does anybody of you have any experience with one of the mentioned solutions? Is anybody else interested in having a zodb spatial index? -Matthias _______________________________________________ For more information about ZODB, see the ZODB Wiki: http://www.zope.org/Wikis/ZODB/ ZODB-Dev mailing list - ZODB-Dev@zope.org https://mail.zope.org/mailman/listinfo/zodb-dev