Hi, at the beginning of this year we at gocept started an open source project (licensed under the ZPL) to make a free fail-over/replication mechanism for ZODB/ZEO available.
We envisioned a solution that applies RAID techniques to ZEO servers; the project is therefore called "ZEORaid". We have a working system that I showed off at EuroPython this year and mentioned in some other places, and we have received interest from people and parties who would like to see it finished.

The current state of ZEORaid is almost alpha quality: two features are missing, a few edge cases are known, and there are still plenty of bugs. We have set up a road map for what we propose to have available in ZEORaid 1.0 (see the attached ROADMAP file).

In the last weeks we were contacted by one sponsor who wanted to help out with this. We estimate that finishing ZEORaid 1.0 according to the ROADMAP requires an effort of 12k EUR. Our plan involves three participating sponsors covering 4k EUR each; one of those has already been found, so we need two more.

Our project plan calls for having all sponsors on board by 30 September. Work would start in mid-October and be done in mid-November.

For details, I'd be happy to answer your questions (in private or on the list).

Christian

PS: If there is a feature you'd like to see in ZEORaid that isn't in 1.0, we'd be happy to do another round of funded open source work later.
====
TODO
====

1.0
===

Stabilization
-------------

- Check edge cases for locking on all methods so that degrading a storage
  works under all circumstances.

- The second pass of the recovery isn't thread safe. Ensure that only one
  recovery can run at a time. (This is probably a good idea anyway because
  of IO load.)

- Make sure that opening a ZEO client doesn't block forever, e.g. by using
  a custom opener that sets 'wait' to True and a timeout of 10 seconds.
  Workaround: use "wait off" or set the timeout in the RAID server config.

- Run some manual tests for weird situations, high load, ...

Feature-completeness
--------------------

- Rebuild a storage using the copy mechanism in ZODB to get all historic
  records completely. (Only rebuild completely, not incrementally.)

- Limit the transaction rate during recovery so that the recovery doesn't
  clog up the live servers.

Cleanup
-------

- Remove print statements and provide logging.

- Provide a manager script that works like zopectl, together with a
  buildout recipe, and can talk to a specific RAID server.

2.0
===

- Support packing?

- Windows support.

- Make read requests go to different backends to optimize caching and
  distribute IO load.

- Allow adding and removing backend servers while running.
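The recovery rate limit mentioned above could be as simple as a sleep-based throttle applied between copied transactions. Here is a minimal sketch, not ZEORaid's actual implementation; the `RateLimiter` class and its parameters are made up for illustration:

```python
import time


class RateLimiter:
    """Cap the number of operations per second with a simple
    sleep-based throttle (a degenerate token bucket)."""

    def __init__(self, max_per_second):
        self.min_interval = 1.0 / max_per_second
        self.last = None

    def wait(self):
        """Block until the next operation is allowed."""
        now = time.monotonic()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                time.sleep(remaining)
        self.last = time.monotonic()


# During recovery, the copy loop would call wait() before each transaction,
# so the live servers never see more than max_per_second extra commits.
limiter = RateLimiter(max_per_second=20)
start = time.monotonic()
for _ in range(5):
    limiter.wait()  # never faster than 20 transactions/second
elapsed = time.monotonic() - start
```

A fixed rate is the simplest choice; a fancier version could adapt the limit to the observed load on the live backends.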
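Distributing read requests across backends (a 2.0 item above) would work like RAID-1 read balancing: every write goes to all backends, while reads rotate among them. A toy sketch in plain Python, assuming hypothetical backend objects with `store`/`load` methods; this is not the ZEORaid API, and a real implementation would also have to handle degraded backends:

```python
import itertools


class ReadBalancingRAID:
    """Mirror writes to all backends; round-robin reads among them,
    so each backend's cache stays warm and IO load is spread out."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next_reader = itertools.cycle(self.backends)

    def store(self, oid, data):
        # RAID-1 style: every backend receives every write.
        for backend in self.backends:
            backend.store(oid, data)

    def load(self, oid):
        # Reads rotate, distributing IO load across the mirrors.
        return next(self._next_reader).load(oid)


class DictBackend:
    """Stand-in for a ZEO backend, just for demonstration."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def store(self, oid, data):
        self.data[oid] = data

    def load(self, oid):
        self.reads += 1
        return self.data[oid]


backends = [DictBackend() for _ in range(3)]
raid = ReadBalancingRAID(backends)
raid.store("obj-1", b"payload")
for _ in range(6):
    raid.load("obj-1")
read_counts = [b.reads for b in backends]  # 6 reads over 3 mirrors
```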
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev