On Fri, 2009-11-13 at 13:33 -0700, Shane Hathaway wrote:
> Stephan Richter wrote:
> > On Friday 13 November 2009, Roché Compaan wrote:
> >> We had such an opportunity about 2 years ago, and although the client
> >> never reached (and probably never will reach) the membership they
> >> dreamed about, they did pay us to develop a storage for members that
> >> could scale to more than 100 million members. We implemented a data
> >> partitioning strategy at the application level. If I had another shot
> >> at it, I would try to develop a distributed ZODB storage, because it
> >> would be a lot simpler than what we had to do at the application
> >> level.
> >
> > Note that Shane developed a sharding solution a year ago with me. It
> > provides container-level partitioning.
> >
> > http://svn.zope.org/z3c.sharding/trunk

Great stuff! This tackles scaling a large data set at the application
level, though. Don't you think a ZODB storage doing this for you would
solve the problem more generally?
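(Aside, for list readers who haven't looked at z3c.sharding: the
container-level idea is roughly the following. This is only a sketch
from my side, not the actual z3c.sharding API, and every name in it is
invented. Keys are hashed to one of N sub-containers so that no single
BTree has to hold every member; in a real deployment each partition
could be mounted from its own database.)

    from zlib import crc32

    from BTrees.OOBTree import OOBTree
    from persistent import Persistent

    class PartitionedContainer(Persistent):
        """Sketch only: route keys to one of several BTree partitions."""

        def __init__(self, partitions=16):
            # One BTree per partition; the count is fixed up front,
            # since rehashing a live container is a migration exercise.
            self._partitions = tuple(OOBTree() for _ in range(partitions))

        def _shard(self, key):
            # Stable hash, so a key always maps to the same partition.
            return self._partitions[
                crc32(key.encode('utf-8')) % len(self._partitions)]

        def __setitem__(self, key, value):
            self._shard(key)[key] = value

        def __getitem__(self, key):
            return self._shard(key)[key]

A storage-level version would do the same routing below the object
layer, which is what I mean by solving it more generally.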
> Thanks for the reminder. :-)
>
> > This, in combination with the encryption work that we did for the
> > ZODB, actually makes the ZODB a lot more advanced than some of the
> > newcomers.
> >
> > I am very intrigued now to set up an EC2 cluster and install a
> > z3c.sharding-based solution demonstrating 100M users with some data.
> > Mmmh...
>
> I've been studying how to build an enormous database based on what I
> know. There are an incredible number of distributed databases these
> days, but all of them concern me in one way or another. I'm wondering
> if ZODB might actually have a fighting chance in the distributed
> database realm. With z3c.sharding or something like it, I think I
> would set things up as follows:
>
> - In-memory ZODB caches would probably be pointlessly painful at that
> scale, so I would set the ZODB cache size for all partitions to 0. A
> cache size of 0 allows ZODB to cache for the duration of a request,
> but flushes all objects out of the cache at transaction boundaries.
>
> - With the cache size set to 0, we can disable cache invalidation,
> which will probably be a major win.
>
> - I would rely heavily on memcached to provide the pickles. I would
> try to use the cache checkpointing algorithm I recently added to
> RelStorage.
>
> - I would aim to read or write only a small number of objects per
> request from partitions.

I think that the master index needs to be partitioned as well. In
benchmarks I performed early last year (http://bit.ly/pSVmd), a BTree
could only handle about 250 inserts/second when it approached 10
million objects, so I'm guessing it will be almost unusable at 100
million.
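For the archives, the benchmark was essentially of the following shape
(a reconstruction, not the original script; the key format and batch
size here are arbitrary). It times batches of inserts into a single
OOBTree as it grows, committing once per batch:

    import time

    import transaction
    from BTrees.OOBTree import OOBTree
    from ZODB import DB
    from ZODB.FileStorage import FileStorage

    # One FileStorage holding a single, ever-growing BTree index.
    db = DB(FileStorage('btree-bench.fs'))
    conn = db.open()
    root = conn.root()
    root['index'] = tree = OOBTree()
    transaction.commit()

    BATCH = 10000
    for batch in range(1000):  # 10M inserts in total
        start = time.time()
        for i in range(batch * BATCH, (batch + 1) * BATCH):
            tree['key-%012d' % i] = i
        transaction.commit()
        size = (batch + 1) * BATCH
        print('%d inserts/sec at %d objects'
              % (BATCH / (time.time() - start), size))

If anything, the sequential keys above flatter the tree; more random
keys should make it degrade sooner.

--
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za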