Laurence Rowe <l <at> lrowe.co.uk> writes:

> So why is PostgreSQL so much faster? It's using a Write-Ahead-Log for
> inserts. Instead of inserting into the (B-Tree based) data files at
> every transaction commit it writes a record to the WAL. This does not
> require traversal of the B-Tree and has O(1) time complexity. The
> penalty for this is that read operations become more complex, they must
> look first in the WAL and overlay those results with the main index. The
> WAL is never allowed to get too large, or its in-memory index would
> become too big.
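A rough sketch of the write-ahead idea described in the quote, in plain Python. This is not PostgreSQL's or ZODB's implementation: the class name `QueuedIndex`, the `max_log` parameter, and the use of a dict as the log and a sorted list as a stand-in for the B-Tree are all assumptions made for illustration.

```python
import bisect


class QueuedIndex:
    """Hypothetical index: a dict acts as the write-ahead log,
    a sorted key list stands in for the B-Tree based main index."""

    def __init__(self, max_log=1000):
        self._keys = []       # sorted keys of the main index
        self._values = {}     # key -> value for the main index
        self._log = {}        # pending writes not yet merged
        self._max_log = max_log

    def insert(self, key, value):
        # Cheap write: no traversal of the sorted structure.
        self._log[key] = value
        # The log is never allowed to grow past max_log entries.
        if len(self._log) >= self._max_log:
            self.flush()

    def get(self, key, default=None):
        # Reads look in the log first, then fall back to the main index.
        if key in self._log:
            return self._log[key]
        return self._values.get(key, default)

    def flush(self):
        # Move all pending writes into the main index in one batch.
        for key, value in self._log.items():
            if key not in self._values:
                bisect.insort(self._keys, key)
            self._values[key] = value
        self._log.clear()
```

Writes only touch the log until `flush()` runs, at the cost that every read has to consult two structures.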
This is sort of what I proposed at the performance BOF at the Plone Conf, specifically for the ZCatalog: we could perhaps look at a catalog data structure in which writes are initially done to some kind of queue and then moved to the BTrees at a later point.

One thing to be very wary of with all of this: as has been shown, the conflict errors are relative to the size of the BTree, due to the probability of a bucket needing to be split. We need to be very sure that real-life use cases are what we think they are. In a large running site some of the BTrees will already be quite large, so conflicts might not be such an issue (yet they are an issue, so something is not right here).

Or we might find, for instance, that one of the catalog indexes, e.g. a FieldIndex, has only a small vocabulary (e.g. the author of a piece of content) but is referenced by every document. In that case you would have an index with very few keys and N values (where N is the number of documents), and an unindex of N keys each with a very small number of values, hence a small BTree, hence a large chance of collisions.

-Matt
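To make the FieldIndex example concrete, here is a small sketch of the two structures involved. It uses plain dicts and sets rather than the real BTrees types, and the function name `build_field_index` and the sample data are made up for illustration; it only shows the shape of the data, not conflict behaviour.

```python
from collections import defaultdict


def build_field_index(docs):
    """Build a forward index (value -> doc ids) and a reverse
    'unindex' (doc id -> value) from a mapping of doc id -> value."""
    forward = defaultdict(set)
    unindex = {}
    for doc_id, value in docs.items():
        forward[value].add(doc_id)
        unindex[doc_id] = value
    return forward, unindex


# 9000 documents written by only 3 authors (hypothetical data).
docs = {doc_id: f"author{doc_id % 3}" for doc_id in range(9000)}
forward, unindex = build_field_index(docs)

print(len(forward))             # number of distinct keys in the forward index
print(len(forward["author0"]))  # doc ids stored under a single key
print(len(unindex))             # one entry per document
```

With a small vocabulary the forward index stays tiny in terms of keys no matter how many documents are catalogued, which is the case that would need checking against real sites.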