On Jan 21, 2008, at 6:09 AM, Dieter Maurer wrote:

"ZODB.fsIndex" tells us in its source code documentation that it splits
the 8 byte oid into a 6 byte prefix and a two byte suffix and
represents the index by an "OOBTree(prefix -> fsBucket(suffix -> position))"

It explains that is uses "fsBucket" (instead of a full tree) because
the "suffix -> position" would contain at most 256 entries.

This explanation surprises me a bit: why should the bucket contain
only 256 rather than 256 * 256 (= 64.000) entries?

You are right. The documentation is incorrect. I just fixed it on the trunk.

If the assumption is wrong (i.e. the "fsBucket" can contain up to
64.000 entries), is the implementation inefficient (because of that)?

I don't think so. This is an in-memory data structure, so there isn't a problem with a big data structure repeatedly being written to the file. Also, because new keys are normally computed sequentially, they will be added at the end of the bucket, so we won't incur the cost of shifting lots of data around.

It's possible that this is more expensive than one might want during packing, when inserts are likely to be a bit more random. It might be interesting to do a memory and cpu comparison using buckets and trees.


Jim Fulton
Zope Corporation

For more information about ZODB, see the ZODB Wiki:

ZODB-Dev mailing list  -  ZODB-Dev@zope.org

Reply via email to