On Thursday 14 April 2005 20:23, Tim Peters wrote:
> The size of the objects in the database has little to do with memory
> consumed by a FileStorage pack; it's more the number of distinct object
> revisions at work, since an in-memory object reachability graph is
> constructed.  I'm not sure how DirectoryStorage could perform packing
> without constructing a similar reachability graph (Toby?).

Both storages *traverse* the object reachability graph, keeping a record of
which oids are reachable. They both keep a traversal to-do list in memory,
whose size is proportional to the height of the reachability graph. They
differ in how they record which oids are reachable.

FileStorage uses an fsIndex instance, which stores everything in memory (in a
memory-efficient manner).

The default implementation in DirectoryStorage uses a bit in the file
permissions to mark reached objects. The I/O cost of this is the main reason
for DirectoryStorage's relative slowness in packing.

There is an alternative implementation in DirectoryStorage which creates a
second, temporary ZODB holding an OIBTree that stores the list of reachable
objects. This also has a fixed memory cost, and it performs better than the
standard permissions-bit implementation. One big disadvantage last time I
looked was memory leaks when creating and destroying ZODB.DB objects - but I
think Tim and Jeremy have since addressed that.

> The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
> slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
> regretted the then-last round of packing changes, which favored reducing RAM
> usage at the cost of increasing runtime.  That appears to have been a wrong
> tradeoff for most modern boxes.

Interesting. DirectoryStorage can use an all-in-memory implementation too.
Does anyone with a big storage fancy trying it?

> Toby, I know (or think I know <wink>) that DirectoryStorage won't commit a
> transaction containing dangling references.  I think that's great, and I'd
> like (if possible) to introduce such a check at a higher level, so that all
> storages would benefit.

There are races in this dangling reference detection. I guess that's OK, since
it is only there to warn about a bug in a higher layer.

> Does DirectoryStorage do something beyond that
> check specifically aimed at preventing POSKeyErrors?

There are numerous corner cases that can lead to objects incorrectly appearing
to be unreachable during packing. I describe one here:
http://mail.zope.org/pipermail/zodb-dev/2002-May/002601.html

DirectoryStorage takes two precautions to reduce the chances of being bitten
by this class of problem:

a. Ensuring that the pack threshold time leaves a sufficient margin of safety.
   storage.pack(one day ago) is fine. storage.pack(zero days ago) is silently
   converted to storage.pack(10 minutes ago).

b. Both storages keep all objects that are reachable from a sufficiently
   recent version of the root object. DirectoryStorage will also keep objects
   that have been modified in any sufficiently recent transaction, even if
   they do not appear to be reachable. (This set is almost always empty,
   unless we have hit a corner case. Objects almost always have to be
   reachable in order to get modified.)

-- 
Toby Dickenson
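
P.S. For anyone following along, here is a rough Python sketch of the ideas
above: the mark traversal with its in-memory to-do list, plus precautions (a)
and (b). It is an illustration only, not code from either storage. The names
reachable_oids, clamp_pack_time, oids_to_keep and MIN_PACK_MARGIN are
hypothetical, the references dict stands in for reading each object's
references out of its stored record, and a plain Python set stands in for the
fsIndex, permission bit, or OIBTree that records reached oids.

```python
from datetime import datetime, timedelta

# Hypothetical constant mirroring the 10 minute floor described in (a).
MIN_PACK_MARGIN = timedelta(minutes=10)


def reachable_oids(root_oid, references):
    """Return the set of oids reachable from root_oid.

    `references` maps an oid to the oids it refers to. This is an
    assumption made for the sketch; a real storage would extract the
    references from each object's stored record.
    """
    reached = set()    # stand-in for fsIndex / permission bit / OIBTree
    todo = [root_oid]  # traversal to-do list, kept in memory
    while todo:
        oid = todo.pop()
        if oid in reached:
            continue
        reached.add(oid)
        todo.extend(r for r in references.get(oid, ()) if r not in reached)
    return reached


def clamp_pack_time(requested, now):
    """Precaution (a): never pack more recently than now minus the margin."""
    return min(requested, now - MIN_PACK_MARGIN)


def oids_to_keep(root_oid, references, recently_modified):
    """Precaution (b): also keep oids modified in recent transactions,
    even if the traversal did not reach them."""
    return reachable_oids(root_oid, references) | set(recently_modified)
```

In this sketch an object that looks unreachable but was written recently
survives the pack, which is the point of (b): the extra set costs almost
nothing in the normal case because it is almost always empty.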