[Chris Withers]
>> Out of interest, why are you using DirectoryStorage?

[Dario Lopez-Kästen]
> I chose it for several reasons:

I don't want to talk you out of it, but since this is a general list I feel
compelled <wink> to respond to these points wrt current FileStorage. You're
using a by-now very old Zope (2.6.2), and may not be aware of the info at:

    http://zope.org/Wikis/ZODB/FileStorageBackup

> 1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip,
> tar-balls, etc) in this particular application (it's a student portal,
> course admin portal and an LMS). While we are not yet in the
> multigigabyte realm, we are storing archive copies of all the previous
> year's materials, which will eventually grow to be a lot of stuff.

If I understand correctly, DirectoryStorage and FileStorage both store this
stuff in giant pickles -- and then there's no cause for a "large" total size
difference that I'm aware of. The storage comparison matrix at

    http://cvs.zope.org/ZODB3/Doc/storages.html?rev=1

says DirectoryStorage requires "Roughly 30% more [disk] space than Data.fs",
not less disk space. Indeed, it's hard to imagine any non-compressing scheme
that could require less total disk space than FileStorage.

> 2) There is the issue of huge Data.fs files and making daily backups. We
> need to have incremental backups

See the link above: repozo.py supports incremental Data.fs backup, taking
(using -Q) time roughly proportional to the increase in Data.fs size since
the most recent backup. It goes fast!

> 3) HA - while DirStor is not a HA-tool per se, it provides the necessary
> tools for building something that provides some aspects of HA, ie. the
> replication features, etc.

Unsure what "HA" means to you. "High availability", perhaps? ZRS is
available for FileStorage, but it's admittedly not free:

    http://www.zope.com/Products/ZRS.html

> 4) Maintenance. While I have not yet dared to pack the DB, the mere size
> of the database will make packing a non-trivial operation memorywise in
> FileStorage.
> DirStor does not have the same memory requirements when
> packing.

The size of the objects in the database has little to do with the memory
consumed by a FileStorage pack; what matters more is the number of distinct
object revisions, since an in-memory object reachability graph is
constructed. I'm not sure how DirectoryStorage could perform packing without
constructing a similar reachability graph (Toby?).

The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
regretted the then-last round of packing changes, which favored reducing RAM
usage at the cost of increasing runtime. That appears to have been the wrong
tradeoff for most modern boxes. Then again, data storages are growing ever
bigger too. It's very nice that DirectoryStorage's direct RAM consumption is
independent of the number of objects.

> 5) POSKeyErrors. We were getting quite a few of those, and that scared
> me. With DirStor, I do not see them as much as before.

Do you see _any_? FWIW, several nasty causes (bugs in ZODB and Zope) for
POSKeyErrors have been fixed since Zope 2.6.2, and reports of POSKeyErrors
from current Zope/ZODB installations are conspicuous by their absence.

Toby, I know (or think I know <wink>) that DirectoryStorage won't commit a
transaction containing dangling references. I think that's great, and I'd
like (if possible) to introduce such a check at a higher level, so that all
storages would benefit. Does DirectoryStorage do something beyond that check
specifically aimed at preventing POSKeyErrors?

...

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
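
To make the two technical points above concrete (pack memory tracks the
number of objects rather than their size, and a commit-time check can reject
dangling references), here is a minimal Python sketch. It is NOT the
FileStorage or DirectoryStorage code: the names reachable_oids, garbage_oids,
dangling_refs and ROOT_OID are hypothetical, oids are plain integers, and the
"database" is just a dict mapping each oid to the oids it references.

```python
from collections import deque

ROOT_OID = 0  # hypothetical root object id, an assumption of this sketch


def reachable_oids(refs, root=ROOT_OID):
    """Return the set of oids reachable from root.

    refs maps each oid to the list of oids it references. Only oids are
    kept in memory, never object payloads, so memory use grows with the
    number of objects and not with their size.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        oid = queue.popleft()
        for target in refs.get(oid, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def garbage_oids(refs, root=ROOT_OID):
    """Return the oids a pack could discard: those unreachable from root."""
    return set(refs) - reachable_oids(refs, root)


def dangling_refs(stored_oids, txn_refs):
    """Return sorted (source, target) pairs a commit should refuse.

    A reference dangles if its target is neither already stored nor
    written by the pending transaction itself (the keys of txn_refs).
    """
    known = set(stored_oids) | set(txn_refs)
    return sorted(
        (src, dst)
        for src, dsts in txn_refs.items()
        for dst in dsts
        if dst not in known
    )


if __name__ == "__main__":
    refs = {0: [1, 2], 1: [3], 2: [], 3: [], 4: [5], 5: []}
    print(sorted(garbage_oids(refs)))             # objects 4 and 5 are unreachable
    print(dangling_refs({0, 1}, {2: [1, 7], 3: [2]}))  # oid 7 exists nowhere
```

A real storage also has to deal with multiple revisions per object and with
the pack time, which this sketch ignores; it only illustrates why the graph,
not the pickles, is what has to fit in RAM.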