We have a big(ish) ZODB, which is about 29GB in size.
Thanks to the laughable difficulty of getting larger disks in big
corporates, we've been looking into what's taking up that 29GB and were
a bit surprised by the results.
Using space.py from the ZODBTools in Zope 2.9.4, it turns out that we
have a lot of PersistentMappings:
990,359 13,555,382,871 Persistence.mapping.PersistentMapping
So, that's almost half of the 29GB!
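As a quick sanity check on "almost half", here is a small sketch of the arithmetic. It assumes the first column of the space.py line is a count and the second a byte total, and it tries both the decimal and the binary reading of "GB", since the 29GB figure above doesn't say which is meant. The variable names are just for illustration.

```python
# Figures copied from the space.py line above.
record_count = 990359
total_bytes = 13555382871
db_size_gb = 29  # overall size quoted above; unit interpretation is an assumption

# Share of the database taken by PersistentMapping data,
# reading "GB" as 10**9 bytes and as 2**30 bytes respectively.
share_decimal = total_bytes / (db_size_gb * 10**9)
share_binary = total_bytes / (db_size_gb * 2**30)

# Average size per counted PersistentMapping entry.
average_bytes = total_bytes / record_count

print("share (10**9 bytes per GB): %.1f%%" % (share_decimal * 100))
print("share (2**30 bytes per GB): %.1f%%" % (share_binary * 100))
print("average bytes per entry:    %.0f" % average_bytes)
```

Either way the mappings come out at a bit under half of the total, at somewhere over 13KB per counted entry.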
AT's default storage is a PersistentMapping called _md so this isn't too
surprising. However, when looking into it, it turns out that half of the
PersistentMappings actually appear to be workflow_history mappings from
To try and find out which objects were referencing all these workflow
histories, we tried the following, starting with the oid of one of these:
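from ZODB.FileStorage import FileStorage
from ZODB.serialize import referencesf
# path is the location of the storage file; oid is the 8-byte oid of
# one of the workflow histories
fs = FileStorage(path, read_only=True)
# load the current revision of the object's pickle
data, serialno = fs.load(oid, '')
# collect the oids found in that pickle
refs = referencesf(data)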
To our surprise, all of the workflow histories returned an empty list.
What does this mean? Is there a bug that means these objects are hanging
around even though there are no references? Are we using the wrong
method to find references to these objects?
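One thing worth checking on the last question: as far as we understand it, referencesf lists the oids that a pickle points to, not the oids of objects that point at it, so an empty list may only mean that a workflow history holds no persistent sub-objects. Finding referrers would then mean inverting those outgoing references across the whole storage. Below is a minimal sketch of that idea, not a tested tool: build_referrers and storage_records are hypothetical helper names, and storage_records assumes that fs.iterator() yields transactions whose records carry oid and data attributes.

```python
from collections import defaultdict


def build_referrers(records):
    """Invert outgoing references.

    `records` is any iterable of (oid, referenced_oids) pairs; the result
    maps each referenced oid to the set of oids that point at it.
    """
    referrers = defaultdict(set)
    for oid, targets in records:
        for target in targets:
            referrers[target].add(oid)
    return referrers


def storage_records(path):
    """Yield (oid, outgoing oids) for every data record in a FileStorage.

    Hypothetical helper: ZODB is imported here so that build_referrers
    can be used (and tested) without it. Old revisions are included too,
    so on an unpacked storage some referrers may be stale.
    """
    from ZODB.FileStorage import FileStorage
    from ZODB.serialize import referencesf

    fs = FileStorage(path, read_only=True)
    try:
        for txn in fs.iterator():
            for record in txn:
                if record.data:  # skip records that carry no pickle
                    yield record.oid, referencesf(record.data)
    finally:
        fs.close()


# Toy data standing in for a real storage: 'a' and 'd' both refer to 'b'.
toy = [(b'a', [b'b', b'c']), (b'd', [b'b']), (b'b', [])]
toy_referrers = build_referrers(toy)
```

With a real storage this would be used as build_referrers(storage_records(path))[oid], with path and oid as in the snippet above. Holding the whole map in memory may be heavy for a database of this size, so filtering for the oids of interest inside the loop might be a better fit.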
(if it helps, we pack to 1 day and each pack removes between 0.5GB and
1GB from the overall size)
If there's any more info that would be helpful here, please ask away...
Simplistix - Content Management, Zope & Python Consulting
ZODB-Dev mailing list - ZODB-Dev@zope.org