[Dieter Maurer]
The newest pickle formats can also handle the class references a bit more efficiently -- at least when a single transaction modifies many objects of the same class.
[Chris Withers]
I know ZC was involved in the work to introduce these new pickle formats, but are they actually used in ZODB yet?
[Dieter]
I think the optimization I referred to (if you have several instances of the same class in a pickle, then they all share a single instance of the class name and reference this one) is already used.
I'm not sure what you have in mind there. It's always been true that pickle was _able_ to reuse common bits, but this is effectively disabled in ZODB on a cross-persistent-object basis by (from a recent serialize.py):

    def _dump(self, classmeta, state):
        # To reuse the existing cStringIO object, we must reset
        # the file position to 0 and truncate the file after the
        # new pickle is written.
        self._file.seek(0)
        self._p.clear_memo()
        self._p.dump(classmeta)
        self._p.dump(state)
        self._file.truncate()
        return self._file.getvalue()

The "self._p.clear_memo()" there makes the pickler forget everything it's done, so that the pickle for a persistent object is self-contained. For example, if you store an OOBTree whose internal state contains 100 OOBuckets, the string "BTrees._OOBucket" appears 100 times in the data record, and the string "OOBTree" even more often. Jeremy once analyzed a customer Data.fs and incidentally discovered that about half the space was consumed by repetitions of such BTree-related strings; no idea whether that's typical, although I wouldn't be surprised if it were.

An entirely new gimmick was introduced in pickle protocol 2, the "extension registry" described in PEP 307:

    http://www.python.org/dev/peps/pep-0307/

That _allows_ an application to register "popular" module and class string names that pickles can reference later via teensy 2- or 3-byte (independent of string length) opcodes. In effect, such strings are stored in the _implementation_ of pickle instead of inside pickles. AFAIK, nobody anywhere has used this yet, outside of Python's test suite. It was intended to be a simple, cheap approach to cutting pickle bloat for apps motivated enough to set up the registry. You'll note that half the one-byte codes are reserved for Zope :-)
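The effect of clearing the memo can be seen in a small experiment. This is only a sketch, not ZODB code: it assumes Python 3's pickle module, uses fractions.Fraction as a stand-in for a persistent class such as OOBucket, and the helper name dump_all is hypothetical.

```python
import io
import pickle
from fractions import Fraction


def dump_all(objs, clear_between):
    """Pickle every object into one buffer with a single Pickler.

    When clear_between is true, the memo is cleared before each dump,
    which mimics the clear_memo() call in the serialize.py excerpt.
    """
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    for obj in objs:
        if clear_between:
            pickler.clear_memo()
        pickler.dump(obj)
    return buf.getvalue()


# 100 instances of the same class, standing in for 100 buckets.
objs = [Fraction(i, 7) for i in range(100)]

shared = dump_all(objs, clear_between=False)
isolated = dump_all(objs, clear_between=True)

# With a shared memo the class name is written once and referenced
# afterwards; with a cleared memo every pickle spells it out again.
print(shared.count(b"Fraction"), isolated.count(b"Fraction"))  # 1 100
print(len(shared), len(isolated))
```

The shared-memo output is smaller, but only the first pickle in it can be loaded on its own; the later ones refer back to memo entries created earlier. That is the trade-off behind keeping each record self-contained.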
You mean an optimization to make the pickle size for some new style classes smaller. That's not yet used because it could make the storage exchange between different Python versions impossible (the older Python versions would not understand the new pickle protocol).
The need for protocol 2 is also why the extension registry (above) can't be used so long as older Pythons are in the mix.
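A rough illustration of that dependency, assuming Python 3's copyreg and pickle modules: fractions.Fraction again stands in for an application class, and the code 240 is an arbitrary pick from the 240-255 range that PEP 307 sets aside for private use. None of this is taken from ZODB.

```python
import copyreg
import pickle
from fractions import Fraction

CODE = 240  # arbitrary choice from PEP 307's private-use range (240-255)

obj = Fraction(1, 3)

# Without a registry entry, the module and class names are in the pickle.
plain = pickle.dumps(obj, protocol=2)

copyreg.add_extension("fractions", "Fraction", CODE)
try:
    # Protocol 1 has no extension opcodes, so the registry is ignored.
    proto1 = pickle.dumps(obj, protocol=1)
    # Protocol 2 replaces the names with a short extension opcode.
    proto2 = pickle.dumps(obj, protocol=2)
    # Loading only works while the same registration is in place.
    restored = pickle.loads(proto2)
finally:
    # Undo the registration so the process-wide registry is left clean.
    copyreg.remove_extension("fractions", "Fraction", CODE)

print(b"Fraction" in plain)   # True
print(b"Fraction" in proto1)  # True
print(b"Fraction" in proto2)  # False
print(restored == obj)        # True
```

The registry is process-wide state, so every reader of such pickles would need the same code-to-name mapping registered, in addition to understanding protocol 2.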