On Wed, Apr 28, 2010 at 5:11 PM, Jim Fulton <j...@zope.com> wrote:
> Do you know of specific benefits you expect from protocol 2? Any
> specific reasons you think it would be better in practice?

I have just seen some ongoing work on pickles in recent times, for
example from the Python 2.7 what's new:

- The pickle and cPickle modules now automatically intern the strings
  used for attribute names, reducing memory usage of the objects
  resulting from unpickling. (Contributed by Jake McGuire; issue 5084.)

- The cPickle module now special-cases dictionaries, nearly halving the
  time required to pickle them. (Contributed by Collin Winter;
  issue 5670.)

Unless I've misread the code, these changes only apply to protocol 2.
And then there are the old claims of PEP 307 stating that pickling
new-style classes would be more efficient.

Finally, Python 3 introduces pickle protocol version 3, which deals
explicitly with the new bytes type. There are more changes in Python 3
and the pickle format, so that's a separate project. But it suggested to
me that the pickle format isn't quite as "dead" anymore as it used to be.

> I've avoided going to protocol 2 for two reasons:
>
> - It wasn't clear we'd get a benefit without deeper changes.
>   Those deeper changes might be of value, but only if we're
>   careful about how we make them.
>
>   In particular, we could replace class names in pickles
>   if we had a registry mapping ints to class names.
>   This could provide a number of benefits beyond
>   smaller pickles, but it needs some thought to get right.

Right. I'm not particularly interested in the pickle class registry.
Having a hard dependency between the code filling the registry and the
actual data has all sorts of implications. I don't really want to go
there myself.

> - I want zope.xmlpickle to work with ZODB database records and
>   it doesn't support protocol 2 yet. This doesn't have to block
>   moving to protocol 2, but I really would like to have this work
>   if possible.

Ok. I know there are some tools that read the ZODB data on their own,
without actually using the APIs. I wouldn't want to break them if
there's no clear benefit.

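To make the PEP 307 claim about new-style classes a bit more concrete,
here is a minimal sketch (not from this thread) that compares pickle
sizes under protocols 1 and 2. It assumes the Python 3 standard-library
pickle module rather than cPickle, and it uses argparse.Namespace purely
as a stand-in for an importable plain class. The names pickle_sizes and
sample are hypothetical. ZODB handles persistent objects differently, so
this only hints at what a real database conversion might show.

```python
import pickle
from argparse import Namespace  # stand-in for a plain, importable class


def pickle_sizes(obj, protocols=(1, 2)):
    """Return a {protocol: pickled size in bytes} mapping for obj."""
    return {p: len(pickle.dumps(obj, protocol=p)) for p in protocols}


# Hypothetical sample data: many small instances of the same class.
sample = [Namespace(x=i, y=i * 2) for i in range(1000)]

sizes = pickle_sizes(sample)
for proto in sorted(sizes):
    print("protocol %d: %d bytes" % (proto, sizes[proto]))
```

The numbers depend heavily on the shape of the data, so a real ZODB
database would need to be measured separately.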
> I'm skeptical that there would be enough benefit for protocol 2 without
> implementing a registry to take advantage of integer pickle codes.
>
> The other benefit of protocol 2 has to do with the way instance pickles
> are constructed and, for persistent objects, ZODB takes a very different
> approach anyway.
>
> I suggest doing some realistic experiments to look at the impact of the
> change.
>
> - Convert an interesting Zope 2 database from protocol 1 to protocol 2.
>   How does this affect database size?
>
> - Do some sort of write and read benchmarks using the 2 protocols to
>   see if there's a meaningful benefit.

Ok, thanks. That gives me enough direction to work on some specific
benchmarks.

Hanno
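
The write and read benchmark suggested above could start from something
like the following. This is a rough sketch under stated assumptions, not
the benchmark that was actually run: it uses the Python 3 standard-library
pickle and timeit modules, and make_records and bench are hypothetical
helpers working on plain dict records rather than real ZODB data.

```python
import pickle
import timeit


def make_records(count=2000):
    """Build hypothetical dict-heavy records using builtin types only."""
    return [
        {"id": i, "title": "doc-%d" % i, "tags": ("a", "b"), "public": i % 2 == 0}
        for i in range(count)
    ]


def bench(obj, protocol, number=20):
    """Time `number` dumps and loads of obj under the given protocol."""
    data = pickle.dumps(obj, protocol=protocol)
    write_s = timeit.timeit(
        lambda: pickle.dumps(obj, protocol=protocol), number=number
    )
    read_s = timeit.timeit(lambda: pickle.loads(data), number=number)
    return {"size": len(data), "write_s": write_s, "read_s": read_s}


records = make_records()
results = {proto: bench(records, proto) for proto in (1, 2)}
for proto in sorted(results):
    r = results[proto]
    print(
        "protocol %d: %d bytes, write %.4fs, read %.4fs"
        % (proto, r["size"], r["write_s"], r["read_s"])
    )
```

Timings from a synthetic list of dicts say little about a real database,
so the same harness would need to be pointed at records taken from an
actual Zope 2 storage before drawing conclusions.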