On Wed, Feb 08, 2012 at 01:24:55PM +0100, Kaweh Kazemi wrote:
> Recap: last week I examined problems I had packing our 4GB users
> storage. With Martijn's help I was able to fix zeo's exception output
> and write out the first broken pickle that throws an exception.
...
> You can download the broken pickle from here:
> http://www.reversepanda.com/download/brokenpickle
>
> If someone has more experience in parsing and understanding pickles in
> regards to ZODB3, any help would be appreciated.

I don't have much experience here, but I love a puzzle

    >>> import pickletools
    >>> f = open('brokenpickle', 'rb')

A ZODB record consists of two pickles: the first stores the class of the
object, the other stores the state of the object

    >>> pickletools.dis(f)
        0: c    GLOBAL     'rp.odb.containers EntityMapping'
       33: q    BINPUT     1
       35: .    STOP
    highest protocol among opcodes = 1

    >>> pickletools.dis(f)
       36: }    EMPTY_DICT
       37: q    BINPUT     2
       39: U    SHORT_BINSTRING 'data'
       45: q    BINPUT     3
       47: }    EMPTY_DICT
       48: q    BINPUT     4
       50: (    MARK
       51: ]        EMPTY_LIST
       52: q        BINPUT     5
       54: (        MARK
       55: U            SHORT_BINSTRING 'm'
       58: (            MARK
       59: U                SHORT_BINSTRING 'game'
       65: q                BINPUT     6
       67: U                SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\tT'
       77: q                BINPUT     7
       79: c                GLOBAL     'game.objects.item Tool'
      103: q                BINPUT     8
      105: t                TUPLE      (MARK at 58)
      106: q            BINPUT     9
      108: e            APPENDS    (MARK at 54)
      109: Q        BINPERSID
      110: K        BININT1    1
      112: ]        EMPTY_LIST
      113: q        BINPUT     10
      115: (        MARK
      116: U            SHORT_BINSTRING 'm'
      119: (            MARK
      120: h                BINGET     6
      122: U                SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\x12\x03'
      132: q                BINPUT     11
      134: c                GLOBAL     'game.objects.item EnergyPack'
      164: q                BINPUT     12
      166: t                TUPLE      (MARK at 119)
      167: q            BINPUT     13
      169: e            APPENDS    (MARK at 115)
      170: Q        BINPERSID
      171: K        BININT1    1
      173: u        SETITEMS   (MARK at 50)
      174: s    SETITEM
      175: .    STOP
    highest protocol among opcodes = 1

No secret calls to instantiate 'os.system' with 'rm -rf' as an argument,
so I feel safe to try and unpickle it ;-)

    >>> import sys, pickle, pprint
    >>> sys.modules['rp.odb.containers'] = sys.modules['__main__'] # hack
    >>> sys.modules['rp.odb'] = sys.modules['__main__'] # hack
    >>> sys.modules['rp'] = sys.modules['__main__'] # hack
    >>> class EntityMapping(object): pass
    ...
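
An aside on the sys.modules hack: a possible alternative (an assumption, not what this session does) is to subclass pickle.Unpickler and override find_class, so that every GLOBAL opcode resolves to a throwaway placeholder class and sys.modules stays untouched. The sketch below targets Python 3 and runs on a synthetic record. The first 36 bytes are rebuilt from the disassembly above; the empty state pickle and the names StubUnpickler, make_record and load_record are made up for illustration.

```python
import io
import pickle


class StubUnpickler(pickle.Unpickler):
    """Resolve every pickled global to a placeholder class (hypothetical helper)."""

    def find_class(self, module, name):
        # Nothing is imported: an empty class carrying the recorded
        # module and class name is created on the fly instead.
        return type(name, (object,), {'__module__': module})


def make_record():
    """Build a synthetic two-pickle record: class pickle, then state pickle."""
    # GLOBAL 'rp.odb.containers EntityMapping', BINPUT 1, STOP (36 bytes),
    # matching the first disassembly.
    class_pickle = b"crp.odb.containers\nEntityMapping\nq\x01."
    # Made-up state; the real record stores persistent references here.
    state_pickle = pickle.dumps({'data': {}}, protocol=1)
    return class_pickle + state_pickle


def load_record(data):
    """Return (class, offset of the state pickle, state) for one record."""
    f = io.BytesIO(data)
    cls = StubUnpickler(f).load()
    state_offset = f.tell()  # where the second pickle begins
    state = StubUnpickler(f).load()
    return cls, state_offset, state


if __name__ == '__main__':
    cls, offset, state = load_record(make_record())
    print(cls.__module__, cls.__name__, offset, state)
```

The placeholder classes only exist so that loading succeeds; they carry none of the behaviour of the real application classes, which is all that is needed for inspecting a record.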
    >>> f.seek(0)
    >>> pickle.load(f)
    <class '__main__.EntityMapping'>

(this is a good place to do a f.tell() and remember the position -- 36 in
this case -- so you can f.seek(36) as you iterate trying to make the second
pickle load)

    >>> sys.modules['game.objects.item'] = sys.modules['__main__'] # hack
    >>> sys.modules['game.objects'] = sys.modules['__main__'] # hack
    >>> sys.modules['game'] = sys.modules['__main__'] # hack
    >>> class Tool(object): pass
    ...
    >>> class EnergyPack(object): pass
    ...
    >>> unp = pickle.Unpickler(f)
    >>> unp.persistent_load = lambda oid: '<persistent reference %r>' % oid
    >>> pprint.pprint(unp.load())
    {'data': {"<persistent reference ['m', ('game', '\\x00\\x00\\x00\\x00\\x00\\x00\\tT', <class '__main__.Tool'>)]>": 1,
              "<persistent reference ['m', ('game', '\\x00\\x00\\x00\\x00\\x00\\x00\\x12\\x03', <class '__main__.EnergyPack'>)]>": 1}}

Those look like cross-database references to me.

The original error (aaaugh Mutt makes it hard for me to look upthread while
I'm writing a response) was something about non-hashable lists? Looks like
a piece of code is trying to put persistent references into a dict, which
can't possibly work in all cases.

See ZODB.serialize.ObjectReader._persistent_load for the canonical parser
of the various possible formats. ZODB.ConflictResolution.PersistentReference.__init__
is much clearer, though perhaps a tiny bit less canonical.

> During my checks I realized that running the pack in a Python 2.7
> environment (using the same ZODB version - 3.10.3) works fine, the
> pack reduces our 4GB storage to 1GB. But our production server uses
> Python 2.6 (same ZODB3.10.3) which yields the problem (though the test
> had been done on OS X 10.7.3 - 64bit, and the production server is
> Debian Squeeze 32bit).

I've no idea why running the same ZODB version on Python 2.7 instead of 2.6
would make this error go away.

Incidentally, since you use cross-database references, please make sure
they continue to work after you pack your storage.
I've lost data that way (the ZODB garbage collector doesn't see references
that exist in other storages, and can assume objects are garbage when it
shouldn't). Packing with GC disabled ought to be safe.

Marius Gedminas
-- 
The world is really obsessing over the UI preferences of the person who gave us git?
        -- Matthew Garrett
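
P.S. A minimal sketch of why list-shaped persistent references cannot serve as dict keys, assuming the original error really was a TypeError about an unhashable list. It is written for Python 3 and uses only the standard pickle module, not ZODB. Ref, RefPickler, RawUnpickler, HashableUnpickler and dump_state are made-up names; the reference shape ['m', (database, oid, class)] is copied from the dump above, with the class slot replaced by a plain string.

```python
import io
import pickle


class Ref(object):
    """Hypothetical stand-in for an object that lives in another database."""

    def __init__(self, database, oid):
        self.database = database
        self.oid = oid


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Ref):
            # Same shape as in the dump; 'Tool' is a string here, not a class.
            return ['m', (obj.database, obj.oid, 'Tool')]
        return None  # everything else is pickled normally


class RawUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Hands the reference back as the list it was stored as.
        return pid


class HashableUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Flatten the reference into a tuple so it can be a dict key.
        kind, (database, oid, _klass) = pid
        return (kind, database, oid)


def dump_state():
    """Pickle a state dict whose inner dict is keyed by a persistent reference."""
    buf = io.BytesIO()
    # Protocol 1 mirrors the dump; protocol 0 would require string ids.
    pickler = RefPickler(buf, protocol=1)
    pickler.dump({'data': {Ref('game', b'\x00\x00\x00\x00\x00\x00\tT'): 1}})
    return buf.getvalue()


if __name__ == '__main__':
    data = dump_state()
    try:
        RawUnpickler(io.BytesIO(data)).load()
    except TypeError as exc:
        print('raw list reference failed:', exc)
    print(HashableUnpickler(io.BytesIO(data)).load())
```

The lambda in the session above gets past the same problem by turning each reference into a string before the dict is rebuilt.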
_______________________________________________
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list - ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev