On Thu, Jan 30, 2014 at 9:58 AM,  <jason.mad...@nextthought.com> wrote:
> Hello ZODB dev,
>
> I was recently trying to GC a large multi-database setup for the first time 
> using zc.zodbdgc. The process wouldn't complete (or really even get started) 
> because of an IndexError being thrown from `zc.zodbdgc.getrefs` (__init__.py 
> line 287). As I traced through it, it began to look like the combination of 
> `cPickle.Unpickler.noload` and multi-database persistent ids (which in ZODB 
> are list objects) fails, generating an empty list instead of the expected 
> [ref type, args] list documented in `ZODB.serialize`. This makes it 
> impossible to correctly GC a multi-database.
>
> I was curious if anyone else had seen this

I haven't. I'm the author of zodbdgc and I use it regularly, including on
large (for some definition) databases.

>, or maybe I'm just doing something wrong? We solved our problem by using 
>`load` instead of `noload`, but I wondered if there might be a better way?
>
> Details:
>
> I'm working under Python 2.7.6 and 2.7.3 with ZODB 4.0.0, zc.zodbdgc 0.6.1 
> and eventually zodbpickle 0.5.2. Most of my results were repeated on both Mac 
> OS X and Linux.

Why are you using zodbpickle?  Perhaps that is behaving differently
from cPickle in some fashion?


> After hitting the IndexError, I began debugging the problem. When it became 
> clear that the persistent_load callback was simply getting the wrong 
> persistent ids passed to it (empty lists instead of complete multi-db refs), 
> I tried swapping in zodbpickle for the stock cPickle to the same effect. 
> Here's some code demonstrating the problem:
>
>
> This pickle data came right out of ZODB, captured during a debug session of 
> zc.zodbdgc. It has three persistent ids, two cross database and one in the 
> same database:
>
>     >>> p = 'cBTrees.OOBTree\nOOBTree\nq\x01.((((X\x0c\x00\x00\x00Users_1_Prodq\x02]q\x03(U\x01m(U\x0cUsers_1_Prodq\x04U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x05czope.site.folder\nFolder\nq\x06tq\x07eQX\x0c\x00\x00\x00Users_2_Prodq\x08]q\t(U\x01m(U\x0cUsers_2_Prodq\nU\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x0bh\x06tq\x0ceQX\x0b\x00\x00\x00dataserver2q\r(U\x08\x00\x00\x00\x00\x00\x00\x00\x10q\x0eh\x06tQttttq\x0f.'
>
> This code is copied and pasted from `zc.zodbdgc.getrefs`. It's supposed to 
> find all the persistent refs and put them inside the `refs` list:
>
>     >>> import cPickle
>     >>> import cStringIO
>     >>> refs = []
>     >>> u = cPickle.Unpickler(cStringIO.StringIO(p))
>     >>> u.persistent_load = refs
>     >>> u.noload()
>     >>> u.noload()
>
> But if we look at `refs`, we see that the first two cross-database refs are 
> returned as empty lists, not the correct value:
>
>     >>> refs
>     [[], [], ('\x00\x00\x00\x00\x00\x00\x00\x10', None)]
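
One way to see why the first two ids might come back empty is to list the opcodes in the record. The sketch below uses Python 3's standard `pickletools` (an assumption; the session above is Python 2 with cPickle) on the same bytes as `p`, written as a bytes literal named `RECORD`. `opcode_names` is a hypothetical helper. It shows that each cross-database id is an empty list filled by a later APPENDS, while the same-database id is a plain tuple. That would be consistent with `noload` leaving lists unpopulated, though this is not confirmed here against the cPickle source.

```python
import io
import pickletools

# Same bytes as `p` above, split across lines for readability.
RECORD = (
    b'cBTrees.OOBTree\nOOBTree\nq\x01.'
    b'((((X\x0c\x00\x00\x00Users_1_Prodq\x02]q\x03(U\x01m(U\x0cUsers_1_Prodq\x04'
    b'U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x05czope.site.folder\nFolder\nq\x06tq\x07eQ'
    b'X\x0c\x00\x00\x00Users_2_Prodq\x08]q\t(U\x01m(U\x0cUsers_2_Prodq\n'
    b'U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x0bh\x06tq\x0ceQ'
    b'X\x0b\x00\x00\x00dataserver2q\r(U\x08\x00\x00\x00\x00\x00\x00\x00\x10q\x0eh\x06tQ'
    b'ttttq\x0f.'
)


def opcode_names(data):
    """Return opcode names for both pickles in a record (class, then state)."""
    stream = io.BytesIO(data)
    names = []
    for _ in range(2):
        # genops stops at each STOP opcode, leaving the stream at the next pickle.
        names.extend(op.name for op, arg, pos in pickletools.genops(stream))
    return names


names = opcode_names(RECORD)
# Opcode executed immediately before each persistent-id lookup (BINPERSID).
before_persid = [names[i - 1] for i, n in enumerate(names) if n == 'BINPERSID']
print(before_persid)  # ['APPENDS', 'APPENDS', 'TUPLE']
```

The two ids that end in APPENDS are exactly the two that show up as `[]` in `refs`.
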
>
> If instead we use `load` to read the state, we get the correct references:
>
>     >>> refs = []
>     >>> u = cPickle.Unpickler(cStringIO.StringIO(p))
>     >>> u.persistent_load = refs
>     >>> u.noload()
>     >>> u.load()
>
>     >>> refs
>     [['m', ('Users_1_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', <class 'zope.site.folder.Folder'>)],
>      ['m', ('Users_2_Prod', '\x00\x00\x00\x00\x00\x00\x00\x01', <class 'zope.site.folder.Folder'>)],
>      ('\x00\x00\x00\x00\x00\x00\x00\x10', <class 'zope.site.folder.Folder'>)]
>
> The results are the same using zodbpickle or using an actual callback 
> function instead of the append-directly-to-list shortcut.
>
> If we fix the IndexError by checking the size of the list first, we miss all 
> the cross-db references, meaning that a GC is going to be too aggressive. But 
> using `load` is slower and requires access to all of the classes referenced. 
> If anyone has run into this before or has other suggestions, I'd appreciate 
> hearing them.
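
If needing access to every referenced class is the main objection to `load`, one possible workaround is to override `find_class` so that globals resolve to throwaway placeholder classes. This is only a sketch under assumptions: it uses Python 3's standard `pickle` rather than cPickle or zodbpickle; `RefCollector`, `_Stub`, and `collect_refs` are hypothetical names; and it is not how zc.zodbdgc works. It is written with records like the one above in mind; states that rely on list or dict subclasses would need more stub methods.

```python
import io
import pickle

# Same bytes as `p` above, split across lines for readability.
RECORD = (
    b'cBTrees.OOBTree\nOOBTree\nq\x01.'
    b'((((X\x0c\x00\x00\x00Users_1_Prodq\x02]q\x03(U\x01m(U\x0cUsers_1_Prodq\x04'
    b'U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x05czope.site.folder\nFolder\nq\x06tq\x07eQ'
    b'X\x0c\x00\x00\x00Users_2_Prodq\x08]q\t(U\x01m(U\x0cUsers_2_Prodq\n'
    b'U\x08\x00\x00\x00\x00\x00\x00\x00\x01q\x0bh\x06tq\x0ceQ'
    b'X\x0b\x00\x00\x00dataserver2q\r(U\x08\x00\x00\x00\x00\x00\x00\x00\x10q\x0eh\x06tQ'
    b'ttttq\x0f.'
)


class _Stub(object):
    """Placeholder base: accepts and ignores constructor arguments and state."""

    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, *args, **kwargs):
        pass

    def __setstate__(self, state):
        pass


class RefCollector(pickle.Unpickler):
    """Unpickler that records persistent ids without importing any class."""

    def __init__(self, file):
        # encoding="bytes" keeps Python 2 str values (such as oids) as bytes.
        super().__init__(file, encoding="bytes")
        self.refs = []
        self._stubs = {}

    def find_class(self, module, name):
        # Never import; hand back one stub class per (module, name) pair.
        key = (module, name)
        if key not in self._stubs:
            self._stubs[key] = type(name, (_Stub,), {"__module__": module})
        return self._stubs[key]

    def persistent_load(self, pid):
        # Record the id; the referenced object itself is not needed.
        self.refs.append(pid)
        return None


def collect_refs(data):
    """Load the class pickle, then the state pickle, and return the ids seen."""
    unpickler = RefCollector(io.BytesIO(data))
    unpickler.load()  # class metadata
    unpickler.load()  # object state
    return unpickler.refs


refs = collect_refs(RECORD)
```

On the sample record this yields three ids, with the two multi-database ones intact as `[b'm', (database name, oid, stub class)]`. It still builds every object in the state, so it will not be as cheap as `noload`.
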

I'd try using ZODB 3.10.  I suspect a ZODB 4 incompatibility of some sort.

Unfortunately, I don't have time to dig into this now.

This weekend, I'll at least see if I can make zodbdgc tests pass with ZODB 4.
Perhaps that will shed light.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
_______________________________________________
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev
