What should I do to have a data structure that scales well in terms of memory?

Consider the following large btree:

$ ./debugzope

    >>> from BTrees.OOBTree import OOBTree
    >>> root['btree'] = OOBTree()
    >>> for i in xrange(700000):
    ...   root['btree'][i] = tuple(range(i, i+30))
    >>> import transaction
    >>> transaction.commit()

Quit and restart ./debugzope.

Now I just want to know if some value is in the btree:

    >>> 'value' in root['btree'].values()

or compute its length:

    >>> len(root['btree'])

(I already keep some separate, lazily updated bookkeeping for the length, but even if len() is time-consuming on a btree, it should at least be feasible from a memory point of view.)
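
For reference, the kind of bookkeeping I mean is roughly this (only a sketch; the btree_length key and the insert helper are made-up names for illustration, using a BTrees.Length.Length counter stored next to the btree):

    >>> from BTrees.Length import Length
    >>> root['btree_length'] = Length()         # conflict-friendly counter
    >>> def insert(key, value):
    ...     if key not in root['btree']:        # only count new keys
    ...         root['btree_length'].change(1)
    ...     root['btree'][key] = value
    >>> root['btree_length']()                  # current count, O(1)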

Either operation loads the whole btree into memory (~500 MB), and that memory never gets released! If the btree keeps growing (>2 GB), how will I be able to use it at all?

I've tried scanning the btree in slices with root['btree'].itervalues(min, max), and calling transaction.abort()/commit()/savepoint()/anything() between the slices. But every slice I read allocates yet more memory, and once the whole btree has been scanned slice by slice, it is as if the whole thing were in memory.
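
For concreteness, the slice scan looked roughly like this (only a sketch; the batch size and the abort() call are just one of the variants I tried):

    >>> import transaction
    >>> batch = 10000
    >>> for lo in xrange(0, 700000, batch):
    ...     for v in root['btree'].itervalues(lo, lo + batch - 1):
    ...         pass                      # inspect v here
    ...     transaction.abort()           # hoped this would release the slice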

I've also tried with lists; the result is the same, except that the memory gets eaten even more quickly.

What I understand is that the ZODB wakes up every object it touches, and Python's (2.4) memory allocator never releases the memory back to the operating system. Is there a solution, or something I missed in the API of ZODB, BTrees, or Python itself?
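
For example, I wondered whether explicitly shrinking the connection's object cache between slices would change anything, roughly like this (a sketch, assuming the connection behind root is reachable as root._p_jar):

    >>> conn = root._p_jar
    >>> conn.cacheMinimize()              # turn cached objects back into ghosts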

