Max M wrote:

Nguyen Quan Son wrote:
> Hi,
> I have a problem with performance and memory consumption when trying to do some statistics, using following code:
> ...
> docs = container.portal_catalog(meta_type='Document', ...)
> for doc in docs:
>     obj = doc.getObject()
>     value = obj.attr
> ...
> With about 10.000 documents this Python script takes 10 minutes and more than 500MB of memory, after that I had to restart Zope. I
> am running Zope 2.6.1 + Plone 1.0 on Windows 2000, Xeon P4 with 1GB RAM.
> What's wrong with this code? Any suggestion is appreciated.
> Nguyen Quan Son.

Most likely you are filling the memory of your server so that you are swapping to disk.

Try cutting the query into smaller pieces so that the memory doesn't get filled up.
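One way to read "smaller pieces" is to handle one slice of the result set per request, so the cache can be trimmed between requests. Below is a minimal standalone sketch of that idea, not tested against Zope. The names process_slice and read_value are made up for illustration, and it assumes the catalog result supports len() and slicing the way a list does.

```python
def process_slice(results, start, size, read_value):
    """Process results[start:start + size]; return (values, next_start).

    next_start is None once the end of the results has been reached,
    otherwise it is the offset to pass in on the following call.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    chunk = results[start:start + size]
    values = [read_value(item) for item in chunk]
    next_start = start + size
    if next_start >= len(results):
        next_start = None
    return values, next_start


# Stand-in data; in Zope this would be the catalog result, and
# read_value would fetch the attribute of interest for one entry.
data = list(range(10))
first_values, next_start = process_slice(data, 0, 4, lambda x: x * 2)
```

The caller keeps passing next_start back in (for example from separate requests) until it comes back as None, accumulating the statistics as it goes.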

If you can't use catalog metadata as Seb suggests (e.g. because you are accessing many attributes or large values) and memory is indeed the problem (which seems likely), then you can re-ghostify the objects that were ghosts to begin with. That will save memory, unless all of those objects are already in the cache.

The problem with this strategy is that the doc.getObject() method used in your code activates the object, so you can't tell whether it was already a ghost. To get around this, you can bypass that method and do something like:

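docs = container.portal_catalog(meta_type='Document', ...)
for doc in docs:
    # Traverse to the object by hand instead of calling doc.getObject()
    obj = doc.aq_parent.unrestrictedTraverse(doc.getPath())
    was_ghost = obj._p_changed is None  # None means the object is still a ghost
    value = obj.attr                    # this access loads the object's state
    if was_ghost:
        obj._p_deactivate()             # turn it back into a ghost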

You can test this by running your code on a freshly restarted server and checking the number of objects in the cache. The number shouldn't change much after running the code above, but it will increase dramatically if you use 'obj = doc.getObject()' instead, or if you leave out the deactivation. The lower number of objects in the cache should in turn keep your memory usage down, keep your machine from paging during the request, and hence speed things up considerably!
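If you want to convince yourself of the ghost bookkeeping without a Zope instance at hand, here is a rough standalone sketch. FakePersistent is a made-up stand-in that only imitates the two things the loop relies on (_p_changed being None for a ghost, and _p_deactivate() turning a loaded object back into one). It is not the real persistence machinery, and read_attr_keeping_ghosts is just an illustrative name.

```python
class FakePersistent:
    """Stand-in imitating the small part of the persistence API used here."""

    def __init__(self, attr):
        self._stored_attr = attr  # plays the role of the state kept in the ZODB
        self._p_changed = None    # None means "ghost": state not loaded

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. while 'attr' is unloaded.
        if name == "attr":
            self.__dict__["attr"] = self.__dict__["_stored_attr"]
            self.__dict__["_p_changed"] = False  # loaded and unmodified
            return self.__dict__["attr"]
        raise AttributeError(name)

    def _p_deactivate(self):
        # Simplification: drop the loaded state and become a ghost again.
        self.__dict__.pop("attr", None)
        self.__dict__["_p_changed"] = None


def read_attr_keeping_ghosts(obj, name):
    """Read an attribute, re-ghostifying obj if it was a ghost beforehand."""
    was_ghost = obj._p_changed is None
    value = getattr(obj, name)
    if was_ghost:
        obj._p_deactivate()
    return value


ghost = FakePersistent(42)
value = read_attr_keeping_ghosts(ghost, "attr")
still_ghost = ghost._p_changed is None  # the object went back to being a ghost
```

An object that was already loaded before the call is left loaded, which is the point: only objects your own loop woke up get put back to sleep.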

Another option would be to reduce the size of your cache so that the amount of memory your Zope instance consumes doesn't cause your computer to swap. The code changes above will also help keep the 'right' objects in your cache, which in turn will further help the performance of subsequent requests.


