On 19 Feb 2014, at 10:44 pm, Jim Fulton <j...@zope.com> wrote:

> On Tue, Feb 18, 2014 at 8:59 PM, Dylan Jay <d...@pretaweb.com> wrote:
>> Hi,
>> 
>> I'm seeing a ZCatalog reindex of a large number of objects take a long
>> time while only using 10% CPU. I'm not sure yet whether this is because the
>> objects are large and the network is saturated, or because the ZEO file
>> reads aren't fast enough.
> 
> How heavily loaded is your storage server, especially %CPU of the
> server process?

No, not heavily loaded.

> 
> Also, are the ZODB object or client caches big enough for the job?

I'm not sure the caches would ever be big enough since it's iterating over 1.7M 
objects.

> 
>> However, looking at the protocol I didn't see a way for code such as the
>> ZCatalog to give ZEO a hint about what it wants to load next, so the time
>> goes to network round trips rather than to either ZEO or the app. Is that
>> the case?
> 
> It is the case that a ZEO client does one read at a time and that
> there's no easy way to pre-load objects.
> 
>> I'm guessing that if it is, it's a fundamental design problem that can't be
>> fixed :(
> 
> I don't think there's a *fundamental* problem.  There are three
> issues. The hardest to solve isn't at the storage level. I'll mention
> the 2 easiest problems first:
> 
> 1. The ZEO client implementation only allows one outstanding request at a
>    time, even on a client with multiple threads.  This is merely a clumsy
>    implementation.
> 
>    The protocol easily allows for multiple outstanding reads!
> 
> 2. The storage API doesn't provide a way to read multiple objects at once,
>    or to otherwise hint that additional objects will be loaded.
> 
> Both of these are fairly straightforward to fix. It's just a matter of
> time. :)
> 
> 3. You have to be able to predict what data are going to be needed.
> 
>    This IMO is rather hard, at least at a general level. It's what's left
>    me somewhat under-motivated to address the first 2 problems.
> 
> We really should address problems 1 and 2 to make it possible
> for people to experiment with approaches to problem 3.
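
To make problems 1 and 2 concrete, here is a rough sketch of why several
outstanding reads matter. It is only a simulation: SlowStorage, its latency
parameter, load_one_at_a_time and load_many are all hypothetical names made up
for illustration, not ZEO or ZODB APIs, and the threads merely stand in for
overlapping requests on the wire.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SlowStorage:
    """Hypothetical stand-in for a remote storage; not the ZEO client."""

    def __init__(self, data, latency=0.01):
        self.data = data
        self.latency = latency  # simulated network round trip, in seconds

    def load(self, oid):
        time.sleep(self.latency)  # pretend to wait for the server
        return self.data[oid]


def load_one_at_a_time(storage, oids):
    # One outstanding request: total time is roughly len(oids) * latency.
    return [storage.load(oid) for oid in oids]


def load_many(storage, oids, max_outstanding=10):
    # Several outstanding requests: round trips overlap instead of queueing.
    # pool.map returns results in the same order as `oids`.
    with ThreadPoolExecutor(max_workers=max_outstanding) as pool:
        return list(pool.map(storage.load, oids))
```

Both functions return the same objects in the same order. Under these
assumptions the second one spends most of its time waiting on overlapping
round trips rather than on one round trip per object, which is the effect a
multi-object read or a load hint in the storage API would be after.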

Yeah, I figured it might be the case that it's hard to predict. In this case
it's catalog indexing, so I was wondering if something could be done with
__iter__ on a BTree? If the first few objects are loaded, it's a reasonably
good guess that you could start preloading more of them.
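
Roughly what I have in mind, as a minimal sketch rather than a proposal for
the real API: an iterator that keeps a small window of loads in flight ahead
of the consumer. The names iter_with_readahead, load and window are
hypothetical, and load stands in for whatever would fetch one object from the
storage server; nothing here is an existing BTrees, ZODB or ZEO call.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def iter_with_readahead(keys, load, window=8):
    """Yield (key, load(key)) in key order with up to `window` loads in flight.

    `load` is a hypothetical single-object fetch, not a ZODB or ZEO API.
    """
    keys = iter(keys)
    with ThreadPoolExecutor(max_workers=window) as pool:
        # Prime the window with the first few keys.
        pending = deque((k, pool.submit(load, k)) for k in islice(keys, window))
        while pending:
            key, future = pending.popleft()
            # Top the window back up before blocking on the oldest request.
            for nxt in islice(keys, 1):
                pending.append((nxt, pool.submit(load, nxt)))
            yield key, future.result()
```

A reindex loop could then iterate with something like
`for key, obj in iter_with_readahead(keys, load, window=16)`, so that the
network wait for the next few objects overlaps with indexing the current one.
Whether guessing "the next keys in iteration order" is good enough is exactly
the prediction question in point 3.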

> 
> Jim
> 
> -- 
> Jim Fulton
> http://www.linkedin.com/in/jimfulton
