[Chris Withers, on ZEO caches]

> Yeah, I saw Jeremy's wiki page about that. It seemed you guys made a lot
> of progress, did anything get taken forward to a release?
The ZODB 3.2 ZEO cache got tweaked as a result. The ZODB 3.3 ZEO cache is
entirely different, but is still more of a first cut than one of the
advanced designs we were looking at.

> ...
> OK, that makes me think that maybe simul.py has just got out of sync, it
> seems to have attracted less attention...

I don't remember anything about the simul.py in Zope 2.7.2 (which you said
you were using) -- too many releases ago. Some version of simul.py was very
heavily used by Jeremy and me, but I don't recall where it lived (maybe it
was even on a now-forgotten branch). simul.py should be fixed, but doing so
isn't in my foreseeable plans.

>> Try various sizes and judge results against whatever function you're
>> trying to optimize.

> Urg. simul.py was supposed to provide an alternative *schniff*

If theoretical hit rate is all that matters to you, yes. That's all
simul.py can do when it works. It can't model effects due to your OS file
caching gimmicks, competition for RAM, competition for L1 and L2 HW memory
caches, competition for disk I/O, competition for CPU cycles, competition
for network bandwidth ... nothing "real world", just theoretical hit rate.

Even that ignores that some objects are much bigger than others, and so
also more expensive to refetch from the server. "A hit" on a 128-byte
object is treated the same as "a hit" on a million-byte object, and same
for "a miss". Etc. It's gross and unrealistic. Quite possibly "better than
nothing", but certainly worse than _trying_ changes.

>> The obvious one is more disk space required. If you use a persistent
>> ZEO cache, then cache verification time at ZEO client connect/reconnect
>> times may also increase proportionately. Other than those, bigger is
>> probably better, and the 20MB (? whatever) default size is much smaller
>> than usually desirable (it's left over from days when typical disks were
>> much smaller than they are now). Try, e.g., 200MB. Like the results
>> better? Iterate.
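To make the try-a-size-and-iterate loop concrete, here's roughly where that
knob lives in a Zope 2.7-era zope.conf. The server address, client name, and
database name here are placeholders I made up, so check them against your own
config before copying anything:

```
<zodb_db main>
    # Target number of objects in each ZODB (Connection/"pickle") cache.
    cache-size 5000
    <zeoclient>
      server localhost:8100
      # The disk-based ZEO client cache; the old 20MB default is tiny.
      cache-size 200MB
      # Naming the client makes the ZEO cache persistent across restarts
      # (which is also what makes verification time grow with cache size).
      client zeo1
    </zeoclient>
</zodb_db>
```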
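For what a theoretical-hit-rate simulation boils down to, here's a toy
sketch in the spirit of -- but definitely not the code of -- simul.py. It
assumes a plain LRU policy over bare object ids, and ignores object sizes
and every real-world cost listed above, which is exactly the criticism:

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Theoretical hit rate of an LRU cache holding `capacity` objects,
    replayed over `trace` (a sequence of object ids).

    Illustrative only: like simul.py, this measures nothing but hit rate --
    no object sizes, no I/O costs, no OS file caching, no contention.
    """
    cache = OrderedDict()
    hits = 0
    for oid in trace:
        if oid in cache:
            hits += 1
            cache.move_to_end(oid)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[oid] = True
    return hits / len(trace)
```

E.g., replaying the trace `[1, 2, 1, 3, 1, 2]` through a 2-object cache
gives a hit rate of 1/3, while a 3-object cache gives 1/2 -- the kind of
"bigger is probably better" curve you'd otherwise map out by iterating over
real cache sizes.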
>> Note that while the ZEO cache is disk-based, it does have in-memory
>> index structures taking space proportional to the number of objects
>> cached. I suppose that if the cache file were big enough to hold
>> millions of objects, the RAM consumed by those indices could become
>> burdensome. Haven't heard of that happening in real life, though.

> OK, so what would you recommend for achieving best "zodb speed" (ie:
> don't care about disk or memory usage, unless they affect speed) on ZEO
> client servers that are dual processor boxes and have one zeo client per
> processor? How about 2 clients per processor?

The only realistic approach is what I already suggested: change the size
and measure results, on your data, your HW, your OS, your app's object
access patterns, and using your idea of what "better" means.

If you're serious, you also need to play with changing the target number of
objects in your ZODB (Connection; in-memory; "pickle") caches. If you have
enough RAM, boosting that can have a much bigger primary effect on "ZODB
speed" than fiddling the ZEO cache. All object requests go to the ZODB
cache first. The ZEO cache is consulted only when the ZODB cache misses.
The ZODB cache also has a semi-intelligent replacement strategy (LRU); the
ZEO cache's replacement strategy (whether in 3.2 or 3.3) is more an
artifact of what's reasonably easy to implement using a total of one or two
disk files than it is a theoretically desirable strategy.

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
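[P.S. on the lookup order above -- ZODB in-memory cache first, ZEO disk
cache only on a miss, server only when both miss. A toy two-level lookup in
plain Python; the class and every name in it are invented for illustration,
not ZODB's API:]

```python
class TwoLevelCache:
    """Toy model of the lookup order described above: a small in-memory
    (ZODB/Connection-style) cache is consulted first; the larger on-disk
    (ZEO-style) cache only on a miss; the server only when both miss.
    Invented for illustration -- this is not ZODB's API."""

    def __init__(self, fetch_from_server):
        self.memory = {}        # stands in for the ZODB "pickle" cache
        self.disk = {}          # stands in for the ZEO client cache
        self.fetch = fetch_from_server
        self.server_loads = 0   # how often both caches missed

    def load(self, oid):
        if oid in self.memory:      # ZODB cache hit: cheapest path
            return self.memory[oid]
        if oid in self.disk:        # ZEO cache hit: no server round-trip
            obj = self.disk[oid]
        else:                       # both missed: fetch from the server
            obj = self.fetch(oid)
            self.server_loads += 1
            self.disk[oid] = obj
        self.memory[oid] = obj
        return obj
```

This is also why boosting the in-memory cache's object count can matter
more than the ZEO cache size: hits there never reach the lower levels at
all.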