Having enough RAM to hold your entire database may not be practical.
Ideally, you want enough to hold the working set.  For many applications,
most of the database reads are from the later part of the file.  The working
set is often much smaller than the whole file.

That is a very good point. I will try to find that out; maybe I can take a FileStorage index file and calculate the distribution of object positions.
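As a rough sketch of that calculation: given the file offsets recorded in the index and the size of the data file, the share of objects in each slice of the file can be tallied as below. The function name `offset_distribution` and the sample offsets are hypothetical, and loading the offsets from a real FileStorage index is left out here. Note also that this measures where the current object records live, which is only a proxy for where the reads go.

```python
from collections import Counter


def offset_distribution(offsets, file_size, buckets=10):
    """Return the fraction of offsets that fall in each equal-width slice of the file."""
    if file_size <= 0 or buckets <= 0:
        raise ValueError("file_size and buckets must be positive")
    counts = Counter()
    total = 0
    for pos in offsets:
        # Clamp so that an offset equal to file_size lands in the last bucket.
        bucket = min(pos * buckets // file_size, buckets - 1)
        counts[bucket] += 1
        total += 1
    if total == 0:
        return [0.0] * buckets
    return [counts[b] / total for b in range(buckets)]


# Hypothetical offsets standing in for the positions stored in an index.
sample_offsets = [5, 120, 640, 700, 810, 905, 950, 990]
dist = offset_distribution(sample_offsets, file_size=1000, buckets=4)
for i, share in enumerate(dist):
    print(f"quarter {i + 1}: {share:.0%} of objects")
```

If most of the mass sits in the last buckets, that would support the idea that the working set is concentrated near the end of the file.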

I guess this would mean
random access (for a database like ours, in which we have many small
objects), which doesn't favor cache performance.

I don't see how this follows.

I meant that if we have to retrieve many different small pickles from disk, this will result in repeated accesses to random disk locations, which can be bad (depending on the granularity of the cache). However, considering what you've said above (that the working set tends to be in the later part of the file), maybe that's not the case.

The caches are still probably providing benefit, depending on how large they
are.  If you haven't, you should probably try using the ZEO cache-analysis
scripts to get a better handle on how effective our cache is and whether it
should be larger.

Will do so.
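To illustrate why cache size matters for a given access pattern, here is a minimal sketch that replays a trace of object ids through a simple LRU cache and reports the hit ratio. It is not the ZEO cache-analysis scripts and does not model the real ZEO cache (which is sized in bytes, not in objects); `lru_hit_rate` and the trace are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay object ids through an LRU cache of `capacity` entries; return the hit ratio."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache = OrderedDict()
    hits = 0
    total = 0
    for oid in trace:
        total += 1
        if oid in cache:
            hits += 1
            cache.move_to_end(oid)  # mark as most recently used
        else:
            cache[oid] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / total if total else 0.0


# Hypothetical trace: objects 1 and 2 are hot, the others are touched occasionally.
trace = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3]
small = lru_hit_rate(trace, capacity=2)
large = lru_hit_rate(trace, capacity=4)
print(f"capacity 2: {small:.0%} hits, capacity 4: {large:.0%} hits")
```

With this trace, the smaller cache keeps evicting the hot objects before they are reused, while the larger one holds the working set.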

I imagine that someone will eventually figure out how to use
memcached to implement a shared ZEO cache, as has been done
for relstorage.

That would be great.

At PyCon, I'll be presenting work I've been doing on a load
balancer that seeks to avoid sharing the same data in multiple
caches by assigning different kinds of work to different workers.

I will be at the conference and will attend for sure :)



José Pedro Ferreira

Software Developer, Indico Project

CERN - European Organization for Nuclear Research
1211 Geneve 23, Switzerland
IT-CIS-AVC
Office: 513-1-005
Tel. +41227677159

For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
