>>  Hi,
>> [...snip...]
>>  a. our site is image heavy (36293 blob files) and the servers are behind a
>>  load balancer so in a single request to the web-app (a repoze.bfg site) we
>>  might even load collectively 20+ blobs from any of the 4 servers.
>>  b. zeo connection string on the clients
>>  zodb_uri =
>> zeo://xxx.xxxx.xxx.xxx:8886/?blob_dir=%(here)s/../var/blobs&shared_blob_dir=false&connection_pool_size=50&cache_size=1024MB&drop_cache_rather_verify=true
>>  c. $ cat var/blobs/.layout
>>  zeocache
>>  Any comments/suggestions on how to isolate and fix this problem would be
>>  appreciated.
> We have a number of large apps with multiple terabytes of blobs and a
> vaguely similar configuration. We haven't seen this sort of problem.
> One difference is that we set the blob cache size.  I don't suppose
> you're running out of disk space?

No, we aren't running out of disk space, and I hadn't really thought about
a limit on the blob cache size (I assumed there would be a built-in default).
In fact, I didn't know this was configurable, since it isn't mentioned in the
docs for repoze.zodbconn (which is what we use to connect to the db)[1].
Fortunately, it appears that repoze.zodbconn just passes along the parameters
it doesn't understand to the underlying ClientStorage connector. I shall try
setting a limit (which instinctively seems like a good thing to do anyway, to
keep the original state and the cache state from going out of sync).
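
For reference, a minimal sketch of how the limit could be added to the
existing URI. It assumes that repoze.zodbconn really does forward an unknown
`blob_cache_size` parameter to ClientStorage and converts it to an integer,
which I have not verified. ClientStorage takes `blob_cache_size` in bytes. The
helper name `with_blob_cache_size` and the 10 GB figure are made up for
illustration.

```python
def with_blob_cache_size(uri, size_bytes):
    """Append a blob_cache_size query parameter (in bytes) to a zeo:// URI.

    Hypothetical helper: it only edits the string and leaves %(here)s
    placeholders untouched. Whether repoze.zodbconn forwards the parameter
    to ClientStorage is an assumption to check.
    """
    if "blob_cache_size=" in uri:
        raise ValueError("URI already sets blob_cache_size")
    separator = "&" if "?" in uri else "?"
    return f"{uri}{separator}blob_cache_size={int(size_bytes)}"


if __name__ == "__main__":
    # Example: cap the local blob cache at roughly 10 GB (illustrative value).
    base = "zeo://localhost:8886/?blob_dir=%(here)s/../var/blobs&shared_blob_dir=false"
    print(with_blob_cache_size(base, 10 * 1024 ** 3))
```

The resulting string would go in the `zodb_uri` setting in place of the
current one.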

> The only suggestion I have is to keep an eye on it and try to
> reproduce the problem.

Yes, I shall attempt to do that on a dev instance of the app.

> I would think that if a request returns an
> incorrect Blob, it would continue to. If someone reports a bad blob,
> get the URL and see if you can reproduce by making the same request to
> each of the app servers, bypassing the load balancer.  If one server
> is being bad, you can remove it from the LB pool to debug it.

That's the problem. I'd assumed the same behavior. Unfortunately, it appears
that incorrect blobs aren't always returned for subsequent requests. I traced
one such request to a single server, removed it from the LB and checked the
URL, but the request didn't give me an incorrect image. I wonder, would this
depend on the connection pool or the thread serving the request?
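
Since the bad responses seem intermittent, the per-server check suggested
above could be repeated a few times and compared by hash. A rough sketch
follows, assuming each app server can be reached directly over HTTP. The
function names, host names and path are hypothetical placeholders, not our
real setup.

```python
import hashlib
from collections import Counter
from urllib.request import urlopen


def fetch_bytes(url, timeout=10):
    """Fetch the raw response body for a URL."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def collect_digests(servers, path, rounds=5, fetch=fetch_bytes):
    """Request `path` from every server `rounds` times, bypassing the LB.

    Returns {server: set of SHA-256 hex digests seen for that server}.
    """
    seen = {base: set() for base in servers}
    for _ in range(rounds):
        for base in servers:
            body = fetch(base.rstrip("/") + path)
            seen[base].add(hashlib.sha256(body).hexdigest())
    return seen


def suspects(seen):
    """Servers whose responses differ from the most widely seen digest."""
    counts = Counter(d for digests in seen.values() for d in digests)
    expected = counts.most_common(1)[0][0]
    return sorted(base for base, digests in seen.items() if digests != {expected})


if __name__ == "__main__":
    # Placeholder hosts and path; substitute the real app servers and image URL.
    app_servers = ["http://app1:8080", "http://app2:8080"]
    print(suspects(collect_digests(app_servers, "/some/image/url")))
```

A server that shows more than one digest for the same URL across rounds would
point at something request-dependent (connection or thread) rather than a
permanently bad cache file.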

Anyway, thanks for your help. I shall test and report back if I find a
way to reproduce this.

- steve

[1] http://docs.repoze.org/zodbconn/narr.html#zeo-uri-scheme
