We're evaluating AWS for some of our applications and I'm thinking of adding
some options to support using S3 to store Blobs:
1. Allow a storage in a ZEO storage server to store Blobs in S3.
This would probably go through some sort of abstraction so that it
doesn't actually depend on S3. It would likely leverage the fact that
a storage server's interaction with blobs is more limited than an application's.
2. Extend blob objects to provide an optional URL to fetch data
from. This would allow applications to provide S3 (or similar service)
URLs for blobs, rather than serving blob data themselves.
2.1 If I did this I think I'd also add a blob size property, so you could
get a blob's size without opening the blob file or downloading
it from a database server.
3. Handle blob URLs at the application level.
To make this work for the S3 case, I think we'd have to expose a
ZEO server connection for application code to call. Something like:

self.blob = ZODB.blob.Blob()
f = self.blob.open('w')
f.write(data)
f.close()
Option 1 is fairly straightforward and low-risk.
Option 2 is much trickier:
- It's an API change
- There are bits of implementation that depend on the
  current blob record format. I'm not sure whether these
  bits extend beyond the ZODB code base.
- The handling of blob object state would be a little
  delicate, since some of the state would be set by the storage.
- The win depends on being able to load a blob
  file independently of loading blob objects, although
  the ZEO blob cache implementation already depends on this.
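To illustrate what option 2 (plus 2.1) would buy applications, here's a
rough sketch with made-up names -- this is not the current ZODB.blob.Blob
API: the storage sets an optional URL and size on the blob when it loads
the blob record, and application code hands clients the URL instead of
serving the data itself.

```python
class Blob:
    # Sketch of the extended blob API from option 2 -- an assumption,
    # not current ZODB.  The storage would set ``url`` and ``size``
    # from the (new) blob record format when loading the object.
    def __init__(self, url=None, size=None):
        self.url = url    # e.g. an S3 URL, set by the storage
        self.size = size  # blob size, known without opening the file

def serve(blob):
    # Application code: redirect to the blob's URL when the storage
    # provided one, rather than serving the blob data itself.
    if blob.url is not None:
        return 'Location: %s' % blob.url
    return 'serve blob data locally'
```

The size property covers the 2.1 case: blob.size is available without
opening the blob file or downloading it from a database server.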
For more information about ZODB, see the ZODB Wiki:
ZODB-Dev mailing list - ZODB-Dev@zope.org