On 7 July 2011 15:18, Jim Fulton <j...@zope.com> wrote:
> On Thu, Jul 7, 2011 at 9:13 AM, Laurence Rowe <l...@lrowe.co.uk> wrote:
>> On 6 July 2011 19:44, Jim Fulton <j...@zope.com> wrote:
>
> ...
>
>> Adding the ability to store blobs in S3 would be an excellent feature
>> for AWS based deployments. I'm not convinced that presenting S3 urls
>> to the end users is terribly useful as there is no ability to set a
>> Content-Disposition header and the url will not end with the correct
>> file extension, which will cause problems for users downloading files.
>
> My lack of S3 foo is showing.
>
>> I would imagine a more common setup would be to serve the S3 stored
>> blobs through a proxy server running in EC2, using something similar
>> to Nginx's X-Accel-Redirect. Lovely Systems has some information on
>> generating an S3 Authorization header in Nginx here:
>> http://www.lovelysystems.com/nginx-as-an-amazon-s3-authentication-proxy-2/
>> - though generating an authenticated S3 URL in Python to set in the
>> X-Accel-Redirect header would lead to much simpler proxy
>> configuration.
>
> I'll have to do some more digging and get back.
>
>> In either case though, I don't see why doing so would necessitate
>> changing the blob record format - presumably a blob's url can be
>> simply mapped from the S3 blobstorage configuration and a blob's oid
>> and tid?
>
> You're probably right. That's a much better approach.
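To make the oid/tid mapping concrete, here is a minimal sketch of what I have in mind. The key layout, the `prefix` default and the function names `blob_key` and `blob_url` are all made up for illustration; nothing like this exists in ZODB today. It only assumes that oids and tids are 8-byte strings and that the bucket is addressed with the usual virtual-hosted style S3 hostname.

```python
from binascii import hexlify


def blob_key(oid, tid, prefix="blobs"):
    """Map an 8-byte oid and tid to an S3 object key.

    The layout (prefix/0x<oid>/0x<tid>.blob) is a hypothetical choice,
    not an existing ZODB or S3 convention.
    """
    if len(oid) != 8 or len(tid) != 8:
        raise ValueError("oid and tid must both be 8 bytes long")
    oid_hex = hexlify(oid).decode("ascii")
    tid_hex = hexlify(tid).decode("ascii")
    return "%s/0x%s/0x%s.blob" % (prefix, oid_hex, tid_hex)


def blob_url(bucket, oid, tid, prefix="blobs"):
    """Build an unauthenticated URL for the blob from configuration alone."""
    return "https://%s.s3.amazonaws.com/%s" % (bucket, blob_key(oid, tid, prefix))
```

Since the key is derived purely from the storage configuration plus the oid and tid, nothing extra would need to be stored in the blob record.
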

One thing I found with my (rather naive) experiments building s3storage
a few years ago is that you need to ensure requests to S3 are made in
parallel to get reasonable performance. This would be less of a problem
with blobs, but even then you might have multiple file uploads in the
same request. The boto library is really useful, but doesn't support
async requests.

I guess the simplest implementation would only upload a blob to S3 in
tpc_begin, as that is where the tid is set (and presumably the tid will
form part of the blob's S3 url). With large files that might make
tpc_begin take a long time to complete while it waits for the blob data
to be uploaded to S3. It might be better to upload large blobs to a
temporary S3 url first and then only make an S3 copy in tpc_begin.
You'd need to do some benchmarks to see whether this is worthwhile for
all files or only for files over a certain size.

Laurence
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev
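
P.S. A rough sketch of the "upload to a temporary url in parallel, then copy once the tid is known" idea, in case it helps the discussion. Everything here is an assumption for illustration: `FakeS3` is an in-memory stand-in rather than boto's API, and `stage_blobs`, `finalize_blobs` and the key layout are hypothetical names. Threads are used only because the uploads are blocking calls.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeS3(object):
    """In-memory stand-in for a bucket (hypothetical, not boto's API)."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def put(self, key, data):
        with self._lock:
            self._objects[key] = data

    def copy(self, src, dst):
        with self._lock:
            self._objects[dst] = self._objects[src]

    def delete(self, key):
        with self._lock:
            self._objects.pop(key, None)

    def keys(self):
        with self._lock:
            return sorted(self._objects)


def stage_blobs(s3, blobs, max_workers=4):
    """Upload blobs to temporary keys in parallel, before the tid is known.

    blobs maps a hex oid string to the blob data. Returns {oid: temp_key}.
    A real implementation would need temporary keys that are unique
    across concurrent transactions.
    """
    def upload(item):
        oid, data = item
        temp_key = "tmp/%s" % oid
        s3.put(temp_key, data)
        return oid, temp_key

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(upload, blobs.items()))


def finalize_blobs(s3, staged, tid):
    """Once the tid is set, copy each staged blob to its final key."""
    final = {}
    for oid, temp_key in staged.items():
        key = "blobs/%s/%s.blob" % (oid, tid)
        s3.copy(temp_key, key)
        s3.delete(temp_key)
        final[oid] = key
    return final
```

The slow part (moving the bytes) happens in `stage_blobs`, so the step that has to wait for the tid is reduced to one copy per blob. Whether the extra copy pays off for small files is exactly what the benchmarks would need to show.
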