Christian Theune wrote:
> Hi,
> one of our installations hit a scalability limit with the current blob
> directory structure.
> The current structure looks like:
> \blobs\
>     <oid>\
>         <tid>.blob
>         <tid>.blob
>         ...
>     <oid>\
>         ...
>     ...\
> We hit a limit with a database that contains more than 32k blob objects
> because ext3 doesn't allow more than 32k subdirectories in a directory.
> We propose to introduce a new mode for the blob storage which breaks the
> directory structure into one level per byte of the oid. This would lead to
> directories 0x00-0xFF nested in 8 levels. 
> The last directory denotes the blob itself and looks like the current
> directory: a list of blob files named by the tids they were committed for.
> We propose to keep both implementations around and let users select which
> one to use. We would extend the FileSystemHelper to abstract the two
> strategies.
> We would also provide a migration tool that can convert the old format to the
> new format.
> Comments?

The squid cache directories are built like this, but they use the least
significant byte first, I think to provide for better hashing (since
OIDs are linearly increasing).
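
To make the two orderings concrete, here is a minimal sketch, assuming
8-byte OIDs and directory names of the form 0x00-0xff. The helper name
oid_to_path and its least_significant_first flag are made up for
illustration; this is not the FileSystemHelper API, and I haven't checked
how squid actually names its directories.

```python
import os
import struct


def oid_to_path(base_dir, oid, least_significant_first=False):
    """Map an 8-byte OID to a nested path, one directory level per byte.

    Hypothetical helper for illustration only.
    """
    if len(oid) != 8:
        raise ValueError("expected an 8-byte OID")
    # One path component per OID byte, e.g. "0x00" ... "0xff".
    levels = ["0x%02x" % byte for byte in bytearray(oid)]
    if least_significant_first:
        # Squid-like ordering: the fastest-changing byte becomes the
        # top-level directory, so consecutive OIDs spread across the tree.
        levels.reverse()
    return os.path.join(base_dir, *levels)


# OID 258 is 0x0000000000000102 when packed as a big-endian 64-bit integer.
oid = struct.pack(">Q", 258)
proposed = oid_to_path("blobs", oid)
reversed_order = oid_to_path("blobs", oid, least_significant_first=True)
```

Either way, no directory above the leaf holds more than 256
subdirectories, which stays well under the ext3 limit. With the most
significant byte first, linearly increasing OIDs pile into the same few
top-level directories; reversing the order spreads them across all 256.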

You might also look at how the DirectoryStorage "formats" work: DS
provides different strategies with different "bushiness" of the tree,
based on the underlying filesystem's characteristics:


The strategy you propose sounds a lot like their "bushy" one, which fits
your use case (ext3, needs to limit number of directory entries sharply).

--
Tres Seaver          +1 540-429-0999          [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"    http://palladion.com

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
