ZEO has two modes for dealing with client blob data: shared and
non-shared.  In shared mode, a distributed file system is used to share a
blob directory with a ZEO server.  This requires management of a  
distributed file system, in addition to the ZEO protocol.  Any caching  
is provided by the distributed file system.

In non-shared mode, blob data are downloaded to the ZEO client using  
the ZEO protocol.  No distributed file system is needed, and blob files
are cached locally. Unfortunately, the current implementation provides  
no facilities for managing the client cache. There are no provisions  
in the ZEO client software for removing unused blob files and the blob  
implementation makes almost no provision for blob file removal.

I'm working on refactoring ClientStorage's handling of non-shared blob  
data.  I'm implementing a mechanism for periodically cleaning out  
files that haven't been accessed in a while. As part of this, I'm  
going to radically change the layout of the ClientStorage's non-shared  
blob directory.

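To make the cleanup idea concrete, here is a rough sketch of what a
periodic sweep could look like.  It is only an illustration, not the
planned ClientStorage code: the function name remove_stale_blobs and
its parameters are made up for this example, and it assumes the file
system records access times (mounts with noatime would defeat it).

```python
import os
import time


def remove_stale_blobs(blob_dir, max_age_seconds, now=None):
    """Delete files under blob_dir not accessed within max_age_seconds.

    Hypothetical helper for illustration; returns the removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(blob_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # st_atime is the last access time reported by the OS
                if now - os.stat(path).st_atime > max_age_seconds:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or could not be removed; skip it
                continue
    return removed
```

Note that a sweep like this leaves empty directories behind, which is
one reason a nested layout makes cleanup more awkward.
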
Currently, the bushy layout, with deeply nested directories, is used.
While I think this layout makes some sense on the server, I don't  
think it makes much sense on the client.  Cleaning up unused blob  
files is complicated by the need to clean up directories too.  I'm  
going to go for a fairly flat layout.  There will be a small number  
(997) of directories and blob files will reside directly in these  
directories.  (The directory will be chosen by taking the remainder of  
dividing an oid by 997.)  It appears that modern operating systems can  
handle large directories just fine.  I've created directories with 1  
million files on Linux/Ext, Mac OS X/HFS+, and Windows XP/NTFS and saw  
no degradation in performance as the number of files in a directory
increased.

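The directory choice described above (oid modulo 997) can be sketched
as follows.  This is a sketch under assumptions: it treats the oid as
the usual 8-byte big-endian string, and both the helper name
blob_directory and the use of the decimal remainder as the directory
name are illustrative guesses, not the final layout.

```python
import os

NUM_DIRS = 997  # number of directories mentioned in this post


def blob_directory(blob_dir, oid):
    """Return the directory that would hold blob files for oid.

    Hypothetical helper: oid is assumed to be 8 bytes, big-endian.
    """
    oid_int = int.from_bytes(oid, "big")
    # Directory name is assumed to be the remainder written in decimal
    return os.path.join(blob_dir, str(oid_int % NUM_DIRS))
```

For example, an oid with integer value 1000 would land in directory
"3" under the blob directory, since 1000 % 997 == 3.
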
I plan to have ClientStorage use the file layout mentioned above.  The  
ClientStorage constructor will fail if an older layout is found. An  
alternative is to just log a warning and ignore the existing  
directories, as the new directories will have non-overlapping names.

I mention this both as a heads up and to see if anyone can point out a  
problem with my approach.  I have a feeling that no one is using
non-shared client blob directories for anything important yet, so I assume
the change won't have much effect.

Comments are welcome, but not necessary. :)


Jim Fulton
Zope Corporation

