On Sat, 2008-08-23 at 14:09 +0200, Dieter Maurer wrote:
> Roché Compaan wrote at 2008-8-22 14:49 +0200:
> >I've been doing some benchmarks on Plone and got some surprising stats
> >on the pickle size of btrees and their buckets that are persisted with
> >each transaction. Surprising in the sense that they are very big in
> >relation to the actual data indexed. I would appreciate it if somebody
> >can help me understand what is going on, or just take a look to see if
> >the sizes look normal.
> >
> >In the benchmark I add and index 10000 ATDocuments. I commit after each
> >document to simulate a transaction per request environment. Each
> >document has a 100 byte long description and 100 bytes in its body. The
> >total transaction size however is 40K in the beginning. The transaction
> >sizes grow linearly to about 350K when reaching 10000 documents.
>
> The "Bucket" nodes usually store between 22 ("OOBucket") and 90 ("IIBucket")
> objects in a single bucket.
>
> With any change, the transaction will contain unmodified data
> for several dozen other objects.

Are you saying *all* 22 OOBuckets and 90 IIBuckets will be persisted
again whether they are modified or not?

>
> >What concerns me is that the footprint of indexed data in terms of
> >BTrees, Buckets and Sets is huge! The total amount of data committed
> >that relates directly to ATDocument is around 30 Mbyte. The total for
> >BTrees, Buckets and IISets is more than 2 Gbyte. Even taking into
> >account that Plone has a lot of catalog indexes and metadata columns (I
> >think 71 in total), this seems very high.
> >
> >This is a summary of total data committed per class:
> >
> >Classname,Object Count,Total Size (Kbytes)
> >BTrees._IIBTree.IISet,640686,1024506
>
> A typical "IISet" contains 90 value records and a persistent reference.
>
> I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
> should be expected as typical size of an "IISet".
> Your "IISet" instances seem to be about 1.5 kB large.
>
> That is significantly larger than I would expect but maybe not
> yet something to worry about.

It looks like there is something to be worried about since there are
quite a few IISet instances that are larger than 0.5 kB. Some are as
large as 50K! Here are some lines from fsdump:

  data #00033 oid=0000000000001d65 size=50058 class=BTrees._IIBTree.IISet
  data #00034 oid=0000000000001d66 size=50058 class=BTrees._IIBTree.IISet
  data #00111 oid=0000000000001e0b size=50023 class=BTrees._IIBTree.IISet
  data #00033 oid=0000000000001d65 size=50063 class=BTrees._IIBTree.IISet
  data #00034 oid=0000000000001d66 size=50063 class=BTrees._IIBTree.IISet
  data #00109 oid=0000000000001e0b size=50028 class=BTrees._IIBTree.IISet
  data #00035 oid=0000000000001d65 size=50068 class=BTrees._IIBTree.IISet

> >BTrees._IIBTree.IIBucket,252121,163524
>
> The same size reasoning applies to "IIBucket"s: 90 records, but
> now consisting of key and value (about 10 bytes).
>
> Your "IIBuckets" are smaller than one would expect.

But that is supposedly ok?
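
For reference, the per-object averages discussed above can be recomputed from the two summary lines with a few lines of Python. This is only a rough sketch: the counts and totals are copied from the summary, the "expected" sizes are Dieter's back-of-the-envelope estimates (90 records, about 5 bytes per pickled integer), and the names `committed`, `expected_bytes` and `average_kbytes` are made up for this illustration. It also glosses over whether "Kbytes" means 1000 or 1024 bytes.

```python
# Figures copied from the per-class summary quoted above:
# class name -> (object count, total size in Kbytes)
committed = {
    "BTrees._IIBTree.IISet": (640686, 1024506),
    "BTrees._IIBTree.IIBucket": (252121, 163524),
}

# Rough estimates from the thread (assumptions, not measurements):
# 90 records per object, about 5 bytes per pickled integer.
# An IISet stores values only; an IIBucket stores key plus value.
expected_bytes = {
    "BTrees._IIBTree.IISet": 90 * 5,
    "BTrees._IIBTree.IIBucket": 90 * 10,
}


def average_kbytes(count, total_kbytes):
    """Average committed size per object, in Kbytes."""
    return total_kbytes / count


for name, (count, total) in committed.items():
    avg = average_kbytes(count, total)
    est = expected_bytes[name] / 1000  # treat 1 kB as 1000 bytes here
    print(f"{name}: average {avg:.2f} kB, estimate {est:.2f} kB")
```

With these inputs the averages come out at roughly 1.6 kB per IISet and 0.65 kB per IIBucket, against estimates of about 0.45 kB and 0.9 kB. An average says nothing about the 50K outliers in the fsdump lines, of course.
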

I am curious to know if you can explain why the proportion of actual to
total transaction size is so small?

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev