-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Roché Compaan wrote:
> On Fri, 2008-08-22 at 16:37 -0300, Sidnei da Silva wrote:
>> On Fri, Aug 22, 2008 at 9:49 AM, Roché Compaan
>> <[EMAIL PROTECTED]> wrote:
>>> Transaction detail for txn #00099 (first document):
>>>
>>> Txn id,Classname,Object count,Size (bytes)
>>> #00099,BTrees._IIBTree.IIBTree,3,286
>>> #00099,OFS.Folder.Folder,1,55
>>> #00099,BTrees._IOBTree.IOBucket,9,4572
>>> #00099,BTrees._OIBTree.OIBucket,5,2964
>>> #00099,BTrees._IOBTree.IOBTree,39,17552
>>> #00099,BTrees.Length.Length,27,768
>>> #00099,Persistence.mapping.PersistentMapping,2,846
>>> #00099,Products.ATContentTypes.content.document.ATDocument,1,1544
>>> #00099,BTrees._OOBTree.OOBTree,20,3986
>>> #00099,BTrees._IIBTree.IISet,3,184
>>> #00099,BTrees._OIBTree.OIBTree,9,1404
>>> #00099,Products.Archetypes.BaseUnit.BaseUnit,3,767
>>> #00099,BTrees._OOBTree.OOBucket,2,3286
>>> #00099,BTrees._IIBTree.IITreeSet,55,3905
>>>
>>> Transaction detail for txn #10099 (last document):
>>>
>>> Txn id,Classname,Object count,Size (bytes)
>>> #10099,BTrees._IIBTree.IIBTree,8,2517
>>> #10099,OFS.Folder.Folder,1,55
>>> #10099,BTrees._IOBTree.IOBucket,57,81564
>>> #10099,BTrees._OIBTree.OIBucket,13,9872
>>> #10099,BTrees._IIBTree.IIBucket,29,20024
>>> #10099,BTrees._IOBTree.IOBTree,1,85
>>> #10099,Persistence.mapping.PersistentMapping,2,846
>>> #10099,BTrees.Length.Length,22,655
>>> #10099,Products.ATContentTypes.content.document.ATDocument,1,1544
>>> #10099,BTrees._OOBTree.OOBTree,6,30455
>>> #10099,BTrees._IIBTree.IISet,65,182708
>>> #10099,Products.Archetypes.BaseUnit.BaseUnit,3,767
>>> #10099,BTrees._OOBTree.OOBucket,16,8088
>>> #10099,BTrees._IIBTree.IITreeSet,2,122
>> It's pretty clear that the difference here is the IISet(65 vs 3) and
>> the IOBucket(57 vs 9). The rest looks pretty much stable. Now, if I
>> understand correctly that means the last document caused 57 IOBuckets
>> to be modified, but not necessarily created.
>
> Right. But even looking at the very first transaction the indexing
> overhead is visible: 3 Kbytes of data related to the document (ATDoc,
> BaseUnit, PersistentMapping) is only a fraction of the total transaction
> size of 40 Kbytes.
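
As a rough check of the figures quoted above, the per-class sizes can be summed with a short script. This is only a sketch: the rows are copied from the two quoted listings (txn id column dropped), the helper name summarize is made up for illustration, and DOCUMENT_CLASSES assumes that "ATDoc, BaseUnit, PersistentMapping" refers to the three full class names shown in the listing.

```python
import csv
import io

# Rows copied from the quoted listings: classname, object count, size (bytes).
FIRST_TXN = """\
BTrees._IIBTree.IIBTree,3,286
OFS.Folder.Folder,1,55
BTrees._IOBTree.IOBucket,9,4572
BTrees._OIBTree.OIBucket,5,2964
BTrees._IOBTree.IOBTree,39,17552
BTrees.Length.Length,27,768
Persistence.mapping.PersistentMapping,2,846
Products.ATContentTypes.content.document.ATDocument,1,1544
BTrees._OOBTree.OOBTree,20,3986
BTrees._IIBTree.IISet,3,184
BTrees._OIBTree.OIBTree,9,1404
Products.Archetypes.BaseUnit.BaseUnit,3,767
BTrees._OOBTree.OOBucket,2,3286
BTrees._IIBTree.IITreeSet,55,3905
"""

LAST_TXN = """\
BTrees._IIBTree.IIBTree,8,2517
OFS.Folder.Folder,1,55
BTrees._IOBTree.IOBucket,57,81564
BTrees._OIBTree.OIBucket,13,9872
BTrees._IIBTree.IIBucket,29,20024
BTrees._IOBTree.IOBTree,1,85
Persistence.mapping.PersistentMapping,2,846
BTrees.Length.Length,22,655
Products.ATContentTypes.content.document.ATDocument,1,1544
BTrees._OOBTree.OOBTree,6,30455
BTrees._IIBTree.IISet,65,182708
Products.Archetypes.BaseUnit.BaseUnit,3,767
BTrees._OOBTree.OOBucket,16,8088
BTrees._IIBTree.IITreeSet,2,122
"""

# Assumption: these are the classes meant by "ATDoc, BaseUnit, PersistentMapping".
DOCUMENT_CLASSES = {
    "Products.ATContentTypes.content.document.ATDocument",
    "Products.Archetypes.BaseUnit.BaseUnit",
    "Persistence.mapping.PersistentMapping",
}


def summarize(rows_text):
    """Return (total bytes, bytes in document-related classes) for one txn."""
    total = 0
    document = 0
    for row in csv.reader(io.StringIO(rows_text)):
        if not row:
            continue  # tolerate blank lines
        classname, _count, size = row
        total += int(size)
        if classname in DOCUMENT_CLASSES:
            document += int(size)
    return total, document


for label, text in (("first", FIRST_TXN), ("last", LAST_TXN)):
    total, document = summarize(text)
    print(f"{label}: total={total} document={document} "
          f"ratio={total / document:.1f}")
```

For the first transaction this gives 42119 bytes in total against 3157 bytes of document data, in line with the "3 Kbytes" of "40 Kbytes" quoted above (a ratio of roughly 13). For the last transaction the document data is unchanged at 3157 bytes while the total grows to 339302 bytes.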

I recall a pre-Zope (for me, 10 years ago) rule of thumb that text
indexing imposed an order of magnitude of overhead on the actual corpus,
with improvements possible only via batching or post-processing /
compression (incremental indexing is worst-case).


Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIryVR+gerLs4ltQ4RAv7CAKC68bT3zmp5P1xOpxCX+TpoVg/qJACcC1rv
5oQeHxjFc3iCkJz8o09awP0=
=wYKj
-----END PGP SIGNATURE-----