We tackled this problem in CPS a couple of months ago by taking
advantage of the ZODB before-commit hooks. The idea is to define an
Indexation Manager, registered as a before-commit hook, that filters
and stores all the indexation calls on CMF objects and then waits for
the end of the transaction (actually just *before* the end of the
transaction) to do the actual indexation. That way, we get a single
indexation per object at commit time, whatever happened to that object
during the transaction. The reindexObject() and reindexObjectSecurity()
calls are redirected to the indexation manager, which queues each call
along with its parameters.
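
To make the idea concrete, here is a minimal sketch of such a manager. It is
not the CPS code: the class names, the _reindexObject() and
_reindexObjectSecurity() helpers and the id()-based key are all assumptions
made for illustration (a real implementation would more likely key on the
object's physical path). In current releases of the transaction package,
flush() could be registered with
transaction.get().addBeforeCommitHook(manager.flush).

```python
class IndexationManager:
    """Queue reindex requests per object and replay them once at flush time."""

    def __init__(self):
        # id(obj) -> pending request; dicts keep insertion order (Python 3.7+).
        self._queue = {}

    def push(self, obj, idxs=(), security=False):
        """Record a request instead of touching the catalog right away."""
        entry = self._queue.setdefault(
            id(obj), {"obj": obj, "full": False, "idxs": set(), "security": False}
        )
        if security:
            entry["security"] = True
        elif idxs:
            entry["idxs"].update(idxs)
        else:
            # No index names given: a full reindex was requested.
            entry["full"] = True

    def flush(self):
        """Meant to run as the before-commit hook: one pass per queued object."""
        pending, self._queue = self._queue, {}
        for entry in pending.values():
            obj = entry["obj"]
            if entry["full"]:
                # A full reindex already covers any partial request.
                obj._reindexObject()
            elif entry["idxs"]:
                obj._reindexObject(idxs=sorted(entry["idxs"]))
            if entry["security"]:
                obj._reindexObjectSecurity()


class Document:
    """Hypothetical content object whose public calls are redirected."""

    def __init__(self, manager):
        self._manager = manager
        self.calls = []  # stands in for real catalog work

    def reindexObject(self, idxs=()):
        self._manager.push(self, idxs=idxs)

    def reindexObjectSecurity(self):
        self._manager.push(self, security=True)

    def _reindexObject(self, idxs=()):
        self.calls.append(("reindex", tuple(idxs)))

    def _reindexObjectSecurity(self):
        self.calls.append(("security",))
```

With this sketch, calling reindexObject(), reindexObject(['review_state']),
reindexObjectSecurity() and reindexObject(['portal_type', 'Type']) on the same
Document leaves its calls list empty until flush() runs, at which point one
full reindex and one security reindex are performed.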

You can check the code here:

We needed to extend the ZODB API to deal with subscriber ordering. Note
that this led to an endless discussion on the ZODB list... Anyway,
you'll find it here:

We are using the same idea for the tree cache updates (because we don't
store navigation trees within the catalog <wink>).
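
Once several managers hang off the same commit, the order in which they run
presumably starts to matter, which is where the subscriber ordering mentioned
above comes in. A rough sketch of ordered hooks follows; the OrderedHooks
class and the order values are made up for illustration and are not the ZODB
API or the ordering CPS actually uses.

```python
import itertools


class OrderedHooks:
    """Run registered callables by ascending order, ties in registration order."""

    def __init__(self):
        self._hooks = []
        self._counter = itertools.count()

    def register(self, hook, order=0):
        # The counter breaks ties so the callables themselves are never compared.
        self._hooks.append((order, next(self._counter), hook))

    def run(self):
        hooks, self._hooks = sorted(self._hooks, key=lambda h: h[:2]), []
        for _order, _seq, hook in hooks:
            hook()
```

Registering an indexing hook with order=-10, a tree cache hook with order=0
and an event hook with order=10 runs them in that sequence, whatever order
they were registered in.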

Event notifications are deferred the same way:

Feel free to ask questions on the cps-devel list if you have any.

Enjoy!


Alec Mitchell wrote:
> So, Sidnei has been plugging away at the "AT reindexes things an obscene 
> number of times" issue today, and appears to have fixed many of the AT 
> triggered indexing redundancies.  There are however still a few places in 
> CMF where some cataloging redundancy might be avoided.  One obvious place is 
> during object creation, where the following happens:
> *) TypesTool.constructInstance() is triggered
>     **) A _setObject call results in CMFCatalogAware.manage_afterAdd() which 
> triggers a full indexObject().
>     *) This is shortly followed by TypesTool._finishConstruction()
>         *) Which calls CMFCatalogAware.notifyWorkflowCreated()
>             *) Which in turn calls WorkflowTool._reindexWorkflowVariables()
>                 **) Which does a CMFCatalogAware.reindexObject([idxs]) on 
> workflow specific variables (with a full metadata update)
>                 *) And calls CMFCatalogAware.reindexObjectSecurity() which 
> reindexes the object only on the security index, and doesn't touch metadata.
>         **) TypesTool._finishConstruction() then does another 
> CMFCatalogAware.reindexObject().
> So we have two full reindexes, and three metadata updates.  The last reindex 
> appears to be there only to catch the change to 'portal_type' in 
> _finishConstruction.  So, this final reindexObject might safely be changed 
> to reindexObject(['portal_type', 'Type']), though the possibility exists 
> that other indexed attributes added by 3rd parties may depend on the value 
> of portal_type (say, I use an autogenerated Title which includes the Type).  
> Additionally, almost immediately before this last reindexObject call, 
> another reindexObject call has happened in notifyWorkflowCreated, which 
> included a full catalog metadata update.  As a result, updating the catalog 
> metadata here is certainly redundant.  Unfortunately, the 
> CMFCatalogAware.reindexObject method provides no means of avoiding the 
> duplicate metadata update, though it would be trivial to add and to use 
> here.
> Another option, suggested by Sidnei on IRC, which would avoid the potential 
> issues with limiting the variables indexed in the final reindex, would be 
> to let CMFCatalogAware.manage_afterAdd know (presumably via some state 
> variable) that it is being invoked through constructInstance/invokeFactory, 
> in which case it could safely skip the initial indexing and allow 
> _finishConstruction to take care of indexing the object fully on its own at 
> the end.  In the long term we will probably be better served by delaying all 
> indexing to transaction boundaries, though it will be a fair bit harder to 
> implement, and may irk some developers who depend on immediate changes to 
> the catalog on reindex.
> Alec
