I've been debugging session problems for two days, and I feel it's time to write down what I've observed and ask for other eyes to look at it (Chris McDonough has been working on this too). This is all on Zope 2.9 trunk, BTW (ZODB 3.6.0b5 and Zope 2.9's tempstorage), with Python 2.4.2.

What I observed was an unnatural number of repeated ConflictErrors (by that, I mean "write" conflicts), followed by more and more ReadConflictErrors as soon as you go beyond the CONFLICT_CACHE_MAXAGE time of TemporaryStorage.

To simplify debugging, I've boosted that constant and I only debug the write conflict errors.

The first write conflict happens when a BTree can't resolve a conflict. The transaction is then aborted.

At this point, the same thing should happen that happens correctly for FileStorage: the connection's _flush_invalidations should get called, and it should reset the _txn_time of the connection to None so that the modified oids (including the BTree's), when invalidated, reset the _txn_time to their serial. Then on the next conflict, _setstate_noncurrent calls loadBefore with that serial.

But apparently the _flush_invalidations() of the connection is never called. So _txn_time is never bumped into the future, which in turn means the next write conflict will try to load exactly the same serials as before and fail again, etc.

This seems to happen because:

1. the connection has _synch set to True: it has registered itself as a synchronizer, and expects its afterCompletion to be called when (among other things) the transaction is aborted, and afterCompletion is what calls _flush_invalidations,

2. the synchronizer (the connection itself) has been lost from the transaction's _synchronizers WeakSet for some reason (garbage collected, I guess). It was there in earlier transactions, but it's not there at the time it's needed.
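
To illustrate the failure mode I suspect in the two points above, here is a minimal sketch. It is not ZODB code: FakeConnection and FakeTransaction are hypothetical stand-ins, and the standard library's weakref.WeakSet (available in newer Pythons) is used in place of ZODB's own WeakSet. It only shows that a synchronizer held by nothing but a weak set silently stops being notified on abort.

```python
import gc
import weakref


class FakeConnection(object):
    """Hypothetical stand-in for a connection registered as a synchronizer."""

    def __init__(self):
        self.flushed = False

    def afterCompletion(self, txn):
        # The real connection would call _flush_invalidations() here.
        self.flushed = True


class FakeTransaction(object):
    """Hypothetical stand-in for a transaction with weakly held synchronizers."""

    def __init__(self, synchronizers):
        self._synchronizers = synchronizers

    def abort(self):
        # Notify every synchronizer that is still alive; return how many.
        notified = 0
        for synch in list(self._synchronizers):
            synch.afterCompletion(self)
            notified += 1
        return notified


synchronizers = weakref.WeakSet()
conn = FakeConnection()
synchronizers.add(conn)

# While a strong reference exists, the abort reaches the connection.
notified_while_alive = FakeTransaction(synchronizers).abort()
flushed_while_alive = conn.flushed

# Drop the only strong reference: the weak set forgets the connection,
# so the next abort notifies nobody and nothing gets flushed.
del conn
gc.collect()
notified_after_gc = FakeTransaction(synchronizers).abort()
```

In this sketch the second abort raises no error at all; the callback is simply skipped, which would match a _flush_invalidations that is never called.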

If someone can make sense of this...

Actually, I don't know why the connection (= synchronizer) could be gone from the transaction's _synchronizers WeakSet but still be in the DB's connection pool WeakSet. I guess this is where the problem lies.

Also, I don't know why we don't observe this for FileStorage. Maybe something holds a hard reference to it somewhere?

Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]

