On 25/09/12 19:33, Daniel Kinzler wrote:
> So, can someone shed light on what DBO_TRX is intended to do, and how it is
> supposed to work?

Maybe you should have asked that before you broke it with I8c0426e1.

DBO_TRX provides the following benefits:

* It provides improved consistency of write operations for code which
is not transaction-aware, for example rollback-on-error.

* It provides a snapshot for consistent reads, which improves
application correctness when concurrent writes are occurring.
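
To make the rollback-on-error point concrete, here is a rough sketch in
Python with SQLite standing in for InnoDB. The handle_request() wrapper,
the handler and the "page" table are hypothetical names made up for
illustration; this is not MediaWiki code, only the general shape of
wrapping a request that is not transaction-aware in one transaction.

```python
import sqlite3

def handle_request(conn, handler):
    """Run handler inside one transaction; roll back if it raises."""
    conn.execute("BEGIN")
    try:
        handler(conn)
    except Exception:
        conn.execute("ROLLBACK")  # the partial write is discarded
        return False
    conn.execute("COMMIT")
    return True

def failing_handler(conn):
    # Hypothetical handler that knows nothing about transactions.
    conn.execute("INSERT INTO page VALUES ('first write')")
    raise RuntimeError("simulated failure after the first write")

# isolation_level=None: the sqlite3 module issues no implicit BEGIN,
# so the explicit statements above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE page (title TEXT)")
ok = handle_request(conn, failing_handler)
count = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
print(ok, count)
```

The handler fails after its first write, and the table ends up empty
because the whole request was rolled back.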

DBO_TRX was introduced when we switched over to InnoDB, along with the
introduction of Database::begin() and Database::commit().

begin() and commit() were never meant to be "matched", so it's not
surprising that you would get a lot of warnings if you started trying
to enforce that.

Initially, I set up a scheme where transactions were "nested", in the
sense that begin() incremented the transaction level and commit()
decremented it. When it was decremented to zero, an actual COMMIT was
issued. So you would have a call sequence like:

* begin() -- sends BEGIN
  * begin()  -- does nothing
  * commit() -- does nothing
* commit() -- sends COMMIT
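
The counting behaviour can be sketched roughly as follows. This is an
illustrative Python model, not the original PHP; the class name and the
"sent" list are hypothetical, and ignoring an unmatched commit() is an
assumption made only to keep the sketch small.

```python
class NestedTrxSketch:
    """Hypothetical model of the nesting scheme, not MediaWiki code."""

    def __init__(self):
        self.level = 0  # current nesting depth
        self.sent = []  # statements that would reach the server

    def begin(self):
        if self.level == 0:
            self.sent.append("BEGIN")
        self.level += 1

    def commit(self):
        if self.level == 0:
            return  # assumption: an unmatched commit() is ignored
        self.level -= 1
        if self.level == 0:
            self.sent.append("COMMIT")

db = NestedTrxSketch()
db.begin()   # sends BEGIN
db.begin()   # does nothing
db.commit()  # does nothing
db.commit()  # sends COMMIT
print(db.sent)
```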

This scheme soon proved to be inappropriate, since it turned out that
the most important thing for performance and correctness is for an
application to be able to commit the current transaction after some
particular query has completed. Database::immediateCommit() was
introduced to support this use case -- its function was to immediately
reduce the transaction level to zero and commit the underlying
transaction.
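
As a rough illustration of how immediateCommit() differed from a nested
commit(), here is a small Python model. The names are hypothetical and
the real implementation was PHP; the sketch only shows the level being
forced to zero.

```python
class NestedTrxWithImmediateCommit:
    """Hypothetical model, not MediaWiki code."""

    def __init__(self):
        self.level = 0  # current nesting depth
        self.sent = []  # statements that would reach the server

    def begin(self):
        if self.level == 0:
            self.sent.append("BEGIN")
        self.level += 1

    def commit(self):
        if self.level > 0:
            self.level -= 1
            if self.level == 0:
                self.sent.append("COMMIT")

    def immediate_commit(self):
        # Drop straight to level zero and commit, whatever the depth.
        if self.level > 0:
            self.level = 0
            self.sent.append("COMMIT")

db = NestedTrxWithImmediateCommit()
db.begin()             # sends BEGIN
db.begin()             # nested, does nothing
db.immediate_commit()  # sends COMMIT despite the nesting
db.commit()            # outer commit now has nothing left to do
print(db.sent)
```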

When it became obvious that every Database::commit() call should
really be Database::immediateCommit(), I changed the semantics,
effectively renaming Database::immediateCommit() to
Database::commit(). I removed the idea of nested transactions in
favour of a model of cooperative transaction length management:

* Database::begin() became effectively a no-op for web requests and
was sometimes omitted for brevity.
* Database::commit() should be called after completion of a sequence
of write operations where atomicity is desired, or at the earliest
opportunity when contended locks are held.

In cases where transactions end up being too short due to the need for
a called function to commit a transaction when the caller already has
a transaction open, it is the responsibility of the callers to
introduce some suitable abstraction for serializing the transactions.
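
A rough Python model of the revised semantics, showing how a commit in
a called function shortens the caller's transaction. The class, the
callee() helper and the placeholder queries are all hypothetical, and
the sketch assumes that the transaction is opened implicitly on the
first query, which is why begin() has nothing to do.

```python
class FlatTrxSketch:
    """Hypothetical model of the non-nested scheme, not MediaWiki code."""

    def __init__(self):
        self.trx_open = False
        self.sent = []  # statements that would reach the server

    def query(self, sql):
        # Assumption: a transaction is opened on the first query.
        if not self.trx_open:
            self.sent.append("BEGIN")
            self.trx_open = True
        self.sent.append(sql)

    def begin(self):
        pass  # effectively a no-op in this model

    def commit(self):
        # Always commits the underlying transaction; no nesting level.
        if self.trx_open:
            self.sent.append("COMMIT")
            self.trx_open = False

def callee(db):
    db.query("UPDATE b")
    db.commit()  # callee commits at the earliest opportunity

db = FlatTrxSketch()
db.begin()
db.query("UPDATE a")
callee(db)            # also commits the caller's "UPDATE a"
db.query("UPDATE c")  # lands in a new transaction
db.commit()
print(db.sent)
```

The caller's writes end up split across two transactions, which is the
"too short" case described above.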

When transactions are too long, you hit performance problems due to lock
contention. When transactions are too short, you hit consistency
problems when requests fail. The scheme I introduced favours
performance over consistency. It resolves conflicts between callers
and callees by using the shortest transaction time. I think this was an
appropriate choice for Wikipedia, both then and now, and I think it is
probably appropriate for many other medium to high traffic wikis.

Savepoints were not available at the time the scheme was introduced.
But they are a refinement of the abandoned transaction nesting scheme,
not a refinement of the current scheme which is optimised for reducing
lock contention.
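
For readers unfamiliar with savepoints, here is a minimal sketch of how
they give nesting inside a single transaction. It uses Python with
SQLite as a stand-in for MySQL/InnoDB, and the table and savepoint
names are made up for illustration.

```python
import sqlite3

# isolation_level=None: the sqlite3 module issues no implicit BEGIN,
# so the transaction is controlled by the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES ('outer')")

conn.execute("SAVEPOINT inner_work")               # inner "begin"
conn.execute("INSERT INTO t VALUES ('inner')")
conn.execute("ROLLBACK TO SAVEPOINT inner_work")   # undo inner part only
conn.execute("RELEASE SAVEPOINT inner_work")

conn.execute("COMMIT")                             # outer work survives

rows = [r[0] for r in conn.execute("SELECT v FROM t")]
print(rows)
```

Note that the outer transaction, and any locks it holds, stays open
until the final COMMIT, so nothing here shortens lock hold times.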

In terms of performance, perhaps it would be feasible to use short
transactions with an explicit begin() with savepoints for nesting. But
then you would lose the consistency benefits of DBO_TRX that I
mentioned at the start of this post.

-- Tim Starling



_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
