On Mon, Nov 9, 2009 at 12:11 PM, Christian Theune <c...@gocept.com> wrote:
> On 11/09/2009 05:01 PM, Jim Fulton wrote:
>> On Mon, Nov 9, 2009 at 9:25 AM, Christian Theune <c...@gocept.com> wrote:
>>> Reading the code talking to tpc_transaction I found that this seems to
>>> be merely an optimization (which I can disable by just letting
>>> tpc_transaction return None all the time).
>> No, it is used to decide if the underlying storage is committing.
> Right. It's the non-blocking version of doing tpc_begin then.
No. It's what I said it is. :)
From the client's perspective, the vote call will block. tpc_begin
never blocks for a client because we don't enter two-phase commit
until we have all of the data.
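A toy sketch of that point, not ZEO's actual code: ToyStorage and
ToyConnection are made-up names, and the only thing borrowed from the real
server is the idea that tpc_transaction() returns None when nothing is
committing. tpc_begin and store touch only per-client state; the storage is
first needed at vote time.

```python
class ToyStorage:
    """Hypothetical stand-in for the underlying storage."""

    def __init__(self):
        self.committing = None  # id of the transaction in two-phase commit
        self.pending = {}
        self.data = {}

    def tpc_transaction(self):
        # None means no commit is in progress.
        return self.committing


class ToyConnection:
    """Hypothetical per-client state kept on the server."""

    def __init__(self, storage):
        self.storage = storage
        self.txn = None
        self.log = []  # the commit log: buffered store calls

    def tpc_begin(self, txn):
        # Touches only per-client state, so it never has to wait.
        self.txn = txn
        self.log = []

    def store(self, oid, value):
        self.log.append((oid, value))

    def vote(self):
        # Only here do we need the storage to be free.
        if self.storage.tpc_transaction() is not None:
            return False  # a real server would make the client wait
        self.storage.committing = self.txn
        self.storage.pending = dict(self.log)
        return True

    def finish(self):
        self.storage.data.update(self.storage.pending)
        self.storage.pending = {}
        self.storage.committing = None
```

Two connections can both call tpc_begin and store while the storage stays
idle; only the first vote makes tpc_transaction() return something.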
>>> Why is the waiting list necessary?
>> To avoid blocking the server waiting for an underlying storage's commit lock.
> Hmm. Ah - blocking the server would result in load calls from other
> clients to be blocked although they could be served at that point in time.
In the current implementation, it would result in all calls blocking.
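Roughly, the waiting list works like the sketch below. This is a simplified
model under my own assumptions (a single-threaded server, made-up names
ToyServer, vote, finish), not the real StorageServer code. A vote that
cannot proceed is parked rather than blocking, so the server stays free to
answer other calls such as load.

```python
from collections import deque


class ToyServer:
    """Hypothetical single-threaded server with a waiting list."""

    def __init__(self):
        self.data = {"x": 0}
        self.committing = None  # client currently in two-phase commit
        self.waiting = deque()  # clients whose vote has been deferred

    def load(self, oid):
        # Reads never depend on the commit state.
        return self.data[oid]

    def vote(self, client):
        if self.committing is None:
            self.committing = client
            return "voted"
        # Park the request instead of blocking the whole server.
        self.waiting.append(client)
        return "waiting"

    def finish(self, client):
        assert self.committing == client
        self.committing = None
        if self.waiting:
            # Restart the oldest deferred vote.
            self.committing = self.waiting.popleft()
        return self.committing
```

The parked client still sees its own vote call as blocked; what the waiting
list buys is that everyone else's calls keep being served in the meantime.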
>>> And why does it work alright in a ZEO
>>> fan-out scenario?
>> Why wouldn't it?
> I'll try to explain what I see:
> Assume three ZEO servers ZEO, ZEO1, and ZEO2. ZEO1 and ZEO2 are clients
> for ZEO. Also, assume three Zope servers Z1a, Z1b and Z2. Z1a/b talk to
> ZEO1 and Z2 talks to ZEO2.
> The interaction that I see is this:
> - Z1a calls ClientStorage.tpc_begin() which locally causes
> tpc_transaction() to start returning a non-None value
> blocking other tpc_begin calls from this Zope server from now on. The
> StorageServer also has a safe-guard against this.
I don't know what you're talking about. tpc_transaction is only used
on the server.
ClientStorage only commits one transaction at a time (on that client). This has
nothing to do with the dance on the server that *does* use tpc_transaction.
(In some future implementation, we can and should allow multiple transactions
from the same client at the same time.)
> - ClientStorage then causes ZEOStorage.tpc_begin() on ZEO1 to be called
> which prepares the ZEOStorage to prepare the commit log. Nothing is
> seen on the storage behind yet.
> - Z1a calls ClientStorage.store() and pushes data into the commit log.
> Those initial steps can happen from multiple ZEO clients in parallel,
> but only once per client.
> - Z1a calls ClientStorage.vote() which causes ZEO1's ZEOStorage.vote()
> to be called which in turn calls _wait() which again calls _restart()
> finally causing the underlying storage's tpc_begin() to be called and
> replaying the commit log of ZEO1 into the upstream ZEO until the
> commit log is done and calls the upstream vote() which causes
> tpc_begin() on the final storage to be called.
> At this point, ZEO2 doesn't know about the ongoing transaction in the
> upstream ZEO, but ZEO1 does.
I don't know what you mean.
> Z1a will not be able to issue another commit, those are blocked locally
> by ClientStorage but pure reading transactions will go through.
> Z1b will be fine because ZEO1 knows about the ongoing commit and puts
> Z1b into the waiting list when trying to vote, allowing other reads from
> that connection to go through.
> However, when Z2 tries to commit, it starts filling the commit log on
> ZEO2. ZEO2 doesn't know about the ongoing commit on the upstream ZEO and
> will allow the vote phase to go upstream. However, now the commit from
> Z2 gets stuck because it is put in the waiting list on the upstream ZEO
> while ZEO2 thinks it was able to proceed.
Right, the vote call from ZEO2 blocks until the server is able to complete
the in-progress transaction.
> This will cause Z2 to completely become stuck and not benefit from the
> waiting list on ZEO2.
> Sorry for the bloated example, I think it's the smallest way to explain it.
> Am I misunderstanding how the waiting list works?
No, but exactly the same thing would happen without fan out.
Consider a simpler example. There are 2 ZEO clients, C1 and C2, of a server, S.
C1 calls tpc_vote on S (after calling tpc_begin and making some store
calls). Its vote call proceeds because no other transaction was
committing on S.
Now C2 calls tpc_vote. It will block until C1 calls tpc_finish or
tpc_abort. It is just as blocked as Z2 is in your fan-out example above.
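Seen purely from the clients' side, the C1/C2 case can be simulated like
this. It is a sketch only: a plain threading.Lock stands in for the server's
commit handling, and ToyServer and run_scenario are names invented for the
illustration.

```python
import threading


class ToyServer:
    """Hypothetical server S, reduced to a single commit lock."""

    def __init__(self):
        self._commit_lock = threading.Lock()

    def tpc_vote(self):
        # Blocks the caller while another transaction is committing.
        self._commit_lock.acquire()

    def tpc_finish(self):
        self._commit_lock.release()


def run_scenario():
    server = ToyServer()
    events = []

    server.tpc_vote()  # C1 votes; nothing else is committing
    events.append("C1 voted")

    c2_voted = threading.Event()

    def c2():
        server.tpc_vote()  # C2 votes; waits for C1 to finish
        events.append("C2 voted")
        c2_voted.set()
        server.tpc_finish()

    t = threading.Thread(target=c2)
    t.start()

    # C2 cannot get past its vote while C1 holds the commit.
    blocked = not c2_voted.wait(timeout=0.2)

    events.append("C1 finished")
    server.tpc_finish()  # C1 finishes; C2's vote can now proceed
    t.join()
    return blocked, events
```

Replacing C2 with a ZEO server that is itself a client of S changes nothing
in this picture, which is the point of the comparison with fan-out.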
ZODB-Dev mailing list - ZODB-Dev@zope.org