On Mon, Nov 9, 2009 at 12:11 PM, Christian Theune <c...@gocept.com> wrote:
> Hi,
>
> On 11/09/2009 05:01 PM, Jim Fulton wrote:
>> On Mon, Nov 9, 2009 at 9:25 AM, Christian Theune <c...@gocept.com> wrote:
>> ...
>>> Reading the code talking to tpc_transaction I found that this seems to
>>> be merely an optimization (which I can disable by just letting
>>> tpc_transaction return None all the time).
>>
>> No, it is used to decide if the underlying storage is committing.
>
> Right. It's the non-blocking version of doing tpc_begin then.

No. It's what I said it is. :)

From the client's perspective, the vote call will block. tpc_begin never
blocks for a client because we don't enter two-phase commit until we have
all of the data.

>
>>> Why is the waiting list necessary?
>>
>> To avoid blocking the server waiting for an underlying storage's commit lock.
>
> Hmm. Ah - blocking the server would result in load calls from other
> clients to be blocked although they could be served at that point in time.

In the current implementation, it would result in all calls blocking.

>
>>> And why does it work alright in a ZEO
>>> fan-out scenario?
>>
>> Why wouldn't it?
>
> I'll try to explain what I see:
>
> Assume three ZEO servers ZEO, ZEO1, and ZEO2. ZEO1 and ZEO2 are clients
> for ZEO. Also, assume three Zope servers Z1a, Z1b and Z2. Z1a/b talk to
> ZEO1 and Z2 talks to ZEO2.
>
> The interaction that I see is this:
>
> - Z1a calls ClientStorage.tpc_begin() which locally causes
>   tpc_transaction() to start returning a non-None value
>   blocking other tpc_begin calls from this Zope server from now on. The
>   StorageServer also has a safe-guard against this.

I don't know what you're talking about. tpc_transaction is only used on
the server. ClientStorage only commits one transaction at a time (on that
client). This has nothing to do with the dance on the server that *does*
use tpc_transaction. (In some future implementation, we can and should
allow multiple transactions from the same client at the same time.)

> - ClientStorage then causes ZEOStorage.tpc_begin() on ZEO1 to be called
>   which prepares the ZEOStorage to prepare the commit log. Nothing is
>   seen on the storage behind yet.

Yup.

> - Z1a calls ClientStorage.store() and pushes data into the commit log.
>   Those initial steps can happen from multiple ZEO clients in parallel,
>   but only once per client.

Yes.
>
> - Z1a calls ClientStorage.vote() which causes ZEO1's ZEOStorage.vote()
>   to be called which in turn calls _wait() which again calls _restart()
>   finally causing the underlying storage's tpc_begin() to be called and
>   replaying the commit log of ZEO1 into the upstream ZEO until the
>   commit log is done and calls the upstream vote() which causes
>   tpc_begin() on the final storage to be called.
>
> At this point, ZEO2 doesn't know about the ongoing transaction in the
> upstream ZEO, but ZEO1 does.

I don't know what you mean.

>
> Z1a will not be able to issue another commit, those are blocked locally
> by ClientStorage but pure reading transactions will go through.
>
> Z1b will be fine because ZEO1 knows about the ongoing commit and puts
> Z1b into the waiting list when trying to vote, allowing other reads from
> that connection to go through.
>
> However, when Z2 tries to commit, it starts filling the commit log on
> ZEO2. ZEO2 doesn't know about the ongoing commit on the upstream ZEO and
> will allow the vote phase to go upstream. However, now the commit from
> Z2 gets stuck because it is put in the waiting list on the upstream ZEO
> while ZEO2 thinks it was able to proceed.

Right, the vote call from ZEO2 blocks until the server is able to
complete the in-process transaction.

> This will cause Z2 to completely become stuck and not benefit from the
> waiting list on ZEO2.
>
> Sorry for the bloated example, I think it's the smallest way to explain it.
>
> Am I misunderstanding how the waiting list works?

No, but exactly the same thing would happen without fan out.

Consider a simpler example. There are 2 ZEO clients, C1 and C2, of a
server, S. C1 calls tpc_vote on S (after calling tpc_begin and making
some store calls). Its vote call proceeds because there were no other
transactions committing on S. Now C2 calls tpc_vote. It will block
until C1 calls tpc_finish or tpc_abort. It is just as blocked as Z2 is in
your fan-out example above.
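
To make the simpler example concrete, here is a rough toy model of it in
Python. It is only a sketch under stated assumptions: ToyServer and its
vote, finish, and load methods are hypothetical names invented for this
illustration and are not the ZEO API. It only mimics the idea that one
commit holds the lock while later votes are parked on a waiting list and
reads keep being served.

```python
from collections import deque


class ToyServer:
    """Hypothetical stand-in for server S; not the real ZEO classes."""

    def __init__(self):
        self.committing = None   # client whose commit currently holds the lock
        self.waiting = deque()   # clients whose vote has been deferred
        self.log = []

    def vote(self, client):
        # Roughly what asking "is the storage committing?" decides.
        if self.committing is None:
            self.committing = client
            self.log.append(f"{client}: vote proceeds")
            return True
        # Park the vote instead of blocking every call on the server.
        self.waiting.append(client)
        self.log.append(f"{client}: vote waits behind {self.committing}")
        return False

    def finish(self, client):
        # Stands in for either tpc_finish or tpc_abort: both release the lock.
        if client != self.committing:
            raise RuntimeError(f"{client} does not hold the commit lock")
        self.log.append(f"{client}: done")
        self.committing = None
        if self.waiting:
            # Restart the next deferred vote now that the lock is free.
            self.vote(self.waiting.popleft())

    def load(self, client):
        # Reads do not need the commit lock in this toy model.
        return f"{client}: load served"


if __name__ == "__main__":
    s = ToyServer()
    s.vote("C1")      # proceeds, nothing else is committing
    s.vote("C2")      # deferred until C1 finishes or aborts
    print(s.load("C2"))
    s.finish("C1")    # C2's vote is restarted here
    print("\n".join(s.log))
```

With a fan-out setup, C2 would simply be a downstream server rather than a
Zope client; in this model its vote waits in the same way.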

Jim

--
Jim Fulton