This is how it happens -

  The leader will issue a COMMIT only when a quorum of servers have logged
the transaction to disk. Issuing the COMMIT just means that the servers can
now make the transaction visible to clients. The COMMIT message is never

On leader crash, another server with the highest zxid will be elected. Since
a quorum has logged the above transaction to disk, and any two quorums share
at least one server, the new leader will have that transaction on disk and
will let the other members of the quorum know of the transaction in case
they haven't logged it. This way the transaction is always remembered if a
client has seen that transaction go through the ZooKeeper service.
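
As a rough illustration of the reasoning above, here is a toy Python sketch.
It is not ZooKeeper's actual code: the names Server, propose, elect_and_sync
and demo are made up for this example, and the model assumes that election
simply picks the highest logged zxid among a surviving quorum and that the
new leader's whole log is then treated as committed.

```python
# Toy model of quorum logging and leader recovery (hypothetical names,
# not the real ZooKeeper implementation).

class Server:
    def __init__(self, name):
        self.name = name
        self.log = []        # zxids of proposals logged to disk, in order
        self.committed = 0   # highest zxid made visible to clients

    def last_zxid(self):
        return self.log[-1] if self.log else 0


def quorum_size(ensemble_size):
    # A strict majority of the ensemble.
    return ensemble_size // 2 + 1


def propose(leader, followers, zxid, reachable):
    """Log the proposal on the leader and on every reachable follower.

    Returns True if a quorum has logged it, i.e. the leader may issue COMMIT.
    """
    leader.log.append(zxid)
    acks = 1  # the leader's own disk write counts towards the quorum
    for follower in followers:
        if follower.name in reachable:
            follower.log.append(zxid)
            acks += 1
    return acks >= quorum_size(len(followers) + 1)


def elect_and_sync(survivors, ensemble_size):
    """Elect the survivor with the highest logged zxid and sync the others."""
    if len(survivors) < quorum_size(ensemble_size):
        raise RuntimeError("no quorum, cannot elect a leader")
    new_leader = max(survivors, key=Server.last_zxid)
    for server in survivors:
        server.log = list(new_leader.log)
        server.committed = new_leader.last_zxid()
    return new_leader


def demo():
    a, b, c, d, e = (Server(name) for name in "ABCDE")
    # Leader A proposes zxid 1; only B and C log it, which with A is a quorum.
    can_commit = propose(a, [b, c, d, e], zxid=1, reachable={"B", "C"})
    # A crashes before any COMMIT leaves the machine. C, D and E form a quorum.
    new_leader = elect_and_sync([c, d, e], ensemble_size=5)
    return can_commit, new_leader.name, [s.log for s in (c, d, e)]
```

In this sketch demo() returns (True, "C", [[1], [1], [1]]): the COMMIT never
reached anyone, but C had logged the transaction, so C wins the election and
D and E pick the transaction up during the sync.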

You can read more about this in one of the internals presentations at:


On 1/26/10 6:20 PM, "Qing Yan" <qing...@gmail.com> wrote:

> Hi,
> I have a question about how zookeeper *remembers* a commit operation.
> According to
> http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperInternals.html#sc_summary
> <quote>
> The leader will issue a COMMIT to all followers as soon as a quorum of
> followers have ACKed a message. Since messages are ACKed in order, COMMITs
> will be sent by the leader as received by the followers in order.
> COMMITs are processed in order. Followers deliver a proposals message when
> that proposal is committed.
> </quote>
> My question is: will the leader wait for the COMMIT to be processed by a
> quorum of followers before considering the COMMIT a success? From the
> documentation it seems that the leader handles COMMIT asynchronously and
> doesn't expect confirmation from followers. In the extreme case, what
> happens if the leader issues a COMMIT to all followers and crashes
> immediately, before the COMMIT message can go out over the network? How
> does the system remember that the COMMIT ever happened?
> Actually this is related to the leader election process:
> <quote>
> ZooKeeper messaging doesn't care about the exact method of electing a leader
> as long as the following holds:
>    - The leader has seen the highest zxid of all the followers.
>    - A quorum of servers have committed to following the leader.
>  Of these two requirements only the first, the highest zxid among the
> followers needs to hold for correct operation.
> </quote>
> Is there a liveness issue in trying to find "The leader has seen the highest
> zxid of all the followers"? What if some of the followers (which happen to
> hold the highest zxid) cannot be contacted (the FLP impossibility result?)
>  It would be more straightforward if COMMIT required confirmation from a
> quorum of the followers. But I guess things get optimized according to
> Zab's FIFO nature... I just want to hear some clarification about it.
> Thanks a lot!
