Hi Raghu,

After multiple consecutive crashes (or perhaps a network partition), we could keep electing a faulty server if we used only the zxid. The logical clock avoids this problem: a server considers changing its proposal only when it receives a notification from the same or a later epoch. With this mechanism, if an elected server crashes before exercising its role as leader, it won't be considered in later epochs. Without a logical clock, a server lagging behind in the election could reintroduce the faulty server into the election, and that server would be elected again if it is the one with the highest zxid.

Note that we are not using "logical clocks" in the sense of Lamport clocks. We are not incrementing upon every event, but instead only counting rounds of leader election.
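
To make the scenario concrete, here is a minimal sketch in Python. It is not the actual FastLeaderElection code: the names Peer, Notification, and proposal_after_stale_vote are made up for illustration, and quorum counting and the network are left out. It only models how one server reacts to a notification, with and without the epoch check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    leader: int  # server id of the proposed leader
    zxid: int    # last zxid seen by the proposed leader
    epoch: int   # sender's election round (the "logical clock")


class Peer:
    """Hypothetical, heavily simplified model of one electing server."""

    def __init__(self, sid, zxid, use_epoch=True):
        self.sid = sid
        self.zxid = zxid
        self.use_epoch = use_epoch
        self.epoch = 0
        self.proposal = (zxid, sid)  # (zxid, leader id)

    def start_round(self):
        # A new election round: bump the clock and propose ourselves again.
        self.epoch += 1
        self.proposal = (self.zxid, self.sid)

    def handle(self, n):
        if self.use_epoch:
            if n.epoch < self.epoch:
                return  # notification from an earlier round: ignore it
            if n.epoch > self.epoch:
                # We are the one lagging: catch up and restart from our own vote.
                self.epoch = n.epoch
                self.proposal = (self.zxid, self.sid)
        # Prefer the highest zxid, breaking ties by server id.
        if (n.zxid, n.leader) > self.proposal:
            self.proposal = (n.zxid, n.leader)


def proposal_after_stale_vote(use_epoch):
    peer = Peer(sid=1, zxid=7, use_epoch=use_epoch)
    peer.start_round()  # round 1: server 3 (zxid 9) is elected, then crashes
    peer.start_round()  # round 2: the election starts over
    # A lagging server still advertises its round-1 vote for server 3.
    peer.handle(Notification(leader=3, zxid=9, epoch=1))
    return peer.proposal[1]


print(proposal_after_stale_vote(use_epoch=True))   # 1
print(proposal_after_stale_vote(use_epoch=False))  # 3
```

With the epoch check, the stale vote for the crashed server 3 is dropped and server 1 keeps its own proposal. Without the check, server 3 wins the comparison on zxid and is brought back into the election.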


On Apr 13, 2009, at 8:55 PM, rag...@yahoo.com wrote:

Could someone please throw some light on this? Thanks.


----- Original Message ----
From: "rag...@yahoo.com" <rag...@yahoo.com>
To: zookeeper-u...@hadoop.apache.org
Sent: Friday, 10 April, 2009 8:11:34
Subject: FastLeaderElection


Could someone please explain quickly why a logical clock is used in FastLeaderElection? It looks to me like the peers can converge on a leader (the one with the highest zxid, or the highest server id if zxids are the same) even without the logical clock. Maybe I am missing something here, but I could not figure out why the logical clock is needed.

