Hi Raghu,
With multiple consecutive crashes (or perhaps a network partition), we
could keep electing a faulty server if we only compared zxids. We avoid
this problem by using a logical clock: servers only consider changing
their proposals if they receive a notification from the same or a later
epoch. With this mechanism, if an elected server crashes before
exercising its role as leader, it won't be considered in later epochs.
Without a logical clock, a server lagging behind in the election could
re-introduce the faulty server into the election, and it would be
elected again if the faulty server is the one with the highest zxid.
Note that we are not using "logical clocks" in the sense of Lamport
clocks. We are not incrementing upon every event, but instead only
counting rounds of leader election.
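To make the scenario concrete, here is a rough sketch in Python of the
check described above. It is not the FastLeaderElection code itself: the
names (Notification, better, handle, use_epoch) and the zxid values are
made up for illustration, and the handling of a notification from a
later epoch is simplified to "join that round, then compare".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    epoch: int   # election round in which the sender cast this vote
    zxid: int    # last transaction id known for the proposed leader
    leader: int  # server id of the proposed leader


def better(a, b):
    """True if a proposes a better leader than b: higher zxid, then higher id."""
    return (a.zxid, a.leader) > (b.zxid, b.leader)


def handle(my_vote, my_epoch, n, use_epoch=True):
    """Return (vote, epoch) after processing one notification (simplified)."""
    if use_epoch:
        if n.epoch < my_epoch:
            # Notification from an earlier round: ignore it.
            return my_vote, my_epoch
        if n.epoch > my_epoch:
            # Sender is in a later round: catch up before comparing.
            my_epoch = n.epoch
    if better(n, my_vote):
        my_vote = Notification(my_epoch, n.zxid, n.leader)
    return my_vote, my_epoch


# Server 3 (zxid 100) was elected in epoch 1 and crashed before leading.
# Server 1 has moved on to epoch 2 and currently votes for itself.
my_vote = Notification(epoch=2, zxid=90, leader=1)

# A lagging server still sends its epoch-1 vote for the crashed server 3.
stale = Notification(epoch=1, zxid=100, leader=3)

with_clock, _ = handle(my_vote, 2, stale, use_epoch=True)
without_clock, _ = handle(my_vote, 2, stale, use_epoch=False)

print("with logical clock, vote goes to server", with_clock.leader)
print("without logical clock, vote goes to server", without_clock.leader)
```

With the epoch check the stale vote is dropped and server 1 keeps its
proposal; without it, the highest zxid wins and the crashed server 3 is
proposed again.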
-Flavio
On Apr 13, 2009, at 8:55 PM, rag...@yahoo.com wrote:
Could someone please throw some light on this? Thanks.
-Raghu
----- Original Message ----
From: "rag...@yahoo.com" <rag...@yahoo.com>
To: zookeeper-u...@hadoop.apache.org
Sent: Friday, 10 April, 2009 8:11:34
Subject: FastLeaderElection
Hi,
Could someone please explain quickly why a logical clock is used in
FastLeaderElection? It looks to me like the peers can converge on a
leader (with the highest zxid, or server id if zxids are the same) even
without the logical clock. Maybe I am missing something here; I
could not figure out why the logical clock is needed.
Thanks
Raghu