I believe Lei's concern is that the leader and all slaves can talk to ZK, but the slaves cannot talk to the leader. As a result no work can be done. However nothing will happen on the ZK side since everyone is heartbeating properly.

Mahadev, I think you came up with a pretty good solution. However, since the leader can see the votes from all the slaves, it might just want to give up the lead itself and "pause" for a while (to give someone else the chance to be the leader). This would let the leader better handle the case where a single slave cannot talk to the leader but the rest of the slaves can communicate fine.
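
A rough sketch of that leader-side decision, in Python. Everything in it is hypothetical: the LeaderGuard class, the majority rule and the 30 second pause are assumptions made for illustration, not something ZK provides. The leader steps down only when a majority of slaves report it unreachable, then sits out of elections for a while.

```python
import time


class LeaderGuard:
    """Decides when a leader should voluntarily give up the lead.

    Hypothetical helper: the names, the majority rule and the pause
    length are assumptions made for illustration.
    """

    def __init__(self, slaves, pause_seconds=30.0, clock=time.monotonic):
        self.slaves = set(slaves)
        self.pause_seconds = pause_seconds
        self.clock = clock
        self.paused_until = 0.0

    def should_step_down(self, complaining_slaves):
        # Count only complaints from known slaves, each slave once.
        complaints = self.slaves & set(complaining_slaves)
        # A single unreachable slave is that slave's problem; a majority
        # suggests the leader itself is cut off.
        return len(complaints) * 2 > len(self.slaves)

    def step_down(self):
        # Record the pause; the caller is expected to release the lead
        # (for example by deleting its own leader znode).
        self.paused_until = self.clock() + self.pause_seconds

    def may_run_for_leader(self):
        return self.clock() >= self.paused_until
```

With three slaves, one complaint leaves the leader in place, while two complaints trigger a step-down and the pause.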


On 04/30/2010 04:31 PM, Mahadev Konar wrote:
Maybe I jumped the gun here but Ted's response to your query is more
appropriate -

You can then use ZK in your application to pick a lead machine for other
operations.  In that case, essentially every failure scenario is handled by
the standard recipe.  In your example where the master and slave are cut
off, but both still have access to ZK, all that will happen is that the
master cannot communicate with the slave.  Both will still be clear about
who is in which role.

The case where the master is cut off from both ZK and the slave is also
handled well as is the case where the master is cut off from ZK, but not
from the slave.  In both cases, the master will get a connection loss event
and stop trying to act like a master and the slave will be notified that the
master has dropped out of its role.
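
A minimal sketch of the master-side behaviour described above, assuming the application receives session notifications from its ZK client. The SessionEvent enum and the Master class are stand-ins invented for illustration; they are not the ZK client API.

```python
from enum import Enum


class SessionEvent(Enum):
    # Stand-ins for the client's session notifications, not a real API.
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class Master:
    def __init__(self):
        self.acting_as_master = True
        self.holds_lead = True  # whether our ephemeral znode may still exist

    def on_session_event(self, event, lead_still_ours=False):
        if event is SessionEvent.DISCONNECTED:
            # We can no longer prove we hold the lead, so stop work.
            self.acting_as_master = False
        elif event is SessionEvent.EXPIRED:
            # Session gone: the ephemeral znode is deleted, role is lost.
            self.acting_as_master = False
            self.holds_lead = False
        elif event is SessionEvent.CONNECTED:
            # Resume only if the session survived and the znode is ours.
            self.acting_as_master = self.holds_lead and lead_still_ours
```

The point of the sketch is that the master stops work on connection loss right away and resumes only after confirming that it still holds the lead.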


On 4/30/10 4:14 PM, "Mahadev Konar"<maha...@yahoo-inc.com>  wrote:

Hi Lei,
  Sorry, I misinterpreted your question! The scenario you describe could be
handled this way -

You could have a status node in ZooKeeper which every slave subscribes
to and updates. If one of the slave nodes sees that the slaves have had too
many connections refused by the Leader, that slave could go ahead and
delete the Leader znode, forcing the Leader to give up its leadership. I
am not describing a detailed way to do it, but it's not very hard to come up
with a design for this.
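
One possible shape for that design, sketched in Python against an in-memory stand-in for the ZK client. The FakeZk class, the /status and /leader paths and the limit of 3 are all assumptions made for illustration.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper client (not the real API)."""

    def __init__(self):
        self.nodes = {}

    def set(self, path, value):
        self.nodes[path] = value

    def get(self, path, default=None):
        return self.nodes.get(path, default)

    def delete(self, path):
        self.nodes.pop(path, None)


def report_refused(zk, slave_id, max_refused=3):
    """Record one refused connection; depose the Leader past the limit.

    Returns True if this call deleted the Leader znode.
    """
    status = dict(zk.get("/status", {}))
    status[slave_id] = status.get(slave_id, 0) + 1
    zk.set("/status", status)
    if sum(status.values()) > max_refused and zk.get("/leader") is not None:
        zk.delete("/leader")   # forces a new election
        zk.set("/status", {})  # start counting afresh for the next Leader
        return True
    return False
```

Against a real ensemble the read-modify-write on the status node would need versioned updates, so that concurrent slaves do not overwrite each other's counts.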

Do you intend to have the Leader and Slaves in different network (different
ACLs, I mean) protected zones? In that case, it is a legitimate concern;
otherwise I do think an asymmetric network partition would be very unlikely
to happen.

Do you usually see network partitions in such scenarios?


On 4/30/10 4:05 PM, "Lei Gao"<l...@linkedin.com>  wrote:

Hi Mahadev,

Why would the leader be disconnected from ZK? ZK is fine communicating with
the leader in this case. We are talking about asymmetric network failure.
Yes, the leader could consider all the slaves to be down if it tracks the
status of all slaves itself. But I guess if ZK is used for membership
management, neither the leader nor the slaves will be considered
disconnected because they can all connect to ZK.



On 4/30/10 3:47 PM, "Mahadev Konar"<maha...@yahoo-inc.com>  wrote:

Hi Lei,

In this case, the Leader will be disconnected from the ZK cluster and will
give up its leadership. Since it's disconnected, the ZK cluster will realize
that the Leader is dead!

When the ZK cluster realizes that the Leader is dead (because the ZK
cluster hasn't heard from the Leader for a certain time, configurable via
the session timeout parameter), the slaves will be notified of this via
watchers in the ZooKeeper cluster. The slaves will realize that the Leader
is gone, re-elect a new Leader, and start working with the new Leader.
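
As a sketch of how the re-election step is commonly done with ephemeral sequential znodes: each candidate creates one, the lowest sequence number leads, and every other candidate watches the znode just ahead of it. The "n_" prefix and the function names below are assumptions for illustration.

```python
def sequence_of(znode_name):
    # ZK appends a zero-padded counter to sequential znodes: "n_0000000007".
    return int(znode_name.rsplit("_", 1)[1])


def elect(children, me):
    """Return (leader, node_to_watch) for candidate `me`.

    `children` are the ephemeral sequential znodes currently under the
    election path. The lowest sequence number leads; every other
    candidate watches the znode just ahead of it.
    """
    ordered = sorted(children, key=sequence_of)
    leader = ordered[0]
    if me == leader:
        return leader, None
    return leader, ordered[ordered.index(me) - 1]
```

When the Leader's session times out, its znode disappears, the watcher on it fires, and calling elect again with the remaining children yields the new Leader.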

Does that answer your question?

You might want to look through the documentation of ZK to understand its use
cases and how it solves these kinds of issues.


On 4/30/10 2:08 PM, "Lei Gao"<l...@linkedin.com>  wrote:

Thank you all for your answers. They clear up a lot of my confusion about
the service guarantees of ZK. I am still struggling with one failure case (I
am not trying to be a pain in the neck, but I need to have a full
understanding of what ZK can offer before I make a decision on whether to
use it in my cluster.)

Assume the following topology:

          Leader  ====  ZK cluster
              \\          //
               \\        //
                \\      //
                 Slave(s)

If I have an asymmetric network failure such that the connections between
Leader and Slave(s) are broken while all other connections are still alive,
would my system hang after some point? No new leader election will be
initiated by the slaves, and the leader can't get the work to the slave(s).



On 4/30/10 1:54 PM, "Ted Dunning"<ted.dunn...@gmail.com>  wrote:

If one of your user clients can no longer reach one member of the ZK
cluster, then it will try to reach another.  If it succeeds, then it will
continue without any problems as long as the ZK cluster itself is OK.

This applies for all the ZK recipes.  You will have to be a little bit
careful to handle connection loss, but that should get easier soon (and
isn't all that difficult anyway).

On Fri, Apr 30, 2010 at 1:26 PM, Lei Gao<l...@linkedin.com>  wrote:

I am not talking about the leader election within the zookeeper cluster. I
didn't make the discussion context clear. In my case, I run a cluster that
uses zookeeper for doing the leader election. Yes, nodes in my cluster are
the clients of zookeeper.  Those nodes depend on zookeeper to elect a new
leader and figure out what the current leader is. So if the zookeeper (think
of it as a stand-alone entity) becomes unavailable in the way I've described
earlier, how can I handle such a situation so my cluster can still function
while a majority of nodes still connect to each other (but not to the

