In this case, it's up to the application to decide what to do when this happens.
The application will be notified that it's disconnected from the ZooKeeper
cluster. Some applications might decide not to proceed at all (since proceeding
might lead to state corruption), while others might decide to use cached
values where stale values are fine for the correctness of the system. It's up
to you to decide what you want to do in such a case.
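The two choices above (stop, or keep going on cached values) can be sketched roughly as follows. This is only an illustration in plain Python, not ZooKeeper client code: `LeaderView`, `Policy` and the callback names are made up for the example, and you would call `on_state_change` from whatever connection-state notification your client library gives you.

```python
from enum import Enum


class Policy(Enum):
    FAIL_FAST = "fail_fast"      # refuse to proceed while disconnected
    SERVE_STALE = "serve_stale"  # fall back to the last value seen


class Disconnected(Exception):
    """Raised when the policy forbids acting without a live connection."""


class LeaderView:
    """Hypothetical client-side view of who the current leader is."""

    def __init__(self, policy):
        self.policy = policy
        self.connected = False
        self.cached_leader = None

    def on_state_change(self, connected):
        # Assumption: invoked by the application when it is notified that
        # the ZooKeeper connection went up or down.
        self.connected = connected

    def on_leader_update(self, leader):
        # Assumption: invoked when a fresh value arrives from ZooKeeper.
        self.cached_leader = leader

    def current_leader(self):
        if self.connected:
            return self.cached_leader
        if self.policy is Policy.SERVE_STALE and self.cached_leader is not None:
            return self.cached_leader  # may be stale
        raise Disconnected("no live ZooKeeper connection")
```

With `Policy.SERVE_STALE` a disconnected node keeps answering with the last leader it saw; with `Policy.FAIL_FAST` it raises instead, which is the safer choice if acting on a stale leader could corrupt state.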
Also, you would usually want to set up ZooKeeper clusters in such a way that
this should not be possible, for example by spreading the servers across
switches. In that case, the application will be able to access at least one of
the ZooKeeper servers in the cluster, and it is highly unlikely that it cannot
reach any of them.
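As a rough sanity check of such a layout, you can verify that losing any single switch still leaves a majority of the ZooKeeper servers up. The sketch below is plain Python with made-up server and switch names; it only does the majority arithmetic and knows nothing about a real deployment.

```python
from collections import Counter


def survives_any_single_switch_failure(server_to_switch):
    """Return True if a majority of servers remains after any one switch fails.

    server_to_switch maps a (hypothetical) server name to the switch it is on.
    """
    total = len(server_to_switch)
    if total == 0:
        return False
    majority = total // 2 + 1
    per_switch = Counter(server_to_switch.values())
    # Losing a switch takes down every server attached to it.
    return all(total - lost >= majority for lost in per_switch.values())
```

For example, three servers on three different switches pass the check, while three servers with two of them on the same switch do not, because losing that switch leaves only one server out of three.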
Hope this helps.
On 4/30/10 1:26 PM, "Lei Gao" <l...@linkedin.com> wrote:
> Hi Henry,
> I am not talking about the leader election within zookeeper cluster. I guess
> I didn't make the discussion context clear. In my case, I run a cluster that
> uses zookeeper for doing the leader election. Yes, nodes in my cluster are
> the clients of zookeeper. Those nodes depend on zookeeper to elect a new
> leader and figure out what the current leader is. So if the zookeeper (think
> of it as a stand-alone entity) becomes unavailable in the way I've described
> earlier, how can I handle such a situation so my cluster can still function
> while a majority of nodes still connect to each other (but not to the
> On 4/30/10 1:10 PM, "Henry Robinson" <he...@cloudera.com> wrote:
>> Hi Lei -
>> The 'user cluster' (by which I think you mean the set of clients of
>> ZooKeeper?) plays no part in leader election. If a majority of ZooKeeper
>> server nodes can talk to each other, a new leader can be elected. Clients of
>> the minority server partition will be disconnected - if they too cannot
>> reach the majority partition then they will not be able to reconnect.
>> Hope this helps,
>> On 30 April 2010 12:45, Lei Gao <l...@linkedin.com> wrote:
>>> Hi Ted,
>>> I 100% agree with what you said. But my question is more about what if my
>>> zookeeper service cluster is partitioned from a majority of nodes in my USER
>>> CLUSTER. In this case, the majority nodes in one network partition can't
>>> select a new leader because zookeeper is out of reach.
>>> Another example will be that if there is an asymmetric network failure
>>> where a majority of nodes from the USER CLUSTER can't reach the leader while
>>> the zookeeper still can. How does zookeeper handle such a situation?
>>> On 4/30/10 12:24 PM, "Ted Dunning" <ted.dunn...@gmail.com> wrote:
>>> There are a variety of situations that can trigger a new leader election
>>> and a few that can cause the cluster to be unable to elect a new leader.
>>> Isolation of just the leader is one of the situations that will cause a new
>>> leader election. Isolation of nodes into groups smaller than the quorum
>>> will result in the cluster freezing.
>>> On Fri, Apr 30, 2010 at 11:56 AM, Lei Gao <l...@linkedin.com> wrote:
>>> I have a general question on how zookeeper can maintain its view of the
>>> user cluster (that zookeeper manages) that is consistent with the nodes in
>>> the user cluster. In other words, when zookeeper considers the current
>>> leader is unavailable, does it really guarantee that a majority of nodes in
>>> the user cluster can't reach the current leader? The same question applies
>>> to the membership service as well. Because the zookeeper can be partitioned
>>> from a majority of the nodes in the user cluster. How does the zookeeper
>>> handle situations like this?