Flavio Junqueira commented on ZOOKEEPER-917:

I downloaded your logs, but the out files are empty and I couldn't find the 
notification messages. By looking at the excerpts you posted, it sounds like 
node 1 tells 0 that it is following 2 and node says that it is following (this 
is fine as node 2 might have received some old messages), so node 0 must follow 
2. Now the question is why node 1 decided to follow 2, especially because it has 
a higher zxid and the follower code should have rejected an attempt to follow a 
leader from an earlier epoch. 
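
To make the expectation above concrete, here is a minimal sketch of the two 
checks involved. It assumes only the standard zxid layout (epoch in the high 
32 bits, counter in the low 32 bits). The class and method names are 
hypothetical and are not taken from the ZooKeeper source, and the server ids 
and zxid values in main are made up for illustration.

```java
// Illustrative sketch only: names are hypothetical, not ZooKeeper's API.
final class ZxidOrderSketch {

    // A zxid is a 64-bit value: high 32 bits are the epoch.
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Low 32 bits are the per-epoch transaction counter.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Election-style ordering: a proposed vote replaces the current one
    // only if it carries a higher zxid, or the same zxid and a higher
    // server id. Assumes epochs small enough that the signed comparison
    // of the two longs is safe.
    static boolean supersedes(long newZxid, long newId,
                              long curZxid, long curId) {
        return newZxid > curZxid
            || (newZxid == curZxid && newId > curId);
    }

    // Follower-side sanity check: refuse a leader whose epoch is older
    // than the epoch this server has already seen.
    static boolean shouldRejectLeader(long leaderZxid, long ownZxid) {
        return epochOf(leaderZxid) < epochOf(ownZxid);
    }

    public static void main(String[] args) {
        // Made-up values: a fresh replacement node with no history and
        // an established node that has seen transactions in epoch 3.
        long freshZxid = makeZxid(0, 0);
        long establishedZxid = makeZxid(3, 42);

        // Should the established node (id 1) switch its vote to the
        // fresh node (id 2)?
        System.out.println("fresh supersedes established: "
            + supersedes(freshZxid, 2, establishedZxid, 1));

        // Should the established node refuse to follow the fresh node?
        System.out.println("established rejects fresh leader: "
            + shouldRejectLeader(freshZxid, establishedZxid));
    }
}
```

Under these assumptions the fresh node neither wins the vote comparison nor 
passes the follower-side epoch check, which is why node 1's decision looks 
suspicious.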

It would be nice to have a look at the output of node 1. 

> Leader election selected incorrect leader
> -----------------------------------------
>                 Key: ZOOKEEPER-917
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-917
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>    Affects Versions: 3.2.2
>         Environment: Cloudera distribution of zookeeper (patched to never 
> cache DNS entries)
> Debian lenny
>            Reporter: Alexandre Hardy
>            Priority: Critical
>             Fix For: 3.3.3, 3.4.0
>         Attachments: zklogs-20101102144159SAST.tar.gz
> We had three nodes running zookeeper:
>   *
>   *
>   *
> failed, and was replaced by a new node 
> (automated startup). The new node had not participated in any zookeeper 
> quorum previously. The node was permanently removed from 
> service and could not contribute to the quorum any further (powered off).
> DNS entries were updated for the new node to allow all the zookeeper servers 
> to find the new node.
> The new node was selected as the LEADER, despite the fact that 
> it had not seen the latest zxid.
> This particular problem has not been verified with later versions of 
> zookeeper, and no attempt has been made to reproduce this problem as yet.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
