Flavio Paiva Junqueira commented on ZOOKEEPER-684:

I agree with your observation: "Thread 1 has received 2 votes for server 2 as 
the leader. It then exits, and this is the problem, I think. As a result, 
Thread 0 can never get a quorum." My interpretation is that this happens 
because server 1 times out before receiving the vote of server 2 in round 1. 
In the second round, server 1 then receives the vote of mock server 2 and the 
vote of server 0 (also supporting 2), which causes server 1 to leave prematurely.
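
To make the vote arithmetic concrete, here is a minimal sketch of a majority 
check in a three-server ensemble. It is not ZooKeeper's election code: the 
class QuorumSketch and the method hasQuorum are hypothetical names, and the 
sketch assumes a simple majority quorum.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not ZooKeeper's election code.
class QuorumSketch {
    // Assumes a simple majority quorum: more than half of the ensemble.
    static boolean hasQuorum(Map<Integer, Integer> votes, int candidate, int ensembleSize) {
        int count = 0;
        for (int vote : votes.values()) {
            if (vote == candidate) {
                count++;
            }
        }
        return count > ensembleSize / 2;
    }

    public static void main(String[] args) {
        int ensembleSize = 3;

        // Votes seen by server 1 in the second round: voter id -> proposed leader.
        Map<Integer, Integer> seenByServer1 = new HashMap<>();
        seenByServer1.put(0, 2); // server 0 supports 2
        seenByServer1.put(2, 2); // mock server 2 also names 2
        System.out.println(hasQuorum(seenByServer1, 2, ensembleSize)); // true

        // Votes available to server 0 once it is alone in the election:
        // only its own.
        Map<Integer, Integer> seenByServer0 = new HashMap<>();
        seenByServer0.put(0, 2);
        System.out.println(hasQuorum(seenByServer0, 2, ensembleSize)); // false
    }
}
```

With two of three votes, server 1 sees a majority and exits, while server 0, 
left with a single vote, cannot reach one.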

I also don't think your patch works because 
"peer.getElectionAlg().lookForLeader()" won't return until the election is 
over. That method is not called for each round.
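
The point about lookForLeader() can be sketched as follows. This is only an 
illustration of a blocking call that iterates over rounds internally; the 
class BlockingElectionSketch, its fields, and its parameters are hypothetical 
and do not reproduce the real implementation.

```java
// Hypothetical illustration only; names and structure are assumptions.
class BlockingElectionSketch {
    int calls = 0;   // how many times the election method was entered
    int rounds = 0;  // how many rounds ran inside it

    // Stand-in for a blocking election call: it loops over rounds
    // internally and returns only once a leader has been decided.
    int lookForLeader(int roundsUntilDecision, int leaderId) {
        calls++;
        while (rounds < roundsUntilDecision) {
            rounds++; // one round of collecting votes
        }
        return leaderId;
    }

    public static void main(String[] args) {
        BlockingElectionSketch election = new BlockingElectionSketch();
        int leader = election.lookForLeader(3, 2);
        System.out.println("leader=" + leader
                + " calls=" + election.calls
                + " rounds=" + election.rounds);
    }
}
```

Any logic placed around the call runs once per election, not once per round, 
which is why a per-round check at the call site would not take effect.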

> Race in LENonTerminateTest
> --------------------------
>                 Key: ZOOKEEPER-684
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-684
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>            Priority: Critical
>             Fix For: 3.3.0
>         Attachments: zookeeper-684-test-failure.rtf, ZOOKEEPER-684.patch
> testNonTermination failed during a Hudson run for ZOOKEEPER-59. After 
> inspecting the output, it looks like server 1 is electing 2 as a leader and 
> leaving. Given that 2 is just a mock server, server 0 remains alone in leader 
> election.
