[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828708#action_12828708 ]
Hadoop QA commented on ZOOKEEPER-569:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12434553/zookeeper-569.patch
  against trunk revision 903483.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/61/console

This message is automatically generated.

> Failure of elected leader can lead to never-ending leader election
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-569
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569
>             Project: Zookeeper
>          Issue Type: Bug
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>         Attachments: zookeeper-569.patch
>
>
> It is possible for basic LeaderElection to enter a situation where it never
> terminates.
> As an example, consider a three node cluster A, B and C.
> 1. In the first round, A votes for A, B votes for B and C votes for C
> 2. Since C > B > A, all nodes resolve to vote for C in the second round as
> there is no first round winner
> 3. A, B vote for C, but C fails.
> 4. C is not elected because neither A nor B hear from it, and so votes for it
> are discarded
> 5. A and B never reset their votes, despite not hearing from C, so continue
> to vote for it ad infinitum.
> Step 5 is the bug. If A and B reset their votes to themselves in the case
> where the heard-from vote set is empty, leader election will continue.
> I do not know if this affects running ZK clusters, as it is possible that the
> out-of-band failure detection protocols may cause leader election to be
> restarted anyhow, but I've certainly seen this in tests.
> I have a trivial patch which fixes it, but it needs a test (and tests for
> race conditions are hard to write!)
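
The five steps quoted above can be reproduced with a small toy model. This
is only a sketch and not ZooKeeper's LeaderElection code: the function
run_election, its parameters, the integer ids standing in for A, B and C,
and the simple majority quorum rule are all assumptions made for
illustration. It shows the votes staying stuck on the failed node C, and
the election finishing once live nodes reset their votes to themselves when
no heard-from vote survives.

```python
from collections import Counter


def run_election(node_ids, failed, votes, reset_when_empty, max_rounds=10):
    """Toy model of the rounds in the issue; NOT ZooKeeper's actual code.

    node_ids: ids of all servers in the cluster (higher id wins ties)
    failed: ids of servers that have crashed and answer nobody
    votes: dict mapping each live server id to its current vote
    reset_when_empty: apply the proposed fix (vote for self when no
        vote for a heard-from server remains)
    Returns (leader, rounds_used), or (None, max_rounds) if nobody wins.
    """
    live = [n for n in node_ids if n not in failed]
    quorum = len(node_ids) // 2 + 1  # assumed: majority of the full cluster
    votes = dict(votes)

    for round_no in range(1, max_rounds + 1):
        # Every live server hears from every live server, itself included.
        heard_from = set(live)
        # Votes for servers nobody heard from are discarded (step 4).
        tally = Counter(votes[n] for n in live if votes[n] in heard_from)

        if tally:
            # Leading candidate: most votes, ties broken by highest id.
            candidate, count = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
            if count >= quorum:
                return candidate, round_no
            # No winner yet, so everyone adopts the leading candidate (step 2).
            votes = {n: candidate for n in live}
        elif reset_when_empty:
            # Proposed fix: the heard-from vote set is empty, vote for self.
            votes = {n: n for n in live}
        # Without the fix the stale votes are kept unchanged (step 5).

    return None, max_rounds


A, B, C = 1, 2, 3
# State after step 3: A and B both vote for C, and C has failed.
stuck = run_election([A, B, C], failed={C}, votes={A: C, B: C},
                     reset_when_empty=False)
fixed = run_election([A, B, C], failed={C}, votes={A: C, B: C},
                     reset_when_empty=True)
print(stuck)  # (None, 10): no leader within the round limit
print(fixed)  # (2, 3): B is elected in the third round
```

In this model the fixed run spends one round resetting votes, one round
converging on B (the highest live id), and one round confirming the quorum.
A real cluster also has timing and message-loss effects that this sketch
deliberately ignores.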
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.