[ https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888122#action_12888122 ]
Vishal K commented on ZOOKEEPER-790: ------------------------------------ >From ZOOKEEPER-335.. Hi, I enabled tracing and did some more debugging. Looks like the restarted peer (and trying to join the cluster) determines that it is a leader and increments its epoch. However, rest of the nodes don't acknowledge this node as the leader, and hence, have an older epoch. I will attache the log. Unfortunately, I don't have traces from other nodes. I will repeat the experiment later and attache logs from other nodes. Scenario: * Form a 3 node cluster. This is not just ZK cluster. It also involves our application cluster that uses ZK. * Kill one of the follower * After a minute or so restart follower * Follower rejects leader with "Leader epoch y is less than our epoch y + 1" >From logs: a) Peer X restarts and starts leader election. a) For a small window of time, X thinks that it is the new leader! During this window, for some reason, rest of the nodes tell X that they are also trying to find a leader. I.e., all 3 nodes are in LOOKING state. After seeing that all 3 nodes are in LOOKING state, X decides to be a leader? 155 2010-06-20 23:22:46,421 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] - Opening channel to server 1 156 2010-06-20 23:22:46,423 - DEBUG [WorkerReceiver Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new notification message. My id = 0 157 2010-06-20 23:22:46,424 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 0, 77309411393, 1, 0, LOOKING, LOOKING, 0 158 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 0, proposed id: 0, zxid: 77309411393, proposed zxid: 77309411393 159 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 0, Proposed leader = 0, Porposed zxid = 77309411393, Proposed epoch = 1 160 2010-06-20 23:22:46,426 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (1, 0) 161 2010-06-20 23:22:46,426 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] - Opening channel to server 2 162 2010-06-20 23:22:46,427 - DEBUG [Thread-1:quorumcnxmanager$liste...@445] - Connection request /192.168.1.182:46701 163 2010-06-20 23:22:46,427 - DEBUG [Thread-1:quorumcnxmanager$liste...@448] - Connection request: 0 164 2010-06-20 23:22:46,428 - DEBUG [Thread-1:quorumcnxmanager$sendwor...@504] - Address of remote peer: 1 165 2010-06-20 23:22:46,428 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (2, 0) 166 2010-06-20 23:22:46,431 - DEBUG [WorkerReceiver Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new notification message. My id = 0 167 2010-06-20 23:22:46,432 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 1, 77309411372, 1, 0, LOOKING, LOOKING, 1 168 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 1, proposed id: 0, zxid: 77309411372, proposed zxid: 77309411393 169 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 1, Proposed leader = 1, Porposed zxid = 77309411372, Proposed epoch = 1 170 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$liste...@445] - Connection request /192.168.1.183:44310 171 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$liste...@448] - Connection request: 0 172 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$sendwor...@504] - Address of remote peer: 2 173 2010-06-20 23:22:46,440 - DEBUG [WorkerReceiver Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new notification message. My id = 0 174 2010-06-20 23:22:46,440 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 2, 73014444097, 1, 0, LOOKING, LOOKING, 2 175 2010-06-20 23:22:46,440 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 2, proposed id: 0, zxid: 73014444097, proposed zxid: 77309411393 176 2010-06-20 23:22:46,441 - DEBUG [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 2, Proposed leader = 2, Porposed zxid = 73014444097, Proposed epoch = 1 177 2010-06-20 23:22:46,441 - INFO [QuorumPeer:/0.0.0.0:2181:quorump...@647] - LEADING b) As a result X increments its epoch. Worse, since this node decided to be a leader, it starts doing transactions. The first set of transactions start removing all ephemeral nodes. But these transactions are only done locally. Other peers do not ack these transactions since they know that this peer is not the leader. c) After a few seconds (8 secs), X relinquishes leadership since it does not receive any ack from rest of the peers d) It starts leader election again. This time, rests of the peers respond correctly informing X that a leader already exists. e) X determines that it should now follow the leader and starts handshake with the leader. However, X has incremented its epoch in the mean time. X thinks that the epoch sent by the leader during handshake is older than the epoch it has noticed before. X aborts connection with the leader. > Last processed zxid set prematurely while establishing leadership > ----------------------------------------------------------------- > > Key: ZOOKEEPER-790 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790 > Project: Zookeeper > Issue Type: Bug > Components: quorum > Affects Versions: 3.3.1 > Reporter: Flavio Paiva Junqueira > Assignee: Flavio Paiva Junqueira > Priority: Blocker > Fix For: 3.3.2, 3.4.0 > > Attachments: ZOOKEEPER-790.patch > > > The leader code is setting the last processed zxid to the first of the new > epoch even before connecting to a quorum of followers. Because the leader > code sets this value before connecting to a quorum of followers > (Leader.java:281) and the follower code throws an IOException > (Follower.java:73) if the leader epoch is smaller, we have that when the > false leader drops leadership and becomes a follower, it finds a smaller > epoch and kills itself. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.