[ https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Doroshenko updated ZOOKEEPER-790:
----------------------------------------

    Attachment: ZOOKEEPER-790-follower-request-NPE.log

It seems this patch introduces a bug: followers synchronize with the leader and start serving clients before the leader starts its own zk instance. As a result, when a follower forwards a request to the leader, for example a session revalidation request, the request fails because the leader's sessionTracker is null. Previously this was not the case because the leader started its zk immediately after the quorum peer switched to the LEADING state. See the attached log.

> Last processed zxid set prematurely while establishing leadership
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-790
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.1
>            Reporter: Flavio Junqueira
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.3.2, 3.4.0
>
>         Attachments: ZOOKEEPER-790-3.3.patch, ZOOKEEPER-790-3.3.patch,
> ZOOKEEPER-790-follower-request-NPE.log, ZOOKEEPER-790.patch,
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.patch,
> ZOOKEEPER-790.patch, ZOOKEEPER-790.travis.log.bz2
>
>
> The leader code is setting the last processed zxid to the first of the new
> epoch even before connecting to a quorum of followers. Because the leader
> code sets this value before connecting to a quorum of followers
> (Leader.java:281) and the follower code throws an IOException
> (Follower.java:73) if the leader epoch is smaller, we have that when the
> false leader drops leadership and becomes a follower, it finds a smaller
> epoch and kills itself.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
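
For readers following the comment above, here is a minimal sketch of the startup window it describes: a forwarded session revalidation request reaches the leader before the leader's session tracker exists. All class and method names below (LeaderStartupSketch, LeaderSketch, SessionTracker, startZk, revalidateUnguarded, revalidateGuarded) are hypothetical stand-ins chosen for illustration. They are not ZooKeeper's actual classes, and the guard shown is only one possible way to reject early requests, not the fix applied in any patch on this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these classes are illustrative stand-ins,
// not ZooKeeper's actual Leader or session tracker implementations.
final class LeaderStartupSketch {

    /** Minimal session table: session id -> timeout in milliseconds. */
    static final class SessionTracker {
        private final Map<Long, Integer> sessions = new ConcurrentHashMap<>();

        void addSession(long sessionId, int timeoutMs) {
            sessions.put(sessionId, timeoutMs);
        }

        boolean touchSession(long sessionId) {
            return sessions.containsKey(sessionId);
        }
    }

    static final class LeaderSketch {
        // Stays null until startZk() runs, mirroring the window in the report.
        private volatile SessionTracker sessionTracker;

        void startZk() {
            sessionTracker = new SessionTracker();
        }

        void addSession(long sessionId, int timeoutMs) {
            sessionTracker.addSession(sessionId, timeoutMs);
        }

        // Unguarded path: dereferences the tracker even if it is still null.
        boolean revalidateUnguarded(long sessionId) {
            return sessionTracker.touchSession(sessionId);
        }

        // Guarded path: rejects the request explicitly while the leader is not ready.
        boolean revalidateGuarded(long sessionId) {
            SessionTracker tracker = sessionTracker; // read the volatile field once
            if (tracker == null) {
                throw new IllegalStateException("leader zk instance not started yet");
            }
            return tracker.touchSession(sessionId);
        }
    }

    public static void main(String[] args) {
        LeaderSketch leader = new LeaderSketch();
        try {
            leader.revalidateUnguarded(42L); // follower request arrives too early
        } catch (NullPointerException e) {
            System.out.println("request arrived before startZk(): NullPointerException");
        }
        leader.startZk();
        leader.addSession(42L, 30_000);
        System.out.println("after startZk(): valid = " + leader.revalidateGuarded(42L));
    }
}
```

The guarded variant only turns the failure into an explicit "not ready" error. Closing the window itself would require ordering startup so that followers cannot begin serving clients before the leader's zk instance is running, as was the case before the patch according to the comment.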