[ https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731776#action_12731776 ]
Raghu S commented on ZOOKEEPER-107: ----------------------------------- @henry, Sorry if this sounds like a repeat, thought I will summarize the error handling during view change. Could you comment if this makes sense? ---------- 1. Configuration change succeeds if the change is successfully committed in both the old view and the new view. An observer is promoted to a follower only after it receives a COMMIT for the new view. 2. Each peer could have two views of the cluster -- the last committed view and the last proposed view (which is created after a VIEWCHANGE proposal is received). The latter can be NULL if there is no view change attempt in progress. 2.A. Each peer will always attempt an election with the last committed view. Proposed views will be converted to committed views (or deleted) post leader election. 2.B. The proposal record of a peer contains (in addition to last logged ZXID and server ID) the last committed view of the peer 3. During election, if the last committed view of the peer with the smaller ZXID (P(ZXLOW)) is different from the last committed view of the peer with the higher ZXID (P(ZXHIGH), then P(ZXLOW) adapts P(ZXHIGH)'s last committed view and broadcasts the adapted view to all other peers. 3.A. Two nodes with the same ZXID should have the same committed views 3.B. If the last committed views of P(ZXLOW) and P(ZXHIGH) are the same, but P(ZXHIGH) has a proposed new view (not committed yet though), that view will not be considered by both the peers during election. Similarly, if the N(ZXLOW) has a proposed view, that will not be considered either. 3.C. If P(ZXLOW) adapts P(ZXHIGH)'s last committed view and that view doesn't include P(ZXLOW), P(ZXLOW) drops out of election (should it self destruct??) 4. Once a leader is elected, it will sync up the logs of the followers that are lagging behind just like it's done today: - If there is a follower who's last committed view is different from the leader's, log synchronization will make sure follower's last committed view gets updated to be in sync with the leader's. Follower doesn't do anything when its last committed view changes (the new view MUST have the follower since 3.C prevents a follower that is not in the leading candidate's committed view from successfully completing an election) - If there is an observer who upon log synchronization learns that the committed view includes the observer, the observer will promote itself to a follower - If a follower with a proposed view joins an already established leader who doesn't know about that proposed view, the follower's proposed view will be erased when the leader synchronizes the followers log - If the leader has a proposed new view in its log, the leader will send a COMMIT for the new view after majority peers in the old view and the new view have synced their log to the leader's log 4.A. The view change COMMIT doesn't mean much for the followers that are not impacted by the view change 4.B. The observer that gets view change COMMIT will promote itself to a follower if the new view includes the observer 4.C. The follower that gets the view change will drop out of the cluster if the new view doesn't include the follower 4.D. The leader will drop out of the cluster once COMMIT is delivered locally if the new view doesn't include the leader. This will result in a new election. 4.E. The leader will adjust the quorum size as per the new view otherwise. > Allow dynamic changes to server cluster membership > -------------------------------------------------- > > Key: ZOOKEEPER-107 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107 > Project: Zookeeper > Issue Type: Improvement > Components: server > Reporter: Patrick Hunt > Assignee: Henry Robinson > Attachments: SimpleAddition.rtf > > > Currently cluster membership is statically defined, adding/removing hosts > to/from the server cluster dynamically needs to be supported. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.