[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767519#action_12767519 ]
Henry Robinson commented on ZOOKEEPER-368:
------------------------------------------

I have had to change the Leader Election code just a little bit to support Observers, and I wanted to run the decisions past everyone.

Observers don't participate in Leader Elections in the sense that they don't cast votes. However, they need to learn the results. The way I do this at the moment is to force Observers always to use LeaderElection as their election algorithm (and disable vote casting for them). So essentially they simply query the rest of the ensemble for a quorum of votes. This works well, and has the advantage of not needing to teach all LE algorithms about observers.

The only change I make to the rest of the code is to always start a responder thread, no matter what the prevailing election type on the follower, so that they'll always respond to the queries from observers.

The correctness of this relies on the fact that a leader must always be supported by a quorum, no matter what the protocol used to elect the leader in the first place is. So it's always correct to believe that a leader that is supported by a quorum is actually the leader.

Does this sound right? Are there any gotchas about always running the responder thread?

Henry

> Observers
> ---------
>
>                 Key: ZOOKEEPER-368
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>         Attachments: obs-refactor.patch, observer-refactor.patch,
> observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch,
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch,
> ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching
> agreement on the order of ZooKeeper transactions. That is, all followers
> receive proposals, acknowledge them, and receive commit messages from the
> leader.
> A leader issues commit messages once it receives acknowledgments from
> a quorum of followers. For cross-colo operation, it would be useful to have a
> third role: observer. Using Paxos terminology, observers are similar to
> learners. An observer does not participate actively in the agreement step of
> the atomic broadcast protocol. Instead, it only commits proposals that have
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forwarding
> commit messages not only to followers but also to observers, and have
> observers applying transactions according to the order followers agreed upon.
> In the current implementation of the protocol, however, commit messages do
> not carry their corresponding transaction payload because all servers
> different from the leader are followers and followers receive such a payload
> first through a proposal message. Just forwarding commit messages as they
> currently are to an observer consequently is not sufficient. We have a couple
> of options:
> 1- Include the transaction payload along in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the
> protocol implementation, but it increases traffic slightly. The performance
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding
> commit messages to observers. With this option, observers can connect to
> followers, and receive messages from followers. This choice is important to
> avoid increasing the load on the leader with the number of observers.
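
The observer-side check described in the comment above (query the voting servers, then believe any leader that a quorum supports) can be sketched roughly as follows. This is an illustrative Python sketch, not ZooKeeper's code: the function name `find_quorum_leader`, the `votes` mapping, and the use of a simple majority as the quorum rule are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Optional


def find_quorum_leader(votes: Dict[int, int], ensemble_size: int) -> Optional[int]:
    """Return the server id supported by a quorum of voting members, or None.

    votes maps a voting server's id to the id of the leader it currently
    supports, as collected by the observer from the responder threads.
    ensemble_size counts voting members only; observers cast no votes.
    Assumption: a quorum is a strict majority of the voting members.
    """
    if not votes:
        return None
    tally = Counter(votes.values())
    candidate, support = tally.most_common(1)[0]
    # Only a candidate backed by more than half of the voters can be the leader.
    return candidate if support > ensemble_size // 2 else None


# Hypothetical five-server voting ensemble: three responders back server 3.
leader = find_quorum_leader({1: 3, 2: 3, 3: 3, 4: 1}, ensemble_size=5)
```

Because the observer only reads votes and never contributes one, the check is the same whichever algorithm the voting servers used to elect the leader. With fewer than a majority of matching answers the function returns `None`, and the observer would have to query again.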