[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728593#action_12728593 ]
Henry Robinson commented on ZOOKEEPER-368:
------------------------------------------

(Reposting last bit of conversation from ZK-107, more appropriate to this jira)

_"sorry to jump in late here. rather than adding the inform, why don't we just send the PROPOSE and COMMIT to the Observer as normal, and just make the Observer not send ACKs? That way we change as little code as possible with minimum overhead. It also makes switching from Observer to Follower as easy as turning on the ACKs. I also think Observers should be able to issue proposals. One use case for observers are remote data centers that basically proxy clients that connect to ZooKeeper. This means an Observer is just a Follower that doesn't vote (ACK)."_

That's definitely one way to do it. The other side to that argument is to keep the message complexity down, especially if we can envisage use cases with lots of Observers. A connection to a remote Observer might be more likely to violate the FIFO requirement of ZK connections; having a single-message protocol makes it easier to deal with this case (not a correctness issue of Observers, just annoying if PROPOSALs arrive after COMMITs for some reason). I think that's a marginal issue though. My preference is for INFORM messages as this completely separates Observer logic from Follower logic and doesn't add much complexity to the code.

The Observer also has to take care not to participate in leader elections. I think Observers also need to announce themselves as such to the Leader, to enable the case where a Follower wishes to connect as an Observer temporarily (otherwise the Leader will think the Observer to be a Follower and use it as part of a quorum). Also if the leader can distinguish between followers and observers then it can treat both differently (e.g. through batching multiple INFORMs or allowing observers to lag by prioritising follower traffic).

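To make the leader-side distinction concrete, here is a minimal Java sketch of the INFORM variant. It is not taken from the patch: `LeaderSketch`, `Role`, `register`, `propose`, `ack` and the text message log are all hypothetical names chosen for illustration, and the sketch assumes the leader counts its own vote towards the quorum. Observers announce their role when they connect, never count towards a quorum, and receive a single INFORM carrying the payload once the followers have agreed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative only: hypothetical types, not ZooKeeper's actual Leader code. */
final class LeaderSketch {
    enum Role { FOLLOWER, OBSERVER }

    private final Map<Long, Role> learners = new TreeMap<>();   // server id -> announced role
    private final Map<Long, String> pending = new HashMap<>();  // zxid -> transaction payload
    private final Map<Long, Set<Long>> acks = new HashMap<>();  // zxid -> followers that ACKed
    final List<String> sent = new ArrayList<>();                // outbound messages, logged as text

    /** A learner announces its role to the leader when it connects. */
    void register(long sid, Role role) {
        learners.put(sid, role);
    }

    /** Voting members are the followers plus the leader itself (an assumption of this sketch). */
    private int votingMembers() {
        int followers = 0;
        for (Role role : learners.values()) {
            if (role == Role.FOLLOWER) followers++;
        }
        return followers + 1;
    }

    /** Proposals go to followers only; observers see nothing until the commit point. */
    void propose(long zxid, String payload) {
        pending.put(zxid, payload);
        acks.put(zxid, new HashSet<>());
        for (Map.Entry<Long, Role> e : learners.entrySet()) {
            if (e.getValue() == Role.FOLLOWER) {
                sent.add("PROPOSAL " + zxid + " -> " + e.getKey());
            }
        }
    }

    /** Records an ACK and returns true if it completed the quorum for this zxid. */
    boolean ack(long sid, long zxid) {
        Set<Long> voters = acks.get(zxid);
        // Ignore ACKs for unknown or already committed zxids, and from non-followers.
        if (voters == null || learners.get(sid) != Role.FOLLOWER) {
            return false;
        }
        voters.add(sid);
        // "+ 1" is the leader's own vote; a strict majority of voting members is required.
        if (2 * (voters.size() + 1) <= votingMembers()) {
            return false;
        }
        String payload = pending.remove(zxid);
        acks.remove(zxid);
        for (Map.Entry<Long, Role> e : learners.entrySet()) {
            if (e.getValue() == Role.FOLLOWER) {
                sent.add("COMMIT " + zxid + " -> " + e.getKey());
            } else {
                // Observers never saw the proposal, so INFORM must carry the payload.
                sent.add("INFORM " + zxid + " [" + payload + "] -> " + e.getKey());
            }
        }
        return true;
    }
}
```

With two followers and one observer registered, a single follower ACK plus the leader's own vote is a majority of the three voting members, so the observer gets exactly one message per transaction and an ACK from it would simply be ignored.
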
Keeping Observers as special-case Followers would simplify the code for the observers patch (I've got a new version nearly ready to submit, just fixing some tests). However, it would mean that Observers are harder to customise - for example, there's no persistence requirement for an Observer and so some of the RequestProcessors can be optionally removed or replaced by something that only asynchronously writes to disk. Keeping them lightweight has been a goal.

My feeling was that I was introducing too many 'if (amObserver()) {...}' branches to an already fairly hard-to-follow bit of code (in particular Follower.followLeader). Breaking the functionality into two separate classes seems to have made things cleaner.

Regarding Observers being able to issue proposals: I don't have a problem with that; it should be reasonably easy to add.

> Observers
> ---------
>
>                 Key: ZOOKEEPER-368
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>         Attachments: ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching
> agreement on the order of ZooKeeper transactions. That is, all followers
> receive proposals, acknowledge them, and receive commit messages from the
> leader. A leader issues commit messages once it receives acknowledgments from
> a quorum of followers. For cross-colo operation, it would be useful to have a
> third role: observer. Using Paxos terminology, observers are similar to
> learners. An observer does not participate actively in the agreement step of
> the atomic broadcast protocol. Instead, it only commits proposals that have
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forwarding
> commit messages not only to followers but also to observers, and have
> observers applying transactions according to the order followers agreed upon.
> In the current implementation of the protocol, however, commit messages do
> not carry their corresponding transaction payload because all servers
> different from the leader are followers and followers receive such a payload
> first through a proposal message. Just forwarding commit messages as they
> currently are to an observer consequently is not sufficient. We have a couple
> of options:
> 1- Include the transaction payload along in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the
> protocol implementation, but it increases traffic slightly. The performance
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding
> commit messages to observers. With this option, observers can connect to
> followers, and receive messages from followers. This choice is important to
> avoid increasing the load on the leader with the number of observers.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.