[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728593#action_12728593
 ] 

Henry Robinson commented on ZOOKEEPER-368:
------------------------------------------

(Reposting last bit of conversation from ZK-107, more appropriate to this jira)

_"sorry to jump in late here. rather than adding the inform, why don't we just 
send the PROPOSE and COMMIT to the Observer as normal, and just make the 
Observer not send ACKs? That way we change as little code as possible with 
minimum overhead. It also makes switching from Observer to Follower as easy as 
turning on the ACKs. I also think Observers should be able to issue proposals. 
One use case for observers is remote data centers that basically proxy clients 
that connect to ZooKeeper. This means an Observer is just a Follower that 
doesn't vote (ACK)."_

That's definitely one way to do it. The argument on the other side is that a 
single message keeps the message complexity down, especially if we can 
envisage use cases with lots of Observers. A connection to a remote Observer 
might be more likely to violate the FIFO requirement of ZK connections; having 
a single-message protocol makes it easier to deal with this case (not a 
correctness issue for Observers, just annoying if PROPOSALs arrive after 
COMMITs for some reason). I think that's a marginal issue though. My 
preference is for INFORM messages, as this completely separates Observer logic 
from Follower logic and doesn't add much complexity to the code.
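
To illustrate the ordering concern above, here is a minimal sketch of the two 
receive paths. It is not code from the patch: the class and method names 
(TwoMessageReceiver, InformReceiver, onPropose, onCommit, onInform) are 
hypothetical, transactions are plain strings, and neither receiver enforces 
zxid order when applying.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical receiver for the two-message scheme: PROPOSE carries the
// payload and COMMIT carries only the zxid, so the two must be paired up.
class TwoMessageReceiver {
    private final Map<Long, String> pendingProposals = new HashMap<>();
    private final Set<Long> earlyCommits = new HashSet<>();
    final List<String> applied = new ArrayList<>();

    void onPropose(long zxid, String txn) {
        if (earlyCommits.remove(zxid)) {
            // The COMMIT overtook this PROPOSE; apply now that we have the payload.
            applied.add(txn);
        } else {
            pendingProposals.put(zxid, txn);
        }
    }

    void onCommit(long zxid) {
        String txn = pendingProposals.remove(zxid);
        if (txn != null) {
            applied.add(txn);
        } else {
            // No payload yet; remember the commit until the PROPOSE arrives.
            earlyCommits.add(zxid);
        }
    }
}

// Hypothetical receiver for the single-message scheme: INFORM carries the
// payload of an already-committed transaction, so there is nothing to pair.
class InformReceiver {
    final List<String> applied = new ArrayList<>();

    void onInform(long zxid, String txn) {
        applied.add(txn);
    }
}
```

The extra bookkeeping in TwoMessageReceiver is the "annoying" part; the INFORM 
path needs none of it.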

The Observer also has to take care not to participate in leader elections. I 
think Observers also need to announce themselves as such to the Leader, to 
enable the case where a Follower wishes to connect as an Observer temporarily 
(otherwise the Leader will think the Observer to be a Follower and use it as 
part of a quorum). Also, if the Leader can distinguish between Followers and 
Observers, it can treat them differently (e.g. by batching multiple INFORMs, 
or by letting Observers lag while prioritising Follower traffic).
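
A rough sketch of why the announcement matters for quorum counting follows. 
The names here (PeerRole, QuorumTracker, register, hasQuorum) are made up for 
illustration and are not ZooKeeper's API; the sketch also assumes the Leader 
registers itself as a voter and that a simple majority of voters is required.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical role a peer announces when it connects to the Leader.
enum PeerRole { VOTER, OBSERVER }

// Hypothetical Leader-side bookkeeping: only announced voters count
// towards a quorum, whatever else has acknowledged.
class QuorumTracker {
    private final Map<Long, PeerRole> roles = new HashMap<>();

    void register(long serverId, PeerRole role) {
        roles.put(serverId, role);
    }

    // True if a strict majority of registered voters appears in ackedBy.
    boolean hasQuorum(Set<Long> ackedBy) {
        int voters = 0;
        int votes = 0;
        for (Map.Entry<Long, PeerRole> e : roles.entrySet()) {
            if (e.getValue() != PeerRole.VOTER) {
                continue; // Observers (and unknown ids) never count.
            }
            voters++;
            if (ackedBy.contains(e.getKey())) {
                votes++;
            }
        }
        return votes > voters / 2;
    }
}
```

If server 4 below were mistaken for a Follower, an ACK set of {1, 4} would 
wrongly look like progress towards a quorum.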

Keeping Observers as special-case Followers would simplify the code for the 
observers patch (I've got a new version nearly ready to submit, just fixing 
some tests). However, it would mean that Observers are harder to customise - 
for example, there's no persistence requirement for an Observer and so some of 
the RequestProcessors can be optionally removed or replaced by something that 
only asynchronously writes to disk. Keeping them lightweight has been a goal. 
My feeling was that I was introducing too many 'if (amObserver()) {...}' 
branches into an already fairly hard-to-follow bit of code (in particular 
Follower.followLeader). Breaking the functionality into two separate classes 
seems to have made things cleaner.
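
As a sketch of the customisation point, assuming a much-simplified pipeline: 
each role gets its own chain of stages instead of one chain guarded by role 
checks. Stage, Pipeline and Pipelines are hypothetical names and do not 
correspond to ZooKeeper's actual RequestProcessor classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical processing stage; it records what it did in a log so the
// difference between the two pipelines is visible.
interface Stage {
    void process(String txn, List<String> log);
}

// Hypothetical fixed chain of stages, run in order for each transaction.
class Pipeline {
    private final List<Stage> stages;

    Pipeline(Stage... stages) {
        this.stages = Arrays.asList(stages);
    }

    List<String> run(String txn) {
        List<String> log = new ArrayList<>();
        for (Stage s : stages) {
            s.process(txn, log);
        }
        return log;
    }
}

class Pipelines {
    static final Stage SYNC_TO_DISK = (txn, log) -> log.add("sync:" + txn);
    static final Stage APPLY = (txn, log) -> log.add("apply:" + txn);

    // A Follower must persist before applying.
    static Pipeline forFollower() {
        return new Pipeline(SYNC_TO_DISK, APPLY);
    }

    // An Observer has no persistence requirement, so the sync stage is dropped.
    static Pipeline forObserver() {
        return new Pipeline(APPLY);
    }
}
```

Each role picks its chain once at construction time, so no stage needs to ask 
which kind of server it is running in.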

Regarding Observers being able to issue proposals: I don't have a problem with 
that; it should be reasonably easy to add. 



> Observers
> ---------
>
>                 Key: ZOOKEEPER-368
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>         Attachments: ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions according to the order followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload because all servers 
> different from the leader are followers and followers receive such a payload 
> first through a proposal message. Just forwarding commit messages as they 
> currently are to an observer consequently is not sufficient. We have a couple 
> of options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forward 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
