[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865465#action_12865465
 ] 

Dave Wright commented on ZOOKEEPER-704:
---------------------------------------

This is a great idea, but I'm afraid there is a fairly fundamental problem 
with this concept. 
What you want is for the remaining nodes to go into read-only mode when enough 
nodes "go down" that a quorum can't be formed at all.

The problem is that if a partition occurs (say, a single server loses contact 
with the rest of the cluster), but a quorum still exists, we want clients who 
were connected to the partitioned server to re-connect to a server in the 
majority. The current design allows for this by denying connections to minority 
nodes, forcing clients to hunt for the majority. If we allow servers in the 
minority to keep/accept connections, then clients will end up in read-only mode 
when they could have simply reconnected to the majority.
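
To make the partition case concrete, here is a minimal sketch of the majority 
check, assuming a plain majority quorum over a fixed-size ensemble. The 
function names are hypothetical and are not part of ZooKeeper's API.

```python
def has_quorum(reachable_servers: int, ensemble_size: int) -> bool:
    """True if a server that can see `reachable_servers` members
    (counting itself) is part of a strict majority of the ensemble."""
    return reachable_servers > ensemble_size // 2


def accepts_connections(reachable_servers: int, ensemble_size: int) -> bool:
    # Current design as described above: a server in the minority denies
    # client connections, which forces clients to hunt for the majority.
    return has_quorum(reachable_servers, ensemble_size)
```

With a five-server ensemble, a single partitioned server sees only itself and 
denies connections, while the four remaining servers still form a quorum and 
keep accepting them.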

It may be possible to accomplish the desired outcome with some client-side and 
connection protocol changes. Specifically, a flag on the connection request 
from the client that says "allow read-only connections" - if false, the server 
will close the connection, allowing the client to hunt for a server in the 
majority. Once a client has gone through all the servers in the list (and found 
out that none are in the majority) it could flip the flag to true and connect 
to any running server in read-only mode. There is still the question of how to 
get back out of read-only mode (e.g., should we keep hunting in the background 
for a majority, or just wait until the server we are connected to re-forms a 
quorum?).
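
The two-pass hunt proposed above could look roughly like the sketch below. It 
is a toy simulation, not ZooKeeper code: the `Server` class, its `connect` 
method, the `allow_read_only` flag name, and the "read-write" / "read-only" 
mode strings are all assumptions made for illustration. It does not address 
the open question of how to leave read-only mode.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Server:
    """Hypothetical stand-in for one ensemble member."""
    name: str
    running: bool
    in_majority: bool

    def connect(self, allow_read_only: bool) -> Optional[str]:
        """Return the session mode, or None if the connection is closed."""
        if not self.running:
            return None
        if self.in_majority:
            return "read-write"
        # Minority server: only keep the connection if the client opted in.
        return "read-only" if allow_read_only else None


def hunt(servers: Sequence[Server]) -> Optional[Tuple[str, str]]:
    """Try every server with the flag off, then again with the flag on."""
    for allow_read_only in (False, True):
        for server in servers:
            mode = server.connect(allow_read_only)
            if mode is not None:
                return server.name, mode
    return None  # no server is running at all
```

Because the first pass always runs with the flag off, a client only ends up in 
read-only mode after every listed server has refused a normal connection.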

> GSoC 2010: Read-Only Mode
> -------------------------
>
>                 Key: ZOOKEEPER-704
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-704
>             Project: Zookeeper
>          Issue Type: Wish
>            Reporter: Henry Robinson
>
> Read-only mode
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java and TCP/IP networking
> Description
> When a ZooKeeper server loses contact with over half of the other servers in 
> an ensemble ('loses a quorum'), it stops responding to client requests 
> because it cannot guarantee that writes will get processed correctly. For 
> some applications, it would be beneficial if a server still responded to read 
> requests when the quorum is lost, but caused an error condition when a write 
> request was attempted.
> This project would implement a 'read-only' mode for ZooKeeper servers (maybe 
> only for Observers) that allowed read requests to be served as long as the 
> client can contact a server.
> This is a great project for getting really hands-on with the internals of 
> ZooKeeper - you must be comfortable with Java and networking otherwise you'll 
> have a hard time coming up to speed.

