[ https://issues.apache.org/jira/browse/ZOOKEEPER-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865465#action_12865465 ]
Dave Wright commented on ZOOKEEPER-704:
---------------------------------------

This is a great idea, but I'm afraid there is a somewhat fundamental problem with this concept.

What you want is that if enough nodes "go down" that a quorum can't be formed (at all), the remaining nodes go into read-only mode. The problem is that if a partition occurs (say, a single server loses contact with the rest of the cluster) but a quorum still exists, we want clients that were connected to the partitioned server to reconnect to a server in the majority. The current design allows for this by denying connections to minority nodes, forcing clients to hunt for the majority. If we allow servers in the minority to keep or accept connections, then clients will end up in read-only mode when they could have simply reconnected to the majority.

It may be possible to accomplish the desired outcome with some client-side and connection protocol changes. Specifically, a flag on the connection request from the client that says "allow read-only connections": if false, a server that is not part of a quorum will close the connection, allowing the client to hunt for a server in the majority. Once a client has gone through all the servers in the list (and found out that none are in the majority), it could flip the flag to true and connect to any running server in read-only mode.

There is still the question of how to get back out of read-only mode (e.g. should we keep hunting in the background for a majority, or just wait until the server we are connected to re-forms a quorum).
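To make the two-pass hunt concrete, here is a minimal sketch of the client-side logic described above. It is not ZooKeeper code: `ReadOnlyHunt`, `Server`, `ServerState`, `Session`, and `hunt` are hypothetical names invented for illustration, and a server's state is simply given up front rather than discovered through a real connection handshake.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative model of the proposed hunt; every name here is hypothetical. */
class ReadOnlyHunt {

    /** Assumed server states; a real client would only learn these by connecting. */
    enum ServerState { IN_QUORUM, NO_QUORUM, DOWN }

    /** Result of a successful connection attempt. */
    static final class Session {
        final String server;
        final boolean readOnly;

        Session(String server, boolean readOnly) {
            this.server = server;
            this.readOnly = readOnly;
        }
    }

    /** One entry in the client's server list. */
    static final class Server {
        final String name;
        final ServerState state;

        Server(String name, ServerState state) {
            this.name = name;
            this.state = state;
        }

        /**
         * Models the proposed handshake: a server without a quorum accepts
         * the connection only when the client sets the allow-read-only flag.
         */
        Optional<Session> connect(boolean allowReadOnly) {
            switch (state) {
                case IN_QUORUM:
                    return Optional.of(new Session(name, false));
                case NO_QUORUM:
                    return allowReadOnly
                            ? Optional.of(new Session(name, true))
                            : Optional.empty();
                default: // DOWN: nothing to connect to
                    return Optional.empty();
            }
        }
    }

    /**
     * Pass 1 tries every server with the flag off, so a reachable majority
     * always wins. Pass 2 runs only if pass 1 found nothing, and accepts
     * the first running server in read-only mode.
     */
    static Optional<Session> hunt(List<Server> servers) {
        for (boolean allowReadOnly : new boolean[] {false, true}) {
            for (Server server : servers) {
                Optional<Session> session = server.connect(allowReadOnly);
                if (session.isPresent()) {
                    return session;
                }
            }
        }
        return Optional.empty();
    }
}
```

With a list where the first server has lost its quorum and the second is in the majority, `hunt` skips the first and returns a read-write session on the second. A read-only session comes back only when no listed server is in a quorum. The sketch leaves the open question untouched: it does not model how a client leaves read-only mode.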
> GSoC 2010: Read-Only Mode
> -------------------------
>
>                 Key: ZOOKEEPER-704
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-704
>             Project: Zookeeper
>          Issue Type: Wish
>            Reporter: Henry Robinson
>
> Read-only mode
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java and TCP/IP networking
> Description
> When a ZooKeeper server loses contact with over half of the other servers in
> an ensemble ('loses a quorum'), it stops responding to client requests
> because it cannot guarantee that writes will get processed correctly. For
> some applications, it would be beneficial if a server still responded to read
> requests when the quorum is lost, but caused an error condition when a write
> request was attempted.
> This project would implement a 'read-only' mode for ZooKeeper servers (maybe
> only for Observers) that allowed read requests to be served as long as the
> client can contact a server.
> This is a great project for getting really hands-on with the internals of
> ZooKeeper - you must be comfortable with Java and networking otherwise you'll
> have a hard time coming up to speed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.