This is a great FAQ topic!
There are two kinds of connection problems:
1) Disconnections: the callback reports KeeperStateDisconnected. This state
is usually due to a server failure or a transient communication error, and
will hopefully be followed by a reconnected callback. The basic idea is that
while disconnected from ZooKeeper the process will not have a clear idea of
the changes that are happening, so it should be conservative and assume the
worst.
2) Expired session: the callback reports that a problem, usually a network
outage, prevented the client from keeping its session alive, so the session
timed out. This state is not recoverable. It is game over: a new ZooKeeper
object needs to be created, and the state stored in ZooKeeper needs to be
re-queried and set up again.
Here is the best practice for handling these two states:
1) For disconnections, the server should suspend operations that rely on
information in ZooKeeper. For example, a leader should suspend operations that
assume it is the leader. Operations resume once the connection is reestablished.
2) For expired sessions, the server should relinquish any rights it received
from ZooKeeper and rerun the ZooKeeper initialization operations. For example,
a leader will need to give up leadership, create a new ZooKeeper object and
rerun the leader election protocol. Restarting the application is a very easy
way to do this.
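To make the two practices above concrete, here is a minimal sketch in Java.
It does not use the ZooKeeper client library: SessionEvent, ConservativeLeader,
startSession and onElected are all hypothetical names made up for illustration,
standing in for the connection-state callbacks and for whatever leader election
recipe the application uses.

```java
// Hypothetical stand-in for the connection states the client reports
// through its watcher callback (not the real ZooKeeper types).
enum SessionEvent { CONNECTED, DISCONNECTED, EXPIRED }

// Hypothetical leader process that behaves conservatively.
class ConservativeLeader {
    private boolean leader;        // true once an election has been won on this session
    private boolean connected;     // last connection status we were told about
    private int sessionsCreated;   // number of client handles created so far

    ConservativeLeader() {
        startSession();
    }

    // Stand-in for "create a new ZooKeeper object". A fresh session holds
    // no rights, so any earlier leadership is given up here.
    private void startSession() {
        sessionsCreated++;
        connected = false;
        leader = false;
    }

    // Assumed to be called by the application's election code when this
    // process wins leadership on the current session.
    void onElected() {
        leader = true;
    }

    void onEvent(SessionEvent event) {
        switch (event) {
            case CONNECTED:
                connected = true;   // resume suspended operations
                break;
            case DISCONNECTED:
                connected = false;  // suspend, but keep the session and its rights
                break;
            case EXPIRED:
                startSession();     // relinquish everything and start over
                break;
        }
    }

    // Leader-only operations should check this before doing any work.
    boolean mayActAsLeader() {
        return leader && connected;
    }

    int sessionsCreated() {
        return sessionsCreated;
    }
}
```

With this shape, a process that receives DISCONNECTED keeps its leadership but
mayActAsLeader() returns false until CONNECTED arrives again. After EXPIRED it
returns false until an election is won again on the new session.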
Of course there are always exceptions to these practices. For example, given a
leader that is established with ZooKeeper and behaves conservatively by
suspending operations on disconnects, even if a process is disconnected from
ZooKeeper it could still send requests to the leader process. (A partial
network partition may cause one process to not be able to connect to ZooKeeper
and still be able to connect to another process that can connect to ZooKeeper.)
Personally, I would still write my applications to behave conservatively in
these situations, since these kinds of partial partitions are difficult to
----- Original Message ----
From: Anthony Urso <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; firstname.lastname@example.org
Sent: Thursday, July 3, 2008 7:17:32 PM
Subject: [Zookeeper-user] Recipes for dealing with disconnection and connection
Anyone have examples of the right way to deal with ZooKeeper
disconnection or connection expiration?
Currently I am exiting and starting fresh, but hopefully there is a
more efficient pattern.