Hi Ted,

Thank you for the detailed explanation; it clarifies things a lot. My understanding is that there are two types of CONNECTION_LOSS:
1) The client loses its connection to one ZK server and fails over to another one successfully. No big deal: the connection to the (quorum of the) ZK cluster is preserved.

2) The client loses its connection to the (quorum of the) ZK cluster, e.g. C3 as mentioned before. If the situation continues, it will lead to CONNECTION_EXPIRE.

I totally concur that case 1) should be made transparent to the application; if ZK-22 can eliminate it, that will be great.

As for case 2), it seems to me that the application does need to know about it. Per the ZK documentation:

    When you disconnect from a server (for example, when the server fails), you will not get any watches until the connection is reestablished. For this reason session events are sent to all outstanding watch handlers. Use session events to go into a safe mode: you will not be receiving events while disconnected, so your process should act conservatively in that mode.

http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html

But from reading your post, it seems that case 2) will no longer be reported to the application, hence my confusion. In any case, I think the documentation needs to be updated to reflect the latest design/contract change.
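
To make the "safe mode" point from the documentation concrete, here is a rough sketch of what I would expect an application to do with session events. It deliberately does not use the ZooKeeper client classes: SessionState is a hypothetical enum that only mirrors the state names (SyncConnected, Disconnected, Expired), and SafeModeTracker and its methods are names I made up for illustration. In a real client I assume a Watcher would forward its session events to something like this.

```java
// Hypothetical stand-in for the session states a watcher would report.
enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

// Illustrative helper (not part of ZooKeeper): tracks whether the
// application should act conservatively because it may be missing watches.
final class SafeModeTracker {
    // Start in safe mode: nothing is known until the first connect.
    private boolean safeMode = true;
    private boolean expired = false;

    synchronized void onSessionEvent(SessionState state) {
        switch (state) {
            case SYNC_CONNECTED:
                // Watches flow again, unless the session is already gone.
                if (!expired) {
                    safeMode = false;
                }
                break;
            case DISCONNECTED:
                // Case 2) above: no watches arrive while disconnected.
                safeMode = true;
                break;
            case EXPIRED:
                // The session is lost; a new one has to be created.
                safeMode = true;
                expired = true;
                break;
        }
    }

    synchronized boolean isSafeMode() { return safeMode; }

    synchronized boolean needsNewSession() { return expired; }
}

class SafeModeDemo {
    public static void main(String[] args) {
        SafeModeTracker tracker = new SafeModeTracker();
        tracker.onSessionEvent(SessionState.SYNC_CONNECTED);
        tracker.onSessionEvent(SessionState.DISCONNECTED);
        System.out.println("safe mode: " + tracker.isSafeMode());
    }
}
```

The point is only that the tracker can enter safe mode if the Disconnected event is actually delivered to the application, which is why I am asking whether case 2) will still be reported.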