Qian,

That's a good point. For 2), the application certainly needs to know that CONNECTIONLOSS has happened and that it cannot connect to any of the servers. After spending some time on ZOOKEEPER-22, I am a little concerned about how much ZOOKEEPER-22 can do. I will post my thoughts/findings/concerns on the jira, and you are welcome to give your feedback.

Thanks
mahadev

On 1/25/10 7:12 PM, "Qing Yan" <qing...@gmail.com> wrote:

> Hi Ted,
>
> Thank you for the detailed explanation, it clarifies things a lot. My
> understanding is that there are two types of CONNECTION_LOSS:
>
> 1) Lose connection with one ZK server, fail over to another one successfully.
> No big deal, connection to the (quorum of) ZK cluster is preserved.
>
> 2) Lose connection with the (quorum of) ZK cluster, e.g. C3 as mentioned
> before. If the situation continues, it will lead to CONNECTION_EXPIRE.
>
> I totally concur that case 1) should be made transparent to the application;
> if ZK-22 can eliminate this, it will be great.
>
> About case 2), it seems to me there is indeed a need for the application to
> know about this, per the ZK documentation:
>
> When you disconnect from a server (for example, when the server fails), you
> will not get any watches until the connection is reestablished. For this
> reason session events are sent to all outstanding watch handlers. Use
> session events to go into a safe mode: you will not be receiving events
> while disconnected, so your process should act conservatively in that mode.
>
> http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html
>
> But from reading your post, it seems that case 2) will no longer be reported
> to the application, hence the confusion.
>
> In any case, I think the documentation needs to be updated to reflect the
> latest design/contract change.
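
For reference, the "safe mode" behaviour described in the quoted documentation could be sketched roughly as below. This is only an illustration under stated assumptions, not the ZooKeeper client API: `SessionState` is a hypothetical stand-in for the session events a watcher receives (disconnected, reconnected, expired), and `SafeModeTracker` is a made-up helper name. A real application would translate the events delivered to its watcher into calls like these.

```java
// Hypothetical stand-in for the session events a ZooKeeper watcher receives.
// Defined locally so the sketch compiles without the ZooKeeper client jar.
enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

// Hypothetical helper: tracks whether the process should act conservatively.
final class SafeModeTracker {
    // Start in safe mode: until the first connect we have no view of the cluster.
    private boolean safeMode = true;
    // Becomes false once the session has expired; a new session is then required.
    private boolean sessionValid = true;

    void onSessionEvent(SessionState state) {
        switch (state) {
            case SYNC_CONNECTED:
                // Connected (or reconnected) to a server: watches are delivered again.
                safeMode = false;
                break;
            case DISCONNECTED:
                // No watches arrive while disconnected, so act conservatively.
                safeMode = true;
                break;
            case EXPIRED:
                // Session is gone: stay in safe mode until a new session is created.
                safeMode = true;
                sessionValid = false;
                break;
        }
    }

    boolean inSafeMode() { return safeMode; }

    boolean needsNewSession() { return !sessionValid; }
}
```

In terms of the two cases above, a quick failover in case 1) would show up as a short DISCONNECTED followed by SYNC_CONNECTED, while case 2) would keep the tracker in safe mode until either a reconnect or an EXPIRED event arrives.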