Re: Using zookeeper to assign a bunch of long-running tasks to nodes (without unhandled tasks and double-handled tasks)

Qing Yan Mon, 25 Jan 2010 19:13:24 -0800

Hi Ted,

  Thank you for the detailed explanation, it clarifies things a lot. My
understanding is that there are two types of CONNECTION_LOSS:


1) Lose connection with one ZK server, fail over to another one successfully.
No big deal, connection to the (quorum of) ZK cluster is preserved.

2) Lose connection with the (quorum of) ZK cluster, e.g. C3 as mentioned
before. If the situation continues, it will lead to CONNECTION_EXPIRE.

I totally concur that case 1) should be made transparent to the application.
If ZK-22 can eliminate this, that will be great.

About case 2), it seems to me the application does indeed need to know
about this, per the ZK documentation:

When you disconnect from a server (for example, when the server fails), you
will not get any watches until the connection is reestablished. For this
reason session events are sent to all outstanding watch handlers. Use
session events to go into a safe mode: you will not be receiving events
while disconnected, so your process should act conservatively in that mode.

http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html
But from reading your post, it seems that case 2) will no longer be reported
to the application, hence the confusion.
In any case, I think the documentation needs to be updated to reflect the
latest design/contract change.
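
To make concrete what I mean by the "safe mode" in the quoted documentation, here is a rough sketch of how I picture the application side. It is only an illustration under my own assumptions: SessionState and SafeModeTracker are hypothetical names, not the ZooKeeper client API, and the three states merely stand in for the connected, disconnected and expired session events a watch handler would see.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the session events a watch handler receives; not the real client API. */
enum SessionState { CONNECTED, DISCONNECTED, EXPIRED }

/** Tracks whether the application may act on the tasks it believes it owns. */
class SafeModeTracker {
    // Until the first CONNECTED event arrives, nothing is known, so start conservatively.
    private SessionState state = SessionState.DISCONNECTED;
    private final List<String> actions = new ArrayList<>();

    /** Intended to be called from the application's watch handler for every session event. */
    synchronized void onSessionEvent(SessionState newState) {
        if (state == SessionState.EXPIRED) {
            return; // an expired session does not come back; a new session would be needed
        }
        state = newState;
        switch (newState) {
            case CONNECTED:
                actions.add("resume");        // case 1): failover succeeded, carry on
                break;
            case DISCONNECTED:
                actions.add("safe-mode");     // case 2): no watches will arrive, act conservatively
                break;
            case EXPIRED:
                actions.add("abandon-tasks"); // case 2) persisted: ownership can no longer be assumed
                break;
        }
    }

    /** True only while connected; the application should pause task work otherwise. */
    synchronized boolean mayActOnTasks() { return state == SessionState.CONNECTED; }

    /** True once the session has expired and task ownership must be given up. */
    synchronized boolean mustReleaseTasks() { return state == SessionState.EXPIRED; }

    /** Copy of the actions taken so far, in order. */
    synchronized List<String> history() { return new ArrayList<>(actions); }
}
```

If case 2) is never reported to the application, a tracker like this can never enter the safe-mode branch, which is the gap I am asking about.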
