oops, sorry camille, i didn't mean to replicate your answer. you
explained it better than me :)
On 11/18/2010 10:06 AM, Fournier, Camille F. [Tech] wrote:
This is exactly the scenario that you use to test session expiration: make one
connection to ZK and then another with the same session ID and password, then
close the second connection, which causes the first to expire. It is only a
clean close that will cause this to happen, though (one where the client calls
close to end the connection).
Right now, if you have a partition between client and server A, I would not
expect server A to see a clean close from the client, but one of the various
exceptions that cause the socket to close. These do not do anything currently
to change the state of the session, and if the client connects elsewhere before
the session timeout, the session will remain active.
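To make the distinction above concrete, here is a minimal toy model of the
behaviour as described: a socket error leaves the session alone, a clean close
ends it at once, and otherwise only the timeout expires it. This is a sketch
under stated assumptions, not ZooKeeper's server code; SessionTracker and all
of its methods are hypothetical names, and time is a plain logical counter.

```python
# Toy model only: not the ZooKeeper server implementation or any real client API.
class SessionTracker:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # session id -> logical time of last contact

    def touch(self, session_id, now):
        """Client connected or sent a heartbeat through any server."""
        self.last_seen[session_id] = now

    def socket_error(self, session_id, now):
        """Socket dropped without a close request: session state is unchanged."""
        pass

    def clean_close(self, session_id, now):
        """Client explicitly called close: the session ends immediately."""
        self.last_seen.pop(session_id, None)

    def is_active(self, session_id, now):
        seen = self.last_seen.get(session_id)
        return seen is not None and now - seen < self.timeout


def demo():
    tracker = SessionTracker(timeout=10)

    # Partitioned client that reconnects elsewhere before the timeout.
    tracker.touch("s1", now=0)
    tracker.socket_error("s1", now=3)
    tracker.touch("s1", now=5)
    survived = tracker.is_active("s1", now=12)

    # Partitioned client that never reconnects.
    tracker.touch("s2", now=0)
    tracker.socket_error("s2", now=3)
    timed_out = not tracker.is_active("s2", now=12)

    # Client that closes cleanly.
    tracker.touch("s3", now=0)
    tracker.clean_close("s3", now=1)
    closed = not tracker.is_active("s3", now=2)

    return survived, timed_out, closed
```

In this model the first session is still active at time 12 because its last
contact was at time 5, while the other two are gone for different reasons.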
From: Gustavo Niemeyer [mailto:gust...@niemeyer.net]
Sent: Thursday, November 18, 2010 10:16 AM
To: ZooKeeper Users
Subject: How to reestablish a session
As some of you already know, we've been using ZooKeeper at Canonical
for a project we've been pushing (Ensemble, http://j.mp/dql6Fu).
We've already written txzookeeper (http://j.mp/d3Zx7z), to
integrate the Python bindings with Twisted, and we're also in the
process of creating a Go binding for the C ZooKeeper library (to be
Yesterday, while working on the Go bindings, a test made me wonder
about what's the correct way to reestablish a session with ZooKeeper.
In another thread a couple of months ago, Ben mentioned:
i'm a bit skeptical that this is going to work out properly. a server may
receive a socket reset even though the client is still alive:
1) client sends a request to a server
2) client is partitioned from the server
3) server starts trying to send response
4) client reconnects to a different server
5) partition heals
6) server gets a reset from client
at step 6 i don't think you want to delete the ephemeral nodes.
I also don't think it should delete ephemeral nodes. While performing
some tests, though, I noticed that something similar to this may
The following sequence was performed in the test:
1) Establish connection A to ZK
2) Create an ephemeral node with A
3) Establish connection B to ZK, reusing the session from A
4) Close connection A
5) The ephemeral node from (2) got deleted.
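The sequence above can be sketched with a small in-memory model, assuming (as
observed in the test) that a clean close on any connection ends the session it
shares. ToyEnsemble, ToyConnection and the "/lock" path are hypothetical and
only illustrate the observed behaviour; they are not the ZooKeeper API.

```python
# Toy model only: names are hypothetical, not the ZooKeeper API.
class ToyEnsemble:
    def __init__(self):
        self.ephemerals = {}  # session id -> set of ephemeral node paths
        self.next_id = 1

    def connect(self, session_id=None):
        """Open a connection; pass an existing session id to reuse that session."""
        if session_id is None:
            session_id = self.next_id
            self.next_id += 1
            self.ephemerals[session_id] = set()
        elif session_id not in self.ephemerals:
            raise RuntimeError("session expired")
        return ToyConnection(self, session_id)

    def exists(self, path):
        return any(path in nodes for nodes in self.ephemerals.values())


class ToyConnection:
    def __init__(self, ensemble, session_id):
        self.ensemble = ensemble
        self.session_id = session_id

    def create_ephemeral(self, path):
        self.ensemble.ephemerals[self.session_id].add(path)

    def close(self):
        """Clean close: ends the whole session, not just this connection."""
        self.ensemble.ephemerals.pop(self.session_id, None)

    def session_alive(self):
        return self.session_id in self.ensemble.ephemerals


def run_sequence():
    zk = ToyEnsemble()
    a = zk.connect()                          # 1) establish connection A
    a.create_ephemeral("/lock")               # 2) create an ephemeral node with A
    b = zk.connect(session_id=a.session_id)   # 3) connection B reuses A's session
    a.close()                                 # 4) close connection A
    return zk.exists("/lock"), b.session_alive()  # 5) node and session state
```

After step 4 the model reports that the node no longer exists and that B's
session is gone as well, which is the hazard described in the test.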
So, this made me wonder what the proper way is to reestablish a
session in practice after a partition. Imagine that the
reconnection which happened on (3) was an attempt from the client to
restore communication with the ZK cluster when faced with
partitioning. Once the connection succeeds, the old resources from
connection A should be disposed of, but how can this be done without
risking killing the healthy connection on B (imagine that the network
comes back between (3) and (4))?
Does anyone have thoughts on that?