BTW one other observation: when I use 3 clients in the same JVM (i.e. 3 separate instances of ZooKeeper, to try to simulate a set of different processes), each client receives an initial WatchEvent on startup. From that point on, only the first 2 clients receive further watch events for the connection starting/stopping, despite me closing the server down, waiting a while, restarting the server, then stopping it again etc.
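For anyone wanting to check the same thing, here is a minimal sketch of the kind of per-client bookkeeping that makes a quiet client stand out. It is not the WriteLockTest code and it does not touch the ZooKeeper API at all; `WatchEventTally` and its methods are made-up names, and the assumption is that each client's watcher callback calls `record(...)` with its own client name. The `main` method just replays the pattern described above with simulated events.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts connection events per client (not part of ZooKeeper). */
class WatchEventTally {
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    /** Register a client up front so it is reported even if it never sees an event. */
    void register(String clientName) {
        counts.putIfAbsent(clientName, new AtomicInteger());
    }

    /** Assumed to be called from each client's watcher callback. */
    void record(String clientName) {
        counts.computeIfAbsent(clientName, k -> new AtomicInteger()).incrementAndGet();
    }

    /** Number of events recorded for a client; 0 if the client is unknown. */
    int count(String clientName) {
        AtomicInteger c = counts.get(clientName);
        return c == null ? 0 : c.get();
    }

    /** Names of clients that saw fewer than the expected number of events, sorted. */
    List<String> clientsBelow(int expected) {
        List<String> quiet = new ArrayList<>();
        for (Map.Entry<String, AtomicInteger> e : counts.entrySet()) {
            if (e.getValue().get() < expected) {
                quiet.add(e.getKey());
            }
        }
        Collections.sort(quiet);
        return quiet;
    }

    public static void main(String[] args) {
        WatchEventTally tally = new WatchEventTally();
        String[] clients = {"client-1", "client-2", "client-3"};

        // Every client gets its initial event on startup.
        for (String c : clients) {
            tally.register(c);
            tally.record(c);
        }

        // Simulated: two server stop/start cycles, seen only by the first two clients.
        for (int i = 0; i < 4; i++) {
            tally.record("client-1");
            tally.record("client-2");
        }

        System.out.println("Clients missing events: " + tally.clientsBelow(5));
    }
}
```

In a real test the simulated loop would go away and the counts would come from the actual watcher callbacks; the point is only to make it obvious which client stopped hearing about connection changes.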
I'm wondering if this is related to why the 3rd client seems to kinda lock up; that it's losing connection watch events. There's nothing hard coded somewhere that only allows 2 ZooKeeper clients per JVM or anything, is there? :) I'm gonna have a look around and see if there are any nasty static variables around or something...

We could maybe do with some more tests for multiple clients with failover etc. Anyone else seen something like this?

2008/7/23 James Strachan <[EMAIL PROTECTED]>:
> 2008/7/22 Flavio Junqueira <[EMAIL PROTECTED]>:
>> James, I'd like to clarify what exactly is the issue you're looking at. If
>> you provide a list of ZooKeeper servers, then a client will try to reconnect
>> to another ZooKeeper server upon a disconnection. Reconnecting to another
>> server does not guarantee maintaining the same session, though. So, are you
>> trying to guarantee that the session is still the same upon a reconnection?
>> If so, I don't think you can do it by just changing the client, since the
>> servers might have expired the old session.
>
> I'm trying to test the WriteLock implementation in the case where the
> server dies and the client reconnects to another server.
> In the test case I'm just running one server, killing it, restarting
> it and trying to get the client to reconnect.
>
> The test case is WriteLockTest in this patch...
> https://issues.apache.org/jira/browse/ZOOKEEPER-78
>
> (unfortunately it's not been committed yet so I can't easily point you
> at the code). It's very easy to run the test with different numbers of
> clients and see lockups at various places.
>
> The bizarre thing I've seen is that things do reconnect mostly fine
> (apart from the SessionExpiredException issue in one of the clients)
> https://issues.apache.org/jira/browse/ZOOKEEPER-84
>
> but a lockup often happens when trying to close down the ZooKeeper instance.
>
> When running the test case with 3 independent clients and one server,
> I tend to see the last client having a session expired and it's often
> the one that locks up; but when running the test with more clients I
> see more lockups elsewhere.
>
> I just wondered if folks had seen similar lockups when you try
> restarting ZK servers?
>
> (I'm testing on OS X; this lockup could be timing related maybe).
>
> --
> James
> -------
> http://macstrac.blogspot.com/
>
> Open Source Integration
> http://open.iona.com
>

--
James
-------
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com