james strachan commented on ZOOKEEPER-63:

So this patch does not attempt to fix the race condition problem; apologies if 
I gave that impression :)

What it does do, though, is act as a workaround: if a client is not able to 
properly send a disconnect packet to the server for *any reason at all*, such as

* a hung socket (which can be quite common)
* no servers available
* a race condition in the ZK client code of some kind (which we definitely have)

then the client application does not hang forever, as it's trying to close and 
shut down anyway :). So it's a side benefit that it acts as a band-aid until 
someone fixes all the possible race conditions and potential socket hangs.
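
To make the idea concrete, here is a minimal sketch of a close that gives up 
after a timeout. It is not the attached patch and does not use the real 
ZooKeeper client internals: `BoundedClose`, `DisconnectAttempt` and 
`closeWithTimeout` are hypothetical names, and the attempt passed in stands in 
for "send the disconnect packet and wait for the reply".

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedClose {

    /** Hypothetical stand-in for the blocking "send disconnect, await reply" step. */
    public interface DisconnectAttempt {
        void run() throws Exception;
    }

    /**
     * Tries a clean disconnect but waits at most timeoutMillis for it.
     * Returns true if the attempt finished cleanly, false if it failed,
     * timed out, or the caller was interrupted. Either way the caller is
     * free to go on and release its local resources (socket, threads).
     */
    public static boolean closeWithTimeout(DisconnectAttempt attempt, long timeoutMillis) {
        // Daemon thread, so a permanently hung attempt cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true);
            return t;
        });
        Future<?> pending = executor.submit(() -> {
            attempt.run();
            return null;
        });
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // Could not disconnect cleanly in time; the server will have to
            // expire the session on its own.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pending.cancel(true);   // interrupt the attempt if it is still running
            executor.shutdownNow();
        }
    }
}
```

For example, `BoundedClose.closeWithTimeout(() -> Thread.sleep(60_000), 100)` 
returns false after roughly 100 ms instead of blocking for a minute, and the 
caller can then close the socket locally.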

Let me put it another way. Given that the client is closing, is it really 
correct to leave it potentially hanging around forever just because it cannot 
be sure if the disconnect packet was received and properly processed by the 
server? If the socket is dead before the call to close(), is it really correct 
to block until a connection can be re-established, just so it can be properly 
closed - when the code will effectively close the hung socket without sending a 
disconnect packet anyway :) ? 

The server has to detect and time out failed sessions whether or not it receives 
an explicit disconnect packet (as a process could just hang). So do we 
really need to be super strict on the client side, forcing clients to block 
when trying to shut down if they can't do so cleanly within some time period?

I totally agree that we should fix the race condition though :). I just wanted 
a workaround to avoid my ZK test cases hanging forever due to the race 
condition :) 

> Race condition in client close() operation
> ------------------------------------------
>                 Key: ZOOKEEPER-63
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-63
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: java client
>            Reporter: Patrick Hunt
>            Assignee: Benjamin Reed
>         Attachments: patch_ZOOKEEPER-63.patch
> There is a race condition in the java close operation on ZooKeeper.java.
> Client is sending a disconnect request to the server. Server will close any 
> open connections with the client when it receives this. If the client has not 
> yet shut down its subthreads (event/send threads for example) these threads 
> may consider the condition an error. We see this a lot in the tests where the 
> clients output error logs because they are unaware that a disconnection has 
> been requested by the client.
> Ben mentioned: perhaps we just have to change state to closed (on client) 
> before sending disconnect request.
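
Ben's suggestion above could look roughly like the following. This is only a 
sketch under assumed names (`ClosingStateSketch`, `onConnectionLost`), not the 
ZooKeeper client code. The point is just that the closing flag is set before 
the disconnect request goes out, so a subthread that sees the connection drop 
can tell that the drop was requested.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ClosingStateSketch {

    // Shared between the thread calling close() and the send/event threads.
    private final AtomicBoolean closing = new AtomicBoolean(false);

    /**
     * Marks the client as closing BEFORE the disconnect request is sent.
     * sendDisconnect is a hypothetical stand-in for the real request.
     */
    public void close(Runnable sendDisconnect) {
        closing.set(true);
        sendDisconnect.run();
    }

    /**
     * What a send/event thread would consult when the server drops the
     * connection: "EXPECTED" if a close is in progress, "ERROR" otherwise.
     */
    public String onConnectionLost() {
        return closing.get() ? "EXPECTED" : "ERROR";
    }
}
```

With the order reversed (send first, then set the flag), a subthread could 
observe the dropped connection while the flag is still false and log an error, 
which is the behaviour described in the issue.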
