[ https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932973#action_12932973 ]
Camille Fournier commented on ZOOKEEPER-922:
--------------------------------------------

> From my point of view, a solution that enables faster expiration but disables
> clients moving sessions to other servers is not a solution I would use. I am
> not willing to take the massive hit of restarting possibly huge numbers of
> sessions in the case of a single node failure. I expect the case where a
> disconnect happens and the client is actually still alive to be vanishingly
> rare. My clients will die all the time, my ensemble members might die
> occasionally, if a switch dies, there are much bigger problems than some
> overaggressive session expiration.

So, Server A has a connection with the client. The switch between client and A
dies and both see an error disconnect.

Possible operations (in some order) after this point:
  - A sends a ping on that session with a lower session timeout
  - Client connects to B, which will touch the session table with the
    negotiated session timeout
  - Client starts heartbeating

Scenarios:

1) If A sends the ping with the lower session timeout, and the client cannot
connect to B before the session expires, the session is expired and no harm no
foul in my opinion. Sessions expiring due to lag on failover are a possibility
that anyone using ZK should be defensively programming against.

2) Due to a lag on A's part, it did not send the timeout-lowering ping until
after the client had connected to B. Client's session timeout is set lower
until it heartbeats to B and B pings the leader. Or, the client might not
respond to the heartbeat in this sensitive interval, causing it to have its
session disconnected. This could quite possibly be solved by actually checking
that a ping is coming from the current owner of a session if it is trying to
set the timeout lower than the current timeout. The session tracker has the
current owner stored.
I wouldn't want to have to check this on every ping, but it would be quite
easy to add back the logic that checks whether the new timeout is lower than
the existing timeout and, only in that case, checks whether the pinger is the
current owner. That might require code changes we don't want to make, but it
seems possible. Alternatively, the session just unexpectedly times out. I'm
writing defensive code against all possible failures of ZK, so a session
timeout is not a huge deal to me.

3) A pings the leader during the client connection negotiation with B. I
suspect there are several possible interleavings here. I would also expect
that, again, the worst case should be that the client sees a session expired
error. This is the area to dig into more carefully. If there is an
interleaving that could leave the session open forever, or cause ensemble
instability, that would probably be a deal-breaker.

> enable faster timeout of sessions in case of unexpected socket disconnect
> -------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-922
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-922
>             Project: Zookeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-922.patch
>
>
> In the case when a client connection is closed due to socket error instead of
> the client calling close explicitly, it would be nice to enable the session
> associated with that client to time out faster than the negotiated session
> timeout. This would enable a zookeeper ensemble that is acting as a dynamic
> discovery provider to remove ephemeral nodes for crashed clients quickly,
> while allowing for a longer heartbeat-based timeout for java clients that
> need to do long stop-the-world GC.
> I propose doing this by setting the timeout associated with the crashed
> session to "minSessionTimeout".

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
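
For reference, the owner check floated under scenario 2 could look roughly
like the Java sketch below. It is only an illustration of the idea, not the
attached patch and not ZooKeeper's session tracker: the class
OwnerCheckedSessionTable, its connect and ping methods, and the numeric server
ids are all hypothetical names made up for this example. The point it shows is
that the owner lookup is only consulted when a ping tries to lower the
timeout, so ordinary pings do not pay for the check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified session table; NOT ZooKeeper's real SessionTracker.
class OwnerCheckedSessionTable {

    // Per-session state: which server currently owns it and its timeout.
    private static final class Session {
        long ownerServerId;
        int timeoutMs;

        Session(long ownerServerId, int timeoutMs) {
            this.ownerServerId = ownerServerId;
            this.timeoutMs = timeoutMs;
        }
    }

    private final Map<Long, Session> sessions = new HashMap<>();

    // A client (re)connects through serverId: that server becomes the owner
    // and the negotiated timeout is (re)applied.
    public synchronized void connect(long sessionId, long serverId, int negotiatedTimeoutMs) {
        sessions.put(sessionId, new Session(serverId, negotiatedTimeoutMs));
    }

    // A server forwards a ping for a session, possibly asking for a new timeout.
    // Returns the timeout in effect afterwards, or -1 if the session is unknown.
    public synchronized int ping(long sessionId, long serverId, int requestedTimeoutMs) {
        Session s = sessions.get(sessionId);
        if (s == null) {
            return -1;
        }
        // Only look at the owner when the ping would LOWER the timeout.
        boolean lowering = requestedTimeoutMs < s.timeoutMs;
        if (lowering && s.ownerServerId != serverId) {
            // Stale timeout-lowering ping from a previous owner: ignore it.
            return s.timeoutMs;
        }
        s.timeoutMs = requestedTimeoutMs;
        return s.timeoutMs;
    }
}
```

With this shape, a late timeout-lowering ping from A that arrives after the
client has already connected to B is dropped, while B, as the current owner,
can still lower the timeout if the client's connection to B later fails.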