[ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762713#action_12762713 ]
Christian Wiedmann commented on ZOOKEEPER-542:
----------------------------------------------

I don't really know how to do an automated test for this, since the spinning is not visible outside of the API. The manual test I used is to kill -STOP the server and then wait until the client tries to reconnect while running strace on the I/O thread (I'm using python bindings, btw). Pre-patch the strace shows repeated calls to poll, with POLLOUT set on the server fd. Post-patch, POLLOUT is not set, and there is no spinning.

> c-client can spin when server unresponsive
> ------------------------------------------
>
>                 Key: ZOOKEEPER-542
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.2.0, 3.2.1
>            Reporter: Christian Wiedmann
>            Assignee: Christian Wiedmann
>             Fix For: 3.3.0
>
>         Attachments: ZOOKEEPER-542.patch, ZOOKEEPER-542.patch
>
>
> Due to a mismatch between zookeeper_interest() and zookeeper_process(), when
> the zookeeper server is unresponsive the client can spin when reconnecting to
> the server.
> In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is
> data to be sent, but flush_send_queue() only writes the data if the state is
> ZOO_CONNECTED_STATE. When in ZOO_ASSOCIATING_STATE, this results in spinning.
> This probably doesn't affect production, but I had a runaway process in a
> development deployment that caused performance issues on the node. This is
> easy to reproduce in a single node environment by doing a kill -STOP on the
> server and waiting for the session timeout.
> Patch to be added.
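
The mismatch described in the issue can be modeled in a few lines of Python. This is only a sketch, not the C client code and not the attached patch: `interest()`, `count_wakeups()`, and the `patched` flag are hypothetical names, the state constants are arbitrary stand-ins, and the "patched" behavior (request write interest only when connected) is an assumption based on the comment that POLLOUT is no longer set post-patch. A local socket pair plays the role of the server connection: its send buffer is empty, so it is always writable, which is what makes poll return immediately when POLLOUT is requested.

```python
import select
import socket

# Hypothetical stand-ins for the C client's states; the names mirror the
# issue text, the values are arbitrary.
ASSOCIATING = "ZOO_ASSOCIATING_STATE"
CONNECTED = "ZOO_CONNECTED_STATE"


def interest(state, has_pending, patched):
    """Return the poll event mask the client asks for."""
    events = select.POLLIN
    # Unpatched: ask for write readiness whenever data is queued.
    # Patched (assumed fix): only ask when the data could actually be sent.
    if has_pending and (not patched or state == CONNECTED):
        events |= select.POLLOUT
    return events


def count_wakeups(state, patched, rounds=5, timeout_ms=20):
    """Poll a writable socket `rounds` times and count non-timeout returns."""
    client, server = socket.socketpair()
    try:
        pending = [b"queued request"]  # data waiting in the send queue
        wakeups = 0
        for _ in range(rounds):
            poller = select.poll()
            poller.register(client.fileno(),
                            interest(state, bool(pending), patched))
            if poller.poll(timeout_ms):
                wakeups += 1
                # Modeled on the description of flush_send_queue():
                # the queue is only drained in the connected state.
                if state == CONNECTED and pending:
                    client.sendall(pending.pop())
        return wakeups
    finally:
        client.close()
        server.close()
```

Calling `count_wakeups(ASSOCIATING, patched=False)` wakes up on every round because the socket is writable but nothing is ever written, which corresponds to the repeated poll calls seen in strace. With `patched=True` each poll waits out its timeout instead.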