Chris Darroch updated ZOOKEEPER-320:
Attachment: (was: ZOOKEEPER-320-319.patch)
> call auth completion in free_completions()
> Key: ZOOKEEPER-320
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
> Project: Zookeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.0.0, 3.0.1, 3.1.0
> Reporter: Chris Darroch
> Fix For: 3.1.1, 3.2.0
> Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
> If a client calls zoo_add_auth() with an invalid scheme (e.g., "foo"), the
> ZooKeeper server will mark their session expired and close the connection.
> However, the C client will already have returned ZOK, immediately after
> queuing the new auth data to be sent.
> If the client then waits for their auth completion function to be called,
> they can wait forever, as no session event is ever delivered to that
> completion function. All other completion functions are notified of session
> events by free_completions(), which is called by cleanup_bufs() in
> handle_error() in handle_socket_error_msg().
> In actual fact, what can happen (about 50% of the time, for me) is that the
> next call by the IO thread to flush_send_queue() calls send() from within
> send_buffer(), and receives a SIGPIPE signal during this send() call.
> Because the ZooKeeper C API is a library, it properly does not catch that
> signal. If the user's code is not catching that signal either, they
> experience an abort caused by an untrapped signal. If they are ignoring the
> signal -- which is common in the context I'm working in, the Apache httpd server
> -- then flush_send_queue()'s error return code is EPIPE, which is logged by
> handle_socket_error_msg(), and all non-auth completion functions are notified
> of a session event. However, if the caller is waiting for their auth
> completion function, they wait forever while the IO thread tries repeatedly
> to reconnect and is rejected by the server as having an expired session.
> So, first of all, it would be useful to document in the C API portion of the
> programmer's guide that trapping or ignoring SIGPIPE is important, as this
> signal may be generated by the C API.
> Next, the two attached patches call the auth completion function, if any, in
> free_completions(), which fixes this problem for me. The second attached
> patch includes auth lock/unlock functions, as per ZOOKEEPER-319.
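The idea behind the fix can be sketched in isolation. This is not the attached patch and does not reproduce the client's internal data structures: struct pending_auth, notify_pending_auth() and record_rc() are hypothetical names, and the reason code -112 is an assumption chosen to resemble a session-expired error. The sketch only shows a cleanup step that delivers a failure to a waiting auth completion exactly once.

```c
#include <stddef.h>

/* Same shape as a completion callback taking a return code and user data. */
typedef void (*auth_completion_fn)(int rc, const void *data);

/* Hypothetical stand-in for whatever state tracks a pending auth request. */
struct pending_auth {
    auth_completion_fn completion;
    const void *data;
};

/* Assumed reason code for illustration only. */
enum { SKETCH_SESSION_EXPIRED = -112 };

/* Hypothetical cleanup step: if an auth completion is still waiting,
 * call it with the given reason and clear it so it cannot fire twice.
 * Returns 1 if a callback was invoked, 0 otherwise. */
int notify_pending_auth(struct pending_auth *auth, int reason)
{
    if (auth == NULL || auth->completion == NULL)
        return 0;
    auth_completion_fn fn = auth->completion;
    auth->completion = NULL;
    fn(reason, auth->data);
    return 1;
}

/* Example callback: stores the return code in the int that data points to. */
void record_rc(int rc, const void *data)
{
    int *slot = (int *)data;   /* caller passes a pointer to a mutable int */
    *slot = rc;
}
```

A caller blocked on its auth completion would then be woken with an error code during cleanup instead of waiting indefinitely. Clearing the callback pointer before invoking it is one simple way to keep the notification from being delivered twice if cleanup runs again.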