[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Darroch updated ZOOKEEPER-262:
------------------------------------

    Attachment: zookeeper-close.patch

> unnecesssarily complex reentrant zookeeper_close() logic
> --------------------------------------------------------
>
>                 Key: ZOOKEEPER-262
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-262
>             Project: Zookeeper
>          Issue Type: Improvement
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.2.0, 4.0.0
>            Reporter: Chris Darroch
>            Priority: Minor
>             Fix For: 3.1.0, 3.2.0, 4.0.0
>
>         Attachments: zookeeper-close.patch
>
>
> While working on a wrapper for the C API I puzzled over the problem of how to 
> determine when the multi-threaded adaptor's IO and completion threads had 
> exited.  Looking at the code in api_epilog() and adaptor_finish() it seemed 
> clear that any thread could be the "last one out the door", and whichever was 
> last would "turn out the lights" by calling zookeeper_close().
> However, on further examination I found that in fact, the close_requested 
> flag guards entry to zookeeper_close() in api_epilog(), and close_requested 
> can only be set non-zero within zookeeper_close().   Thus, only the user's 
> main thread can invoke zookeeper_close() and kick off the shutdown process.  
> When that happens, zookeeper_close() then invokes adaptor_finish() and 
> returns ZOK immediately afterward.
> Since adaptor_finish() is only called in this one context, it means all the 
> code in that function to check pthread_self() and call pthread_detach() if 
> the current thread is the IO or completion thread is redundant.  The 
> adaptor_finish() function always signals and then waits to join with the IO 
> and completion threads because it can only be called by the user's main 
> thread.
> After joining with the two internal threads, adaptor_finish() calls 
> api_epilog(), which might seem like a trivial final action.  However, this is 
> actually where all the work gets done, because in this one case, api_epilog() 
> sees a non-zero close_requested flag value and invokes zookeeper_close().  
> Note that zookeeper_close() is already on the stack; this is a re-entrant 
> invocation.
> This time around, zookeeper_close() skips the call to adaptor_finish() -- 
> assuming the reference count has been properly decremented to zero! -- and 
> does the actual final cleanup steps, including deallocating the zh structure. 
>  Fortunately, none of the callers on the stack (api_epilog(), 
> adaptor_finish(), and the first zookeeper_close()) touches zh after this.
> This all works OK, and in particular, the fact that I can be certain that the 
> IO and completion threads have exited after zookeeper_close() returns is 
> great.  So too is the fact that those threads can't invoke zookeeper_close() 
> without my knowing about it.
> However, the actual mechanics of the shutdown seem unnecessarily complex.  
> I'd be worried a bit about a new maintainer looking at adaptor_finish() and 
> reasonably concluding that it can be called by any thread, including the IO 
> and completion ones.  Or thinking that the zh handle can still be used after 
> that innocuous-looking call to adaptor_finish() in zookeeper_close() -- the 
> one that actually causes all the work to be done and the handle to be 
> deallocated!
> I'll attach a patch which I think simplifies the code a bit and makes the 
> shutdown mechanics a little more clear, and might prevent unintentional 
> errors in the future.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to