Kapil Thangavelu commented on ZOOKEEPER-763:
The issue with the example I sent is that when the condition notify happens in
the Python callback, the main thread of the process can start running before
the callback has exited, while the completion thread is still running. The race
could probably be made more explicit for reproduction by inserting a
time.sleep(1) line into the callback after the notify.
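To illustrate the timing described above, here is a minimal sketch that uses only the standard threading module. It is not the attached deadlock.py and it does not call zkpython: completion_callback is a hypothetical stand-in for the Python callback run by the C completion thread, and the sleep after the notify is the delay suggested above (shortened here). It only shows that the main thread resumes while the callback thread is still alive; it does not reproduce the hang itself.

```python
import threading
import time


def run_demo(callback_tail_seconds=0.2):
    """Return True if the main thread resumed while the callback was still running."""
    cond = threading.Condition()
    state = {"done": False}

    def completion_callback():
        # Hypothetical stand-in for the Python callback that the
        # completion thread invokes when an async request finishes.
        with cond:
            state["done"] = True
            cond.notify()
        # Delay after the notify, as suggested in the comment, so the
        # callback is still executing when the main thread wakes up.
        time.sleep(callback_tail_seconds)

    completion_thread = threading.Thread(target=completion_callback)
    completion_thread.start()

    # Main thread: wait until the callback signals that the result arrived.
    with cond:
        while not state["done"]:
            cond.wait()

    # The main thread is running again. In the real client this is the
    # point where close() would be called on the handle.
    still_running = completion_thread.is_alive()

    completion_thread.join()
    return still_running


if __name__ == "__main__":
    print("callback still running when main resumed:", run_demo())
```

In this pure-Python simulation the final join always returns, because nothing blocks the callback thread from finishing.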
This is the stack trace for the completion thread on deadlock.
#0 0x00cb8422 in __kernel_vsyscall ()
#1 0x00387245 in sem_wait@@GLIBC_2.1 () from /lib/tls/i686/cmov/libpthread.so.0
#2 0x0810abe8 in PyThread_acquire_lock ()
#3 0x080dcc11 in PyEval_EvalFrameEx ()
#4 0x080e2807 in PyEval_EvalCodeEx ()
#5 0x080e0c8b in PyEval_EvalFrameEx ()
#6 0x080e1bb0 in PyEval_EvalFrameEx ()
#7 0x080e2807 in PyEval_EvalCodeEx ()
#8 0x0816b2ac in ?? ()
#9 0x0806245a in PyObject_Call ()
#10 0x080db892 in PyEval_CallObjectWithKeywords ()
#11 0x080624f0 in PyObject_CallObject ()
#12 0x00a8dd95 in data_completion_dispatch (rc=0, value=0xa01dcd8
"8\334\001\n\370\263N", value_len=0, stat=0xb773828c, data=0xa0035d0) at
#13 0x00f33ecc in process_completions (zh=0xa024ab0) at src/zookeeper.c:1778
#14 0x00f4005b in do_completion (v=0xa024ab0) at src/mt_adaptor.c:333
#15 0x0038096e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#16 0x00461a0e in clone () from /lib/tls/i686/cmov/libc.so.6
> Deadlock on close w/ zkpython / c client
> Key: ZOOKEEPER-763
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
> Project: Zookeeper
> Issue Type: Bug
> Components: c client, contrib-bindings
> Affects Versions: 3.3.0
> Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
> Reporter: Kapil Thangavelu
> Assignee: Mahadev konar
> Fix For: 3.4.0
> Attachments: deadlock.py, stack-trace-deadlock.txt
> deadlocks occur if we attempt to close a handle while there are any
> outstanding async requests (aget, acreate, etc). Normally on close both the
> io thread and the completion thread are terminated and joined,
> however with outstanding async requests, the completion thread won't be in a
> joinable state, and we effectively hang when the main thread does the join.
> afaics ideal behavior would be on close of a handle, to effectively clear out
> any remaining callbacks and let the completion thread terminate.
> i've tried adding some bookkeeping within a python client to guard against
> closing while there is an outstanding async completion request, but it's an
> imperfect solution since even after the python callback is executed there is
> still a window for deadlock before the completion thread finishes the
> a simple example to reproduce the deadlock is attached.
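The bookkeeping mentioned in the description could look roughly like the sketch below. This is an assumption about the general shape of such a guard, not the reporter's code: OutstandingTracker, wrap and wait_idle are hypothetical names, and nothing here calls zkpython. Each async callback is wrapped so a counter of pending completions can be checked before closing the handle.

```python
import threading


class OutstandingTracker:
    """Hypothetical helper that counts async callbacks that have not run yet."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    @property
    def pending(self):
        with self._cond:
            return self._pending

    def wrap(self, callback):
        # Count the request as outstanding at the moment it is issued.
        with self._cond:
            self._pending += 1

        def wrapped(*args):
            try:
                return callback(*args)
            finally:
                # The Python callback has returned, but the thread that
                # invoked it may still have work left to do afterwards.
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()

        return wrapped

    def wait_idle(self, timeout=None):
        # Block until no callbacks are pending; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)


if __name__ == "__main__":
    tracker = OutstandingTracker()
    cb = tracker.wrap(lambda *args: None)  # would be passed to an async call
    print(tracker.wait_idle(0.01))         # False: one request outstanding
    cb(0)                                  # simulate the completion firing
    print(tracker.wait_idle(0.01))         # True: safe, as far as Python can tell
```

As the description says, a guard like this is imperfect: wait_idle can return as soon as the Python callback finishes, which is before the completion thread itself has finished.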