Reviewed:  https://review.openstack.org/348492
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3405a28688eacbca23cf5cac0a611d33fb1a1f2c
Submitter: Jenkins
Branch:    master

commit 3405a28688eacbca23cf5cac0a611d33fb1a1f2c
Author: Roman Podoliaka <[email protected]>
Date:   Thu Jul 28 20:08:44 2016 +0300

    rbd_utils: wrap blocking calls in tpool.Proxy()

    librbd is a Python binding around a C library, which is not aware of
    eventlet - all the calls to the functions from this library will block
    the whole nova-compute process for duration of a call.

    To make sure nova-compute remains responsive we need to wrap all the
    calls in tpool.Proxy() eventlet helper, that switches the execution
    context back to the event loop, while the call is executed in a native
    OS thread from a pool.

    Prefer tpool.Proxy() to tpool.execute() here as the former allows for
    wrapping objects and automatically executes all the method calls in
    native OS threads, while the latter needs to be applied to each method
    call in the code repeatedly. Existing calls are modified for the sake
    of consistency.

    Closes-Bug: #1607461

    Change-Id: I743ab372332eb656258a476ae91f5e8fd2cbdc99

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1607461

Title:
  nova-compute hangs while executing a blocking call to librbd

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  While executing a call to librbd, nova-compute may hang for a while and
  eventually show up as down in nova service-list output. At least some
  calls can take a really long time depending on the health of the Ceph
  cluster, and calls like
  http://docs.ceph.com/docs/master/rbd/librbdpy/#rbd.RBD.list
  inherently slow down as the number of entities to be listed grows.

  strace'ing shows that the process is stuck acquiring a mutex:

  root@node-153:~# strace -p 16675
  Process 16675 attached
  futex(0x7fff084ce36c, FUTEX_WAIT_PRIVATE, 1, NULL

  gdb shows the traceback:

  http://paste.openstack.org/show/542534/

  ^ which basically means that calls to librbd (a C library) are not
  monkey-patched and cannot switch the execution context to another
  green thread in an eventlet-based process.

  To avoid blocking the whole nova-compute process on calls to librbd
  we should wrap them with tpool.execute()
  (http://eventlet.net/doc/threading.html#eventlet.tpool.execute)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1607461/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
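
For illustration only, here is a rough sketch of the idea behind the fix: a proxy object that pushes every method call onto a native OS thread so that the event loop keeps running while the call blocks. This is not the nova patch and not eventlet's implementation. It uses asyncio from the standard library as a stand-in for eventlet so that it runs without extra dependencies, and FakeImageClient, ThreadProxy, heartbeat and run_demo are hypothetical names made up for this example.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


class FakeImageClient:
    """Hypothetical stand-in for a blocking C binding such as librbd."""

    def list(self, pool):
        time.sleep(0.2)  # blocks the calling OS thread, like a slow C call
        return [f"{pool}/image-1", f"{pool}/image-2"]

    def size(self, name):
        time.sleep(0.05)
        return 1024


class ThreadProxy:
    """Wrap an object so that each method call runs in a worker thread."""

    def __init__(self, target, executor):
        self._target = target
        self._executor = executor

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        async def call(*args, **kwargs):
            loop = asyncio.get_running_loop()
            # The event loop keeps scheduling other tasks while the
            # blocking call runs in a thread from the pool.
            return await loop.run_in_executor(
                self._executor, lambda: attr(*args, **kwargs))

        return call


async def heartbeat(ticks, stop):
    """Plays the role of periodic work that must not be starved."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def run_demo():
    ticks = []
    stop = asyncio.Event()
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Wrap once; every later method call is dispatched to the pool.
        client = ThreadProxy(FakeImageClient(), executor)
        beat = asyncio.create_task(heartbeat(ticks, stop))
        images = await client.list("vms")
        size = await client.size(images[0])
        stop.set()
        await beat
    return images, size, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

The heartbeat keeps ticking while list() and size() are blocked in worker threads, which is the behaviour nova-compute needs in order to keep reporting its state. In eventlet-based code the corresponding helpers are the ones named in the commit message: tpool.execute(func, *args) for a single call, and tpool.Proxy(obj) to wrap an object once so that all of its method calls go through the thread pool.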

