Reviewed:  https://review.opendev.org/673463
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9ad54f3dacbd372271f441baea5380f913072dde
Submitter: Zuul
Branch:    master
commit 9ad54f3dacbd372271f441baea5380f913072dde
Author: Lee Yarwood <[email protected]>
Date:   Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously, simultaneous requests to reboot and delete an instance
    could race as only the latter took a lock against the uuid of the
    instance. With the Libvirt driver this race could potentially result
    in attempts being made to reconnect previously disconnected volumes
    on the host. Depending on the volume backend being used, this could
    then leave stale block devices pointing to unmapped volumes on the
    host, which in turn could cause failures later on when connecting
    newly mapped volumes.

    This change avoids the race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838392

Title:
  BDMNotFound raised and stale block devices left over when
  simultaneously rebooting and deleting an instance

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========

  Simultaneous requests to reboot and delete an instance _will_ race, as
  only the call to delete takes a lock against the instance.uuid.

  One possible outcome of this, seen in the wild with the Libvirt
  driver, is that the request to soft reboot eventually turns into a
  hard reboot, reconnecting volumes that the delete request has already
  disconnected. These volumes are eventually unmapped on the Cinder side
  by the delete request, leaving stale devices on the host.
  Additionally, BDMNotFound is raised by the reboot operation, as the
  delete operation has already deleted the BDMs.
  Steps to reproduce
  ==================

  $ nova reboot $instance && nova delete $instance

  Expected result
  ===============

  The instance reboots and is then deleted without any errors raised.

  Actual result
  =============

  BDMNotFound is raised and stale block devices are left over.

  Environment
  ===========

  1. Exact version of OpenStack you are running. See the following list
     for all releases: http://docs.openstack.org/releases/

     1599e3cf68779eafaaa2b13a273d3bebd1379c19 / 19.0.0.0rc1-992-g1599e3cf68

  2. Which hypervisor did you use? (For example: Libvirt + KVM, Libvirt
     + XEN, Hyper-V, PowerKVM, ...) What's the version of that?

     Libvirt + QEMU/KVM

  3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...)
     What's the version of that?

     N/A

  4. Which networking type did you use? (For example: nova-network,
     Neutron with OpenVSwitch, ...)

     N/A

  Logs & Configs
  ==============

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838392/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
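The serialisation the fix introduces can be sketched in a few lines of stdlib Python: both reboot and delete acquire a lock keyed on the instance uuid, so whichever operation starts first runs to completion before the other begins, instead of racing. This is a minimal illustration, not nova's actual code; the names (InstanceLocks, fake_reboot, fake_delete) are hypothetical, and nova itself takes the lock via oslo.concurrency inside the compute manager.

```python
# Hypothetical sketch of per-instance.uuid lock serialisation; names are
# illustrative and do not reflect nova's real API.
import threading
from collections import defaultdict


class InstanceLocks:
    """Hand out one lock per instance uuid, created on demand."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def get(self, uuid):
        with self._guard:
            return self._locks[uuid]


locks = InstanceLocks()
events = []


def fake_reboot(uuid):
    # Before the fix, reboot took no instance.uuid lock and could
    # interleave with a concurrent delete.
    with locks.get(uuid):
        events.append("reboot start")
        events.append("reboot end")


def fake_delete(uuid):
    # Delete already took this lock; the fix makes reboot take it too.
    with locks.get(uuid):
        events.append("delete start")
        events.append("delete end")


t1 = threading.Thread(target=fake_reboot, args=("uuid-1",))
t2 = threading.Thread(target=fake_delete, args=("uuid-1",))
t1.start(); t2.start()
t1.join(); t2.join()

# With the lock held by both paths, the operations never interleave:
# each "start" event is immediately followed by its matching "end".
assert events[0].split()[0] == events[1].split()[0]
assert events[2].split()[0] == events[3].split()[0]
```

Without the shared lock, the two event pairs could interleave arbitrarily, which is exactly the window in which the hard reboot reconnected volumes the delete path had already disconnected.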

