Reviewed: https://review.opendev.org/660761
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d4ed0d8b7adc350e8962df033c2da892c95561fe
Submitter: Zuul
Branch: master

commit d4ed0d8b7adc350e8962df033c2da892c95561fe
Author: Arnaud Morin <[email protected]>
Date: Wed May 22 17:34:20 2019 +0200

    Refresh instance network info on deletion

    When deleting an instance, if the network info is empty, we should
    refresh the info because we can't be sure the copy of the cache we
    have when we fetched the instance to delete is up-to-date, i.e. if
    we're racing to delete the server while it's building and the
    network info cache was updated in the database after we started the
    delete operation and got the instance from the DB, then we could
    fail to unplug VIFs.

    Closes-Bug: #1830081

    Change-Id: I99601773406c61f93002e2f7cbb248cf73cef0ab
    Signed-off-by: Arnaud Morin <[email protected]>

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830081

Title:
  Nova unplug interface race condition when deleting an instance

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  Description
  ===========
  When nova starts an instance, it asks neutron to create a port and
  then updates the instance info cache based on information from
  neutron.

  If the instance is deleted in the middle of spawning, the
  terminate_instance function is called with an instance object that
  DOES NOT contain any network info.

  As a result, nova deletes the instance but never unplugs the
  interface.

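  The refresh-on-empty idea from the commit message can be sketched
  with a toy model. This is an illustration only, not nova's code:
  FakeDB, Instance, refresh_network_info and vifs_to_unplug are
  hypothetical names, and a plain list stands in for the network info
  cache.

```python
# Toy model of the stale-cache problem; every name here is hypothetical
# and none of this is nova's real API.
from dataclasses import dataclass, field


@dataclass
class FakeDB:
    """Stands in for the database row holding the network info cache."""
    network_info: list = field(default_factory=list)


@dataclass
class Instance:
    """In-memory copy of an instance, snapshotted when it was fetched."""
    db: FakeDB
    network_info: list = field(default_factory=list)

    @classmethod
    def fetch(cls, db):
        # Copy the cache as it is in the DB at fetch time.
        return cls(db=db, network_info=list(db.network_info))

    def refresh_network_info(self):
        # Re-read the cache from the DB, replacing the stale copy.
        self.network_info = list(self.db.network_info)


def vifs_to_unplug(instance, refresh_if_empty):
    """Return the VIFs the delete path would try to unplug."""
    if refresh_if_empty and not instance.network_info:
        instance.refresh_network_info()
    return list(instance.network_info)


db = FakeDB()
stale = Instance.fetch(db)        # delete fetches the instance: cache empty
db.network_info.append("vif-1")   # build updates the cache afterwards

without_fix = vifs_to_unplug(stale, refresh_if_empty=False)
with_fix = vifs_to_unplug(stale, refresh_if_empty=True)
print(without_fix, with_fix)
```

  Without the refresh, the delete path sees no VIF and has nothing to
  unplug; with it, the VIF written after the fetch is picked up.
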
  Steps to reproduce
  ==================
  I boot an instance and immediately delete it with a command like:

  $ openstack server create --key-name fake --image ubuntu1810 --flavor c2-7 --net Ext-Net arnaudubuntu1810-3 ; nova delete arnaudubuntu1810-3

  - [1] build_and_run_instance is executed with a semaphore, thus
    locking the instance. When booting, nova fills the network_info
    cache by calling [2] update_instance_cache_with_nw_info.
  - [3] terminate_instance is executed a few seconds later, but waits
    for the semaphore to be released. At this time, the instance
    network_info cache may not be filled, depending on whether [2]
    update_instance_cache_with_nw_info has already been executed.
  - Following the code, we end up at _shutdown_instance [4], which
    calls [5] get_network_info, which returns a NetworkInfo object
    that contains no interface.
  - At the end, nova calls _unplug_vifs [6], which does nothing
    (no VIF).

  Note that I am running the OpenStack Newton release, but the code
  involved seems identical on master.

  [1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1837
  [2] https://github.com/openstack/nova/blob/master/nova/network/base_api.py#L34
  [3] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2765
  [4] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2559
  [5] https://github.com/openstack/nova/blob/master/nova/objects/instance.py#L1252
  [6] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L919

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830081/+subscriptions
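
The ordering described in the steps to reproduce can be replayed
deterministically with two threads. This is a rough sketch under stated
assumptions, not nova code: a threading.Lock stands in for the
per-instance semaphore, a list stands in for the network_info cache,
and the two Event objects exist only to force the racy interleaving
every time.

```python
# Deterministic replay of the racy ordering; all names are hypothetical
# stand-ins, not nova's API.
import threading

lock = threading.Lock()        # stands in for the per-instance semaphore
db_cache = []                  # stands in for the network_info cache in the DB
unplugged = []                 # VIFs the delete path actually unplugs

build_has_lock = threading.Event()
delete_has_snapshot = threading.Event()


def build():
    with lock:
        build_has_lock.set()
        # Hold the lock until the delete path has taken its stale copy.
        delete_has_snapshot.wait()
        db_cache.append("vif-1")    # cache is filled while delete is waiting


def terminate():
    build_has_lock.wait()
    snapshot = list(db_cache)       # instance fetched before taking the lock
    delete_has_snapshot.set()
    with lock:                      # blocks until build releases the lock
        unplugged.extend(snapshot)  # snapshot is stale, so nothing is unplugged


threads = [threading.Thread(target=build), threading.Thread(target=terminate)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("cache in DB:", db_cache)
print("unplugged:", unplugged)
```

The DB ends up holding a VIF that the delete path never saw, which is
the leaked interface the bug describes.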

