Reviewed:  https://review.openstack.org/506458
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f18202185d05e3f7e89fca6bbc17daf3c5dc4b98
Submitter: Jenkins
Branch:    master
commit f18202185d05e3f7e89fca6bbc17daf3c5dc4b98
Author: Matt Riedemann <[email protected]>
Date:   Thu Sep 21 22:25:53 2017 -0400

    Remove allocations when unshelve fails on host

    When we unshelve an offloaded instance, the scheduler creates
    allocations in placement when picking a host. If the unshelve fails
    on the host, due to either the instance claim failing or the guest
    spawn failing, we need to remove the allocations since the instance
    isn't actually running on that host.

    Change-Id: Id2c7b7b3b4abda8a3b878fdee6806bcfe096e12e
    Closes-Bug: #1713796

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1713796

Title:
  Failed unshelve does not remove allocations from destination node

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  In Progress

Bug description:
  During an unshelve from an offloaded instance, conductor will call the
  scheduler to pick a host. The scheduler will make allocations against
  the chosen node as part of that select_destinations() call. Then
  conductor casts to that compute host to unshelve the instance.

  If the spawn on the hypervisor fails while we've made the instance
  claim:

  https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/manager.py#L4485

  Or even if the claim test fails, the allocations on the destination
  node aren't removed in Placement.

  The RT aborts the claim here:

  https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/resource_tracker.py#L414

  That calls _update_usage_from_instance but doesn't change the
  has_ocata_computes kwarg so we get here:

  https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/resource_tracker.py#L1041

  And we don't cleanup the allocations for the instance.
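The cleanup the commit message describes can be illustrated with a minimal, self-contained sketch. This is not the nova code: `FakePlacement`, `unshelve_on_host`, and the `claim`/`spawn` callables are hypothetical stand-ins invented for illustration, and the local `ComputeResourcesUnavailable` class only mimics the exception named in this bug. The assumption is simply that the scheduler has already created allocations for the instance before the compute host runs the claim and the spawn.

```python
class ComputeResourcesUnavailable(Exception):
    """Local stand-in for the exception the bug says instance_claim raises."""


class FakePlacement:
    """Hypothetical in-memory stand-in for the Placement service."""

    def __init__(self):
        # Maps instance UUID -> {resource class: amount} held on a node.
        self.allocations = {}

    def allocate(self, instance_uuid, resources):
        # Models the scheduler creating allocations while picking a host.
        self.allocations[instance_uuid] = dict(resources)

    def delete_allocation(self, instance_uuid):
        # Idempotent: deleting a missing allocation is not an error.
        self.allocations.pop(instance_uuid, None)


def unshelve_on_host(placement, instance_uuid, claim, spawn):
    """Run the claim, then the spawn; drop allocations if either fails.

    `claim` and `spawn` are caller-supplied callables standing in for the
    resource tracker's instance claim and the virt driver's guest spawn.
    """
    try:
        claim()
        spawn()
    except Exception:
        # Covers both failure cases from the bug: the claim itself failing,
        # and the spawn failing after a successful claim. The instance is
        # not running on this host, so its allocations must not remain.
        placement.delete_allocation(instance_uuid)
        raise
```

Calling `unshelve_on_host` with a `claim` or `spawn` that raises leaves no allocation behind for that instance and re-raises the original exception. A successful call leaves the scheduler's allocations untouched.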
  The other case is if the claim fails, the instance_claim method will
  raise ComputeResourcesUnavailable which would be handled here:

  https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/claims.py#L161

  https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/manager.py#L4491

  But we don't remove allocations or do any other cleanup there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1713796/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

