Reviewed:  https://review.openstack.org/560626
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a80ac96362c8fafba1bfe71244b52ba2f082c86e
Submitter: Zuul
Branch:    master
commit a80ac96362c8fafba1bfe71244b52ba2f082c86e
Author: Matt Riedemann <[email protected]>
Date:   Wed Apr 11 16:00:59 2018 -0400

    Add functional test for deleting a compute service

    This adds a functional test which asserts the things related to
    bug 1756179 where deleting a compute service does not also delete
    the related host mapping or resource provider resources.

    Also related to bug 1763183 in that it should not be possible to
    delete a compute service that has instances running on it since
    that will mess up resource tracking in Placement.

    Change-Id: I519c5abfe24b154998f481c8a86db239a75d4729
    Related-Bug: #1756179
    Related-Bug: #1763183

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1763183

Title:
  DELETE /os-services/{service_id} does not block for hosted instances

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed

Bug description:

This came up while reviewing the fix for bug 1756179:

https://review.openstack.org/#/c/554920/6/nova/api/openstack/compute/services.py@226

Full IRC conversation is here:

http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-04-11.log.html#t2018-04-11T20:32:13

The summary is that it's possible to delete a compute service and its associated compute node record even if that compute node has instances on it.

Before placement, this wasn't a huge problem because you could evacuate the instances to another host, or, if you brought the host back up, it would recreate the service and compute node and the resource tracker would "heal" itself by finding instances running on that host and node combo:

https://github.com/openstack/nova/blob/2c5da2212c3fa3e589c4af171486a2097fd8c54e/nova/compute/resource_tracker.py#L714

The problem is that after we started requiring placement and creating allocations in the scheduler in Pike, those allocations are against the compute_nodes.uuid for the compute node resource provider. If the service and its related compute node record are deleted, restarting the service will create a new service and compute node record with a new UUID, which will result in a new resource provider in placement, and the instances running on that host will have allocations against the now-orphaned resource provider. The new resource provider will be reporting incorrect consumption, so scheduling will also be affected.

So we should block deleting a compute service (and its node) here:

https://github.com/openstack/nova/blob/2c5da2212c3fa3e589c4af171486a2097fd8c54e/nova/api/openstack/compute/services.py#L213

if that host (node) has instances on it.

This problem goes back to Pike. Ocata is OK in that the resource tracker on Ocata computes will "heal" allocations during the update_available_resource periodic task (and when the compute service starts up), and in Ocata the FilterScheduler does not create allocations in Placement.
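
To make the failure mode in the bug description concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `FakeCloud`, `HostHasInstances`, and every method name below are hypothetical and are not nova or Placement APIs. The model records allocations against a compute node UUID, shows how deleting and restarting a service can leave those allocations pointing at an orphaned provider, and shows the kind of guard the report asks for.

```python
import uuid


class HostHasInstances(Exception):
    """Hypothetical error: the compute service still hosts instances."""


class FakeCloud:
    """Toy stand-in for the nova DB plus Placement; not nova's real API."""

    def __init__(self):
        self.node_uuid_by_host = {}  # host -> compute node (provider) UUID
        self.instances_by_host = {}  # host -> set of instance IDs
        self.allocations = {}        # instance ID -> provider UUID

    def start_compute(self, host):
        # A service starting with no existing record gets a fresh node UUID.
        if host not in self.node_uuid_by_host:
            self.node_uuid_by_host[host] = str(uuid.uuid4())
            self.instances_by_host.setdefault(host, set())
        return self.node_uuid_by_host[host]

    def boot_instance(self, host, instance_id):
        # Allocations are recorded against the node's provider UUID.
        self.instances_by_host[host].add(instance_id)
        self.allocations[instance_id] = self.node_uuid_by_host[host]

    def delete_service(self, host, block_if_instances=True):
        # The guard: refuse the delete while instances are on the host.
        if block_if_instances and self.instances_by_host.get(host):
            raise HostHasInstances(host)
        # Only the service/node record goes away; the instances and their
        # allocations are left untouched.
        del self.node_uuid_by_host[host]

    def orphaned_allocations(self):
        # Allocations whose provider UUID no longer matches any live node.
        live = set(self.node_uuid_by_host.values())
        return {i: rp for i, rp in self.allocations.items() if rp not in live}
```

With the guard disabled (`block_if_instances=False`), deleting the service for a host that has an instance and then calling `start_compute` again yields a new UUID, and `orphaned_allocations()` reports the instance's allocation against the old one. With the guard enabled, the delete raises `HostHasInstances` and nothing is orphaned.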

