Reviewed:  https://review.openstack.org/560706
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea9d0af31395fbe1686fa681cd91226ee580796e
Submitter: Zuul
Branch:    master

commit ea9d0af31395fbe1686fa681cd91226ee580796e
Author: Matt Riedemann <[email protected]>
Date:   Wed Apr 11 21:24:43 2018 -0400

    Delete allocations from API if nova-compute is down

    When performing a "local delete" of an instance, we need to delete
    the allocations that the instance has against any resource providers
    in Placement.

    It should be noted that without this change, restarting the
    nova-compute service will delete the allocations for its compute
    node (assuming the compute node UUID is the same as before the
    instance was deleted). That is shown in the existing functional test
    modified here.

    The more important reason for this change is that in order to fix
    bug 1756179, we need to make sure the resource provider allocations
    for a given compute node are gone by the time the compute service is
    deleted.

    This adds a new functional test and a release note for the new
    behavior and need to configure nova-api for talking to placement,
    which is idempotent if not configured thanks to the @safe_connect
    decorator used in SchedulerReportClient.

    Closes-Bug: #1679750
    Related-Bug: #1756179

    Change-Id: If507e23f0b7e5fa417041c3870d77786498f741d

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1679750

Title:
  Allocations are not cleaned up in placement for instance 'local
  delete' case

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed

Bug description:
  This is semi-related to bug 1661312 for evacuate.

  This is the case:

  1. Create an instance on host A successfully. There are allocation
     records in the placement API for the instance (consumer for the
     allocation records) and host A (resource provider).

  2. Host A goes down.

  3. Delete the instance.

  This triggers the local delete flow in the compute API where we can't
  RPC cast to the compute to delete the instance because the
  nova-compute service is down. So we do the delete in the database from
  the compute API (local to compute API, hence local delete).

  The problem is in #3 we don't remove the allocations for the instance
  from the host A resource provider during the local delete flow.

  Maybe this doesn't matter while host A is down, since the scheduler
  can't schedule to it anyway. But if host A comes back up, it will have
  allocations tied to it for deleted instances.

  On init_host in the compute service we call
  _complete_partial_deletion but that's only for instances with a
  vm_state of 'deleted' but aren't actually deleted in the database. I
  don't think that's going to cover this case because the local delete
  code in the compute API calls instance.destroy() which deletes the
  instance from the database (updates instances.deleted != 0 in the DB
  so it's "soft" deleted).

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1679750/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
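
For readers following the local delete scenario in the bug description, the leak and the cleanup can be illustrated with a small self-contained Python sketch. This is not nova code: FakePlacement, Instance, and local_delete are hypothetical stand-ins invented for illustration, and the in-memory dictionary only mimics the idea of Placement's per-consumer allocations (in the real service, allocations for a consumer are removed with DELETE /allocations/{consumer_uuid}).

```python
from dataclasses import dataclass


class FakePlacement:
    """Hypothetical in-memory stand-in for the Placement service."""

    def __init__(self):
        # consumer (instance) UUID -> {resource provider UUID: {class: amount}}
        self.allocations = {}

    def put_allocations(self, consumer_uuid, rp_uuid, resources):
        self.allocations.setdefault(consumer_uuid, {})[rp_uuid] = dict(resources)

    def delete_allocations(self, consumer_uuid):
        # Safe to call repeatedly: a missing consumer is simply ignored.
        self.allocations.pop(consumer_uuid, None)

    def usage_for_provider(self, rp_uuid):
        # Sum what every consumer still holds against this provider.
        totals = {}
        for by_provider in self.allocations.values():
            for res_class, amount in by_provider.get(rp_uuid, {}).items():
                totals[res_class] = totals.get(res_class, 0) + amount
        return totals


@dataclass
class Instance:
    uuid: str
    host: str
    deleted: bool = False

    def destroy(self):
        # Mimics the "soft" delete of the instance row in the database.
        self.deleted = True


def local_delete(instance, placement, cleanup_allocations):
    """Delete from the API side because the compute service is down."""
    instance.destroy()
    if cleanup_allocations:
        placement.delete_allocations(instance.uuid)


# Step 1: the instance lands on host A and consumes resources there.
placement = FakePlacement()
inst = Instance(uuid="inst-1", host="host-a")
placement.put_allocations(inst.uuid, "rp-host-a", {"VCPU": 2, "MEMORY_MB": 4096})

# Steps 2 and 3 without cleanup: the instance is gone, the usage is not.
local_delete(inst, placement, cleanup_allocations=False)
leaked = placement.usage_for_provider("rp-host-a")

# Same flow with cleanup: host A's provider no longer reports usage.
local_delete(inst, placement, cleanup_allocations=True)
after_cleanup = placement.usage_for_provider("rp-host-a")
```

In this sketch, leaked still reports the deleted instance's VCPU and MEMORY_MB against host A's provider, which is the stale state the bug describes, while after_cleanup is empty. Making delete_allocations tolerate a missing consumer keeps the cleanup safe to repeat, for example if the compute service later comes back and attempts the same cleanup.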

