Reviewed: https://review.openstack.org/468517 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=98b8e39ac5f7b3f2bb06ca415bbb806705461d74 Submitter: Jenkins Branch: master
commit 98b8e39ac5f7b3f2bb06ca415bbb806705461d74 Author: Chris Dent <[email protected]> Date: Fri May 26 19:12:46 2017 +0000 Catch InstanceNotFound when deleting allocations If the list of allocations that are going to be deleted includes an instance that can no longer be loaded by get_by_uuid (because something has already removed it) we need to catch InstanceNotFound and remove the allocations as planned. Without this change, the exception is not caught, causing an error and leaving behind the allocations we don't want to be there. Change-Id: I204b077c287141a7f5643b2cc0065da2ba395c03 Closes-Bug: #1693899 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1693899 Title: Cleaning up stale resource allocations can fail with InstanceNotFound Status in OpenStack Compute (nova): Fix Released Bug description: Noticed what appears to be a race on delete of an instance where we're deleting resource provider allocations records for a deleted instance and when we go to get the instance from the database it's already deleted: http://logs.openstack.org/32/468232/1/gate/gate-tempest-dsvm-cells- ubuntu- xenial/ad969ba/logs/screen-n-cpu.txt.gz?level=TRACE#_May_26_17_29_52_292260 May 26 17:29:52.292260 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager [req-c0f69b9e-c6eb-43e1-9bc9-372a42901987 None None] Error updating resources for node ubuntu-xenial-ovh-bhs1-9010496. May 26 17:29:52.292537 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager Traceback (most recent call last): May 26 17:29:52.292730 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/manager.py", line 6588, in update_available_resource_for_node May 26 17:29:52.292908 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager rt.update_available_resource(context, nodename) May 26 17:29:52.293082 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 626, in update_available_resource May 26 17:29:52.293266 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._update_available_resource(context, resources) May 26 17:29:52.293430 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner May 26 17:29:52.293592 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager return f(*args, **kwargs) May 26 17:29:52.293750 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 667, in _update_available_resource May 26 17:29:52.293909 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._update_usage_from_instances(context, instances, nodename) May 26 17:29:52.294070 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 1047, in _update_usage_from_instances May 26 17:29:52.294235 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._remove_deleted_instances_allocations(context, cn) May 26 17:29:52.294453 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 1061, in _remove_deleted_instances_allocations May 26 17:29:52.294628 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager expected_attrs=[]) May 26 17:29:52.294797 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 177, in wrapper May 26 17:29:52.294983 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager args, kwargs) May 26 17:29:52.295107 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions May 26 17:29:52.295205 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager args=args, kwargs=kwargs) May 26 17:29:52.295290 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call May 26 17:29:52.295374 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager retry=self.retry) May 26 17:29:52.295457 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send May 26 17:29:52.295547 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager timeout=timeout, retry=retry) May 26 17:29:52.295630 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 505, in send May 26 17:29:52.295712 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager retry=retry) May 26 17:29:52.295795 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 496, in _send May 26 17:29:52.295877 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager raise result May 26 17:29:52.295959 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager InstanceNotFound_Remote: Instance 9ecdbc52-8f98-45aa-9814-ade07faa1026 could not be found. We should handle the InstanceNotFound since in this path we are cleaning up anyway. Looks like this is triggered by something semi-recent: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Error%20updating%20resources%20for%20node%5C%22%20AND%20message%3A%5C%22_remove_deleted_instances_allocations%5C%22%20AND%20message%3A%5C%22InstanceNotFound_Remote%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1693899/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : [email protected] Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp

