Reviewed:  https://review.openstack.org/352554
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e285eb1a382e6d3ce1cc596eeb5cecb3b165a228
Submitter: Jenkins
Branch:    master

commit e285eb1a382e6d3ce1cc596eeb5cecb3b165a228
Author: Matt Riedemann <[email protected]>
Date:   Mon Aug 8 14:33:37 2016 -0400

    Run shelve/shelve_offload_instance in a semaphore

    When an instance is shelved, by default it is immediately offloaded
    because CONF.shelved_offload_time defaults to 0.

    When a shelved instance is offloaded, it's destroyed and its
    host/node values are nulled out.

    Unshelving an instance is basically the same flow as building an
    instance for the first time. The instance.host/node values are set
    in the resource tracker when claiming resources.

    Tempest has some tests which use a shared server resource and
    perform actions on that shared server. These tests are triggering
    a race when unshelve is called while the compute is offloading the
    shelved instance.

    The race hits a window where unshelve is running before
    shelve_offload_instance nulls out the instance host/node values.
    The resource claim during unshelve sets the host/node values
    (which were actually already set) and then shelve_offload_instance
    nulls them out. The unshelve operation sets the instance.vm_state
    to ACTIVE, however. So Tempest sees an instance that's ACTIVE and
    thinks it can run the next action test on it, for example
    'suspend'. This fails because the instance.host isn't set (from
    shelve_offload_instance) and the test fails in the compute API.

    To close the race window, we add a lock to shelve_instance and
    shelve_offload_instance to match the lock that's in
    unshelve_instance. This way when unshelve is called it will wait
    until the shelve_offload_instance operation is complete and the
    instance.host value is nulled out.

    Closes-Bug: #1611008

    Change-Id: Id36b3b9516d72d28519c18c38d98b646b47d288d

** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1611008

Title:
  ServersNegativeTestJSON.test_suspend_server_invalid_state fails with
  "NovaException: Unable to find host for Instance"

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) mitaka series:
  Confirmed

Bug description:
  Seen here:

  http://logs.openstack.org/91/327191/22/check/gate-tempest-dsvm-full-ubuntu-xenial/6a11005/logs/screen-n-api.txt.gz?level=TRACE#_2016-08-06_23_42_20_711

  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions [req-86f8ef99-a9e4-49f0-8258-a0f40dea4a51 tempest-ServersNegativeTestJSON-16764714 tempest-ServersNegativeTestJSON-16764714] Unexpected exception in API method
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions Traceback (most recent call last):
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/api/openstack/extensions.py", line 338, in wrapped
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     return f(*args, **kwargs)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/api/openstack/compute/suspend_server.py", line 41, in _suspend
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     self.compute_api.suspend(context, server)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/api.py", line 164, in inner
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     return function(self, context, instance, *args, **kwargs)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/api.py", line 171, in _wrapped
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     return fn(self, context, instance, *args, **kwargs)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/api.py", line 145, in inner
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     return f(self, context, instance, *args, **kw)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/api.py", line 2782, in suspend
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     self.compute_rpcapi.suspend_instance(context, instance)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/rpcapi.py", line 953, in suspend_instance
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     server=_compute_host(None, instance), version=version)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions   File "/opt/stack/new/nova/nova/compute/rpcapi.py", line 53, in _compute_host
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions     'Instance %s') % instance.uuid)
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions NovaException: Unable to find host for Instance 480c8b6f-ab2d-4a49-8344-a3a679b01472
  2016-08-06 23:42:20.711 10592 ERROR nova.api.openstack.extensions

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22NovaException%3A%20Unable%20to%20find%20host%20for%20Instance%5C%22%20AND%20tags%3A%5C%22screen-n-api.txt%5C%22&from=7d

  There are 4 hits in 7 days, check and gate, all failures, all on the
  master branch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1611008/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
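
For readers following the commit message, the locking fix it describes can be illustrated with a small standalone sketch. This is not nova's code: the Instance class, the instance_lock() registry, the host names, the uuid and the run_race() helper are all hypothetical stand-ins, and plain threading locks are assumed in place of whatever synchronization nova actually uses. The only point shown is that when offload and unshelve take the same per-instance lock, unshelve cannot set host/node until the offload has finished nulling them out.

```python
import threading
import time
from collections import defaultdict

# Hypothetical per-instance lock registry (an assumption for this sketch,
# not nova's implementation): one lock per instance uuid.
_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()


def instance_lock(uuid):
    """Return the lock shared by all operations on one instance."""
    with _locks_guard:
        return _locks[uuid]


class Instance:
    """Hypothetical, minimal stand-in for an instance record."""

    def __init__(self, uuid, host):
        self.uuid = uuid
        self.host = host
        self.node = host
        self.vm_state = "shelved"


def shelve_offload_instance(instance, started):
    # The offload holds the instance lock for its whole duration.
    with instance_lock(instance.uuid):
        started.set()            # signal that the offload is in progress
        time.sleep(0.05)         # simulate destroying the guest
        instance.host = None     # host/node are nulled out at the end
        instance.node = None
        instance.vm_state = "shelved_offloaded"


def unshelve_instance(instance, new_host):
    # Same lock, so this blocks until any running offload has completed.
    with instance_lock(instance.uuid):
        instance.host = new_host  # stands in for the resource claim
        instance.node = new_host
        instance.vm_state = "active"


def run_race():
    """Start an offload, then request unshelve while it is still running."""
    instance = Instance("hypothetical-uuid", "compute-1")
    started = threading.Event()
    offload = threading.Thread(
        target=shelve_offload_instance, args=(instance, started))
    offload.start()
    started.wait()  # unshelve arrives inside the old race window
    unshelve = threading.Thread(
        target=unshelve_instance, args=(instance, "compute-2"))
    unshelve.start()
    offload.join()
    unshelve.join()
    return instance


if __name__ == "__main__":
    result = run_race()
    print(result.vm_state, result.host)
```

In this sketch, removing the lock from shelve_offload_instance reproduces the shape of the reported failure: unshelve marks the instance active, and the late offload then clears its host.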

