Reviewed:  https://review.openstack.org/563418
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c6dda9f8064e0fcdf7ac85d08ed68554083be74b
Submitter: Zuul
Branch:    master

commit c6dda9f8064e0fcdf7ac85d08ed68554083be74b
Author: MultipleCrashes <[email protected]>
Date:   Wed Jul 4 16:55:22 2018 +0530

    Retry decorator fix for instances which go into ERROR state during bulk delete

    During bulk delete, some of the machines go into ERROR state
    rather than being deleted. This happens intermittently when a
    large number of machines are deleted concurrently. The failure
    occurs during deallocation of the network. The ERROR state is
    cleared if the user later deletes the instance manually.

    This fix retries the network deallocation a limited number of
    times, with a variable delay between attempts, so that the
    deallocation can complete.

    Change-Id: I32212b4d8180e947fdc958449aebd822f50e97fd
    Closes-Bug: #1765942

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1765942

Title:
  Delete issued to building server leaves the instance in error state
  due to "not able to deallocate network" error.

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  A delete issued to a building server leaves the instance in error
  state due to a "not able to deallocate network" error.

  During bulk delete of instances, some of the machines go into ERROR
  state rather than being deleted. This happens intermittently when a
  large number of instances are deleted at once. The log contains the
  entry "Setting instance vm_state to ERROR", and the failure is in
  do_terminate_instance.

  The cause appears to be a ConnectFailure: the traceback shows that
  the caller could not establish a connection to one of the endpoints.
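
  The retry-with-variable-delay approach that the commit message
  describes can be illustrated with a minimal, self-contained sketch.
  This is not the code from the commit. The decorator name retry_on,
  its parameters, and the local ConnectFailure class (a stand-in for
  the keystoneauth1 exception seen in the traceback) are all
  assumptions made for illustration.

```python
import functools
import time


class ConnectFailure(Exception):
    """Hypothetical stand-in for the keystoneauth1 ConnectFailure exception."""


def retry_on(exc_types, max_attempts=3, initial_delay=1.0, backoff=2.0,
             sleep=time.sleep):
    """Retry the wrapped call on exc_types, growing the delay each time.

    All names and defaults here are illustrative assumptions, not nova's API.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_types:
                    # Out of attempts: let the caller see the failure.
                    if attempt == max_attempts:
                        raise
                    # Wait, then lengthen the delay for the next attempt.
                    sleep(delay)
                    delay *= backoff
        return wrapper
    return decorator


@retry_on(ConnectFailure, max_attempts=3, initial_delay=0.01)
def deallocate_network(instance_id):
    # Placeholder body: a real caller would contact the network service here.
    return "deallocated %s" % instance_id
```

  With a wrapper like this, a transient connection failure during
  network deallocation is retried a few times before the instance is
  put into ERROR state. Only the exception types passed in are
  retried, so unrelated errors still surface immediately.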

  Here is the traceback from the failure:

  Error Log

  => Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource" :: held 0.332s inner nova/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282
  => [instance: instance_id] Failed to deallocate network for instance.
  => Running periodic task ComputeManager._poll_rescued_instances run_periodic_tasks nova/lib/python2.7/site-packages/oslo_service/periodic_task.py:215
  => CAST unique_id: exchange 'nova' topic 'cells' _send nova/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:454
  => CAST unique_id: NOTIFY exchange 'nova' topic 'notifications.info' _send nova/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:438
  => CAST unique_id: NOTIFY exchange 'nova' topic 'monitor.info' _send nova/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:438
  => [instance: instance_id] Setting instance vm_state to ERROR

  Traceback (most recent call last):
    File "nova/
      self._delete_instance(context, instance, bdms, quotas)
    File "nova/lib/python2.7/site-packages/nova/hooks.py", line 154, in inner
      rv = f(*args, **kwargs)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2520, in _delete_instance
      6 quotas.rollback()
    File "nova/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "nova/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2496, in _delete_instance
      self._shutdown_instance(context, instance, bdms)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2393, in _shutdown_instance
      self._try_deallocate_network(context, instance, requested_networks)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2317, in _try_deallocate_network
      self._set_instance_obj_error_state(context, instance)
    File "nova/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "nova/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2312, in _try_deallocate_network
      self._deallocate_network(context, instance, requested_networks)
    File "nova/lib/python2.7/site-packages/nova/compute/manager.py", line 1858, in _deallocate_network
      context, instance, requested_networks=requested_networks)
    File "nova/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 819, in deallocate_for_instance
      data = neutron.list_ports(**search_opts)
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 618, in list_ports
      **_params)
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 357, in list
      for r in self._pagination(collection, path, **params):
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 372, in _pagination
      res = self.get(path, params=params)
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 342, in get
      headers=headers, params=params)
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 319, in retry_request
      headers=headers, params=params)
    File "nova/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 270, in do_request
      resp, replybody = self.httpclient.do_request(action, method, body=body)
    File "nova/lib/python2.7/site-packages/neutronclient/client.py", line 322, in do_request
      return self.request(url, method, **kwargs)
    File "nova/lib/python2.7/site-packages/neutronclient/client.py", line 310, in request
      resp = super(SessionClient, self).request(*args, **kwargs)
    File "nova/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 106, in request
      return self.session.request(url, method, **kwargs)
    File "nova/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner
      return wrapped(*args, **kwargs)
    File "nova/lib/python2.7/site-packages/keystoneauth1/session.py", line 452, in request
      resp = send(**kwargs)
    File "nova/lib/python2.7/site-packages/keystoneauth1/session.py", line 501, in _send_request
      raise exceptions.ConnectFailure(msg)
  ConnectFailure: Unable to establish connection to http:/somehost:someport/v2.0/ports.json?device_id=someid
  6854139:2018-02-21 16:43:19.551 26180 ERROR nova.compute.manager [instance: instance_id]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1765942/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

