Reviewed:  https://review.opendev.org/675704
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f578146f372386e1889561cba33e95495e66ce97
Submitter: Zuul
Branch:    master

commit f578146f372386e1889561cba33e95495e66ce97
Author: Matt Riedemann <[email protected]>
Date:   Fri Aug 9 17:17:45 2019 -0400

    rt: only map compute node if we created it

    If ComputeNode.create() fails, the update_available_resource
    periodic will not try to create it again because it will be
    mapped in the compute_nodes dict and _init_compute_node will
    return early but trying to save changes to that ComputeNode
    object later will fail because there is no id on the object,
    since we failed to create it in the DB.

    This simply reverses the logic such that we only map the
    compute node if we successfully created it.

    Some existing _init_compute_node testing had to be changed
    since it relied on the order of when the ComputeNode object
    is created and put into the compute_nodes dict in order
    to pass the object along to some much lower-level PCI
    tracker code, which was arguably mocking too deep for
    a unit test. That is changed to avoid the low-level mocking
    and assertions and just assert that _setup_pci_tracker
    is called as expected.

    Change-Id: I9fa1d509a3de405d6246fb8670612c65c10cc93b
    Closes-Bug: #1839674

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1839674

Title:
  ResourceTracker.compute_nodes won't try to create a ComputeNode a
  second time if the first create() fails

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  In Progress

Bug description:
  I found this while writing a functional recreate test for bug
  1839560.

  As of this change in Ocata:

  https://github.com/openstack/nova/commit/1c967593fbb0ab8b9dc8b0b509e388591d32f537

  The ResourceTracker.compute_nodes dict will store the ComputeNode
  object *before* trying to create it:

  https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L570-L571

  The problem is if ComputeNode.create() fails for whatever reason, the
  next run through update_available_resource won't try to create the
  ComputeNode again because of this:

  https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L546

  And eventually you get errors like this:

  b'2019-08-09 17:02:59,356 ERROR [nova.compute.manager] Error updating resources for node node2.'
  b'Traceback (most recent call last):'
  b'  File "/home/osboxes/git/nova/nova/compute/manager.py", line 8250, in _update_available_resource_for_node'
  b'    startup=startup)'
  b'  File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 715, in update_available_resource'
  b'    self._update_available_resource(context, resources, startup=startup)'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner'
  b'    return f(*args, **kwargs)'
  b'  File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 796, in _update_available_resource'
  b'    self._update(context, cn, startup=startup)'
  b'  File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1052, in _update'
  b'    self.old_resources[nodename] = old_compute'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__'
  b'    self.force_reraise()'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise'
  b'    six.reraise(self.type_, self.value, self.tb)'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/six.py", line 693, in reraise'
  b'    raise value'
  b'  File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1046, in _update'
  b'    compute_node.save()'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper'
  b'    return fn(self, *args, **kwargs)'
  b'  File "/home/osboxes/git/nova/nova/objects/compute_node.py", line 352, in save'
  b'    db_compute = db.compute_node_update(self._context, self.id, updates)'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 67, in getter'
  b'    self.obj_load_attr(name)'
  b'  File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr'
  b'    _("Cannot load \'%s\' in the base class") % attrname)'
  b"NotImplementedError: Cannot load 'id' in the base class"

  We should only map the ComputeNode when we've successfully created it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1839674/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
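
For readers who want to see the ordering problem in isolation, here is a
minimal sketch of the behaviour described in the bug report. It is not nova
code: FakeDB, FakeComputeNode, SketchTracker, and the map_before_create flag
are all hypothetical stand-ins invented for illustration. Only the idea is
taken from the report: mapping the object into compute_nodes before create()
succeeds makes the next periodic run return early and then fail on save().

```python
class FakeDB:
    """Hypothetical DB whose first `failures` inserts raise an error."""

    def __init__(self, failures):
        self.failures = failures
        self.next_id = 1

    def insert(self):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("simulated DB failure in create()")
        new_id = self.next_id
        self.next_id += 1
        return new_id


class FakeComputeNode:
    """Hypothetical stand-in for a ComputeNode object (not the real class)."""

    def __init__(self, nodename, db):
        self.nodename = nodename
        self.db = db
        self.id = None  # only set once create() has succeeded

    def create(self):
        self.id = self.db.insert()

    def save(self):
        if self.id is None:
            # Mimics the error message from the traceback in the report.
            raise NotImplementedError("Cannot load 'id' in the base class")


class SketchTracker:
    """Hypothetical tracker; the flag selects pre-fix or post-fix ordering."""

    def __init__(self, db, map_before_create):
        self.db = db
        self.map_before_create = map_before_create
        self.compute_nodes = {}

    def init_compute_node(self, nodename):
        if nodename in self.compute_nodes:
            return  # early return: the node is assumed to exist already
        cn = FakeComputeNode(nodename, self.db)
        if self.map_before_create:
            self.compute_nodes[nodename] = cn  # mapped even if create() fails
            cn.create()
        else:
            cn.create()
            self.compute_nodes[nodename] = cn  # mapped only after success

    def periodic(self, nodename):
        """One simulated periodic run; returns 'ok' or the exception name."""
        try:
            self.init_compute_node(nodename)
            self.compute_nodes[nodename].save()
            return "ok"
        except Exception as exc:
            return type(exc).__name__


if __name__ == "__main__":
    for flag in (True, False):
        tracker = SketchTracker(FakeDB(failures=1), map_before_create=flag)
        print(flag, [tracker.periodic("node2") for _ in range(3)])
```

With map_before_create=True the sketch never recovers after the first failed
create(), which matches the repeated save() errors in the report. With the
order reversed, the second periodic run retries create() and succeeds.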

