I know what the problem is:
(9:59:34 AM) mriedem: set_inventory_for_provider -> _ensure_resource_provider -> _create_resource_provider -> safe_connect returns None because it can't talk to placement yet
(9:59:41 AM) mriedem: https://review.openstack.org/#/c/524618/2/nova/scheduler/client/report.py@516
(9:59:44 AM) mriedem: so we put None in the cache
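
The failure mode described above can be sketched roughly as follows. This is a simplified illustration, not nova's actual code: the class and method bodies here are hypothetical stand-ins for the report client in nova/scheduler/client/report.py, where the @safe_connect decorator makes the create call return None when placement is unreachable.

```python
class FakeReportClient(object):
    """Illustrative stand-in for nova's scheduler report client."""

    def __init__(self):
        # rp_uuid -> resource provider dict (or, buggily, None)
        self._provider_cache = {}

    def _create_resource_provider(self, rp_uuid):
        # Stand-in for the @safe_connect-decorated call: when placement
        # is not reachable yet, the decorator swallows the failure and
        # the method effectively returns None.
        return None

    def _ensure_resource_provider(self, rp_uuid):
        if rp_uuid not in self._provider_cache:
            # The bug: the None return value is cached without a check,
            # so it sticks for the life of the process.
            self._provider_cache[rp_uuid] = self._create_resource_provider(
                rp_uuid)
        return self._provider_cache[rp_uuid]

    def _get_inventory_and_update_provider_generation(self, rp_uuid):
        my_rp = self._ensure_resource_provider(rp_uuid)
        # With None in the cache, this subscript raises
        # TypeError: 'NoneType' object has no attribute '__getitem__'
        # (Python 2 wording; Python 3 says "is not subscriptable").
        return my_rp['generation']
```

Because the None is cached rather than retried, every later periodic task hits the same TypeError, which matches the traceback below repeating for the whole nova-compute run.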
** Changed in: nova
       Status: New => Triaged

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Changed in: nova
     Assignee: (unassigned) => Matt Riedemann (mriedem)

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova/queens
       Status: New => Triaged

** Changed in: nova/pike
   Importance: Undecided => Medium

** Changed in: nova/queens
   Importance: Undecided => Medium

** Changed in: nova/pike
       Status: New => Triaged

** Changed in: nova
   Importance: High => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1767139
Title:
  TypeError in _get_inventory_and_update_provider_generation

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Bug description:

Description
===========
Bringing up a new cluster as part of our CI after switching from 16.1.0
to 16.1.1 on CentOS, I'm seeing this error on some computes:
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager Traceback (most recent call last):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6752, in update_available_resource_for_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 704, in update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update_available_resource(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return f(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in _update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._init_compute_node(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 585, in _init_compute_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update(context, cn)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 886, in _update
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 64, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return getattr(self.instance, __name)(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 789, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update_inventory(rp_uuid, inv_data)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 56, in wrapper
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return f(self, *a, **k)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 675, in _update_inventory
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     if self._update_inventory_attempt(rp_uuid, inv_data):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 562, in _update_inventory_attempt
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     curr = self._get_inventory_and_update_provider_generation(rp_uuid)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 546, in _get_inventory_and_update_provider_generation
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     if server_gen != my_rp['generation']:
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager TypeError: 'NoneType' object has no attribute '__getitem__'
The error appears to persist for the duration of a single nova-compute run.
Steps to reproduce
==================
Nodes were started by our CI infrastructure. We start 3 computes and
a single control node. In 50% of cases, one of the computes comes up
in this bad state.
Expected result
===============
Working cluster.
Actual result
=============
At least one of the three compute nodes fails to join the cluster: it is
not picked up by discover_hosts, and the above stack trace is repeated
in the nova-compute logs.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
$ rpm -qa | grep nova
python-nova-16.1.1-1.el7.noarch
openstack-nova-common-16.1.1-1.el7.noarch
python2-novaclient-9.1.1-1.el7.noarch
openstack-nova-api-16.1.1-1.el7.noarch
openstack-nova-compute-16.1.1-1.el7.noarch
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
$ rpm -qa | grep kvm
libvirt-daemon-kvm-3.2.0-14.el7_4.9.x86_64
qemu-kvm-common-ev-2.9.0-16.el7_4.14.1.x86_64
qemu-kvm-ev-2.9.0-16.el7_4.14.1.x86_64
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
Not sure
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
Neutron with Calico (I work on Calico, this is our CI system)
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1767139/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp