*** This bug is a duplicate of bug 1934937 ***
https://bugs.launchpad.net/bugs/1934937
This is a duplicate of [1] (see also [2] and [3]). For non-WSGI
services, the "oslo_messaging_rabbit.heartbeat_in_pthread" option should
be set to "False".
[1] https://bugs.launchpad.net/oslo.messaging/+bug/1934937
[2] https://bugs.launchpad.net/tripleo/+bug/1984076
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2115383
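As a rough illustration of the workaround suggested above, the option would live in the [oslo_messaging_rabbit] section of the agent's configuration, presumably /etc/neutron/neutron.conf, which is one of the files passed via --config-file in the report. The sketch below shows the intended fragment and a small check for it. The helper name heartbeat_in_pthread_enabled and the SAMPLE_CONF text are hypothetical, and Python's configparser only approximates how oslo.config parses files, so treat the result as a hint rather than a definitive answer.

```python
import configparser

# Hypothetical fragment showing the suggested setting. The section and option
# names come from the comment above; which file holds it is an assumption.
SAMPLE_CONF = """
[oslo_messaging_rabbit]
heartbeat_in_pthread = False
"""


def heartbeat_in_pthread_enabled(conf_text):
    """Return True or False if the option is set, or None if it is absent."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in other values from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    section = "oslo_messaging_rabbit"
    if not parser.has_option(section, "heartbeat_in_pthread"):
        return None
    return parser.getboolean(section, "heartbeat_in_pthread")


if __name__ == "__main__":
    print(heartbeat_in_pthread_enabled(SAMPLE_CONF))
```

To check a real file, pass its contents instead of SAMPLE_CONF, for example the text read from /etc/neutron/neutron.conf. A result of None means the option is not set explicitly, so the release default applies.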
** Bug watch added: Red Hat Bugzilla #2115383
https://bugzilla.redhat.com/show_bug.cgi?id=2115383
** This bug has been marked a duplicate of bug 1934937
Heartbeat in pthreads in nova-wallaby crashes with greenlet error
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1965140
Title:
Eventlet fails when starting network agents
Status in neutron:
Incomplete
Bug description:
I have a two-node OpenStack setup, where one node (w3) runs all
controller services (keystone, glance, placement, nova, neutron,
horizon, cinder) as well as nova-compute and cinder-volume, and the
second (w6) runs nova-compute and the linuxbridge agent.
All network agents on w3 are dead:
[root@w3 ~]# openstack network agent list
+--------------------------------------+--------------------+---------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host          | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+---------------+-------------------+-------+-------+---------------------------+
| 330269b7-b73c-4207-abc7-21f1a2972b7b | Linux bridge agent | w6.int.lunarc | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 83d16241-8a3a-42b0-beda-87246d945dc1 | L3 agent           | w3.int.lunarc | nova              | XXX   | UP    | neutron-l3-agent          |
| a52ab60f-d893-491d-a43e-823a0d482810 | Linux bridge agent | w3.int.lunarc | None              | XXX   | UP    | neutron-linuxbridge-agent |
| abd75644-d895-41ae-94fa-6c4351cbc4bf | Metadata agent     | w3.int.lunarc | None              | XXX   | UP    | neutron-metadata-agent    |
| c05c65bc-779e-4fe5-a19e-350c44900be4 | DHCP agent         | w3.int.lunarc | nova              | XXX   | UP    | neutron-dhcp-agent        |
+--------------------------------------+--------------------+---------------+-------------------+-------+-------+---------------------------+
I can no longer start them. I tried restarting each agent on its own,
restarting all OpenStack daemons on w3, and even rebooting the whole
node, but nothing helps: I always get the same issue and the same
trace, shown below.
I could not find any useful info in the logs, but systemd does report an
issue with eventlet/greenlet:
[root@w3 ~]# journalctl -fu neutron-linuxbridge-agent
-- Logs begin at Wed 2022-03-16 04:32:31 EDT. --
Mar 16 09:14:09 w3.int.lunarc sudo[37085]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap /etc/neutron/rootwrap.conf privsep-helper --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/linuxbridge_agent.ini --config-dir /etc/neutron/conf.d/neutron-linuxbridge-agent --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmp93tzwqg3/privsep.sock
Mar 16 09:14:12 w3.int.lunarc sudo[37107]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap /etc/neutron/rootwrap.conf privsep-helper --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/linuxbridge_agent.ini --config-dir /etc/neutron/conf.d/neutron-linuxbridge-agent --privsep_context neutron.privileged.link_cmd --privsep_sock_path /tmp/tmp81iy5eni/privsep.sock
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]: Traceback (most recent call last):
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:   File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 476, in fire_timers
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:     timer()
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:   File "/usr/lib/python3.6/site-packages/eventlet/hubs/timer.py", line 59, in __call__
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:     cb(*args, **kw)
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:   File "/usr/lib/python3.6/site-packages/eventlet/semaphore.py", line 152, in _do_acquire
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]:     waiter.switch()
Mar 16 09:14:15 w3.int.lunarc neutron-linuxbridge-agent[37073]: greenlet.error: cannot switch to a different thread
I am running OpenStack Xena on a freshly installed CentOS Stream 8. Here are
other details:
[root@w3 ~]# uname -a
Linux w3.int.lunarc 4.18.0-365.el8.x86_64 #1 SMP Thu Feb 10 16:11:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Any clue on how I can find out what makes this happen, or just how I
can get past this crippling greenlet/eventlet error, and get these
agents to run again?