Public bug reported: It happens in the CI from time to time that many tests are failing because router is in backup state all the time and it's never transitioned to be primary on the node.
Examples of the failure: https://3142cc95d58eb8a4ee07-043369ac575bbfe29758366f4ba498a1.ssl.cf1.rackcdn.com/765072/8/check/neutron-tempest-plugin-scenario-openvswitch/499b47d/controller/logs/screen-q-l3.txt https://6599da62140c9583e14a-cd7f53ffbb0b86c69deae453da021fe8.ssl.cf5.rackcdn.com/811746/4/check/neutron- tempest-plugin-scenario-openvswitch/3cafcd7/testr_results.html https://zuul.opendev.org/t/openstack/build/75c056464b6f445ebde18c1b07f5bcce Example of stacktrace: Traceback (most recent call last): File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 80, in wait_until_true eventlet.sleep(sleep) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/greenthread.py", line 36, in sleep hub.switch() File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch return self.greenlet.switch() eventlet.timeout.Timeout: 600 seconds During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/test_basic.py", line 35, in test_basic_instance self.setup_network_and_server() File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 281, in setup_network_and_server router = self.create_router_by_client(**kwargs) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 209, in create_router_by_client cls._wait_for_router_ha_active(router['id']) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 228, in _wait_for_router_ha_active utils.wait_until_true(_router_active_on_l3_agent, File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 84, in wait_until_true raise exception tempest.lib.exceptions.TimeoutException: Request timed out Details: Router 1c4ce297-5a04-4794-9720-20fdec9ca4e5 is not active on any of the L3 agents ** Affects: neutron Importance: High Status: Confirmed ** Tags: gate-failure l3-ha -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1946187 Title: HA routers not going to be "primary" at all Status in neutron: Confirmed Bug description: It happens in the CI from time to time that many tests are failing because router is in backup state all the time and it's never transitioned to be primary on the node. Examples of the failure: https://3142cc95d58eb8a4ee07-043369ac575bbfe29758366f4ba498a1.ssl.cf1.rackcdn.com/765072/8/check/neutron-tempest-plugin-scenario-openvswitch/499b47d/controller/logs/screen-q-l3.txt https://6599da62140c9583e14a-cd7f53ffbb0b86c69deae453da021fe8.ssl.cf5.rackcdn.com/811746/4/check/neutron- tempest-plugin-scenario-openvswitch/3cafcd7/testr_results.html https://zuul.opendev.org/t/openstack/build/75c056464b6f445ebde18c1b07f5bcce Example of stacktrace: Traceback (most recent call last): File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 80, in wait_until_true eventlet.sleep(sleep) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/greenthread.py", line 36, in sleep hub.switch() File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch return self.greenlet.switch() eventlet.timeout.Timeout: 600 seconds During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/test_basic.py", line 35, in test_basic_instance self.setup_network_and_server() File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 281, in setup_network_and_server router = self.create_router_by_client(**kwargs) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 209, in create_router_by_client cls._wait_for_router_ha_active(router['id']) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/scenario/base.py", line 228, in _wait_for_router_ha_active utils.wait_until_true(_router_active_on_l3_agent, File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/neutron_tempest_plugin/common/utils.py", line 84, in wait_until_true raise exception tempest.lib.exceptions.TimeoutException: Request timed out Details: Router 1c4ce297-5a04-4794-9720-20fdec9ca4e5 is not active on any of the L3 agents To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1946187/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : [email protected] Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp

