** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1916024

Title: HA router master instance in error state because qg-xx interface is down

Status in neutron: Fix Released

Bug description:

BZ reference: https://bugzilla.redhat.com/show_bug.cgi?id=1929829

Sometimes a router is created with all the instances in standby mode
because the qg-xx interface is in down state and there isn't
connectivity:

(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+---------------------------+----------------+-------+----------+
| id                                   | host                      | admin_state_up | alive | ha_state |
+--------------------------------------+---------------------------+----------------+-------+----------+
| 3b93ec23-48fa-4847-bbb2-f8903e9865f9 | networker-1.redhat.local  | True           | :-)   | standby  |
| 41b8d1a8-4695-445a-916a-d12db523eb91 | controller-0.redhat.local | True           | :-)   | standby  |
| 4533bd88-d2d1-4320-9e39-6fcb2a5cc236 | networker-0.redhat.local  | True           | :-)   | standby  |
+--------------------------------------+---------------------------+----------------+-------+----------+
(overcloud) [stack@undercloud-0 ~]$

Steps to reproduce:
1. for i in $(seq 10); do ./create.sh $i; done
3. Check FIP connectivity to detect the error
4. for i in $(seq 10); do ./delete.sh $i; done

Scripts: http://paste.openstack.org/show/802777/

This seems to be a race condition between the L3 agent and keepalived
when configuring the qg-xxx interface:
- /var/log/messages: http://paste.openstack.org/show/802778/
- L3 agent logs: http://paste.openstack.org/show/802779/

When keepalived is setting the qg-xxx interface IP addresses, the
interface disappears from udev and reappears again (I don't know why
yet). The log in journalctl looks the same as when a new interface is
created.

Since [1], the L3 agent controls the GW interface status (up or down).
If the L3 agent does not link up the interface, the router namespace
won't be able to send/receive any traffic.

[1]https://review.opendev.org/q/I8dca2c1a2f8cb467cfb44420f0eea54ca0932b05

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1916024/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
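
The symptom in the bug description above (every agent hosting the router reports standby) can be checked mechanically while running the reproduction loop. What follows is a minimal sketch in Python, assuming the table layout matches the pasted `neutron l3-agent-list-hosting-router` output. The names `parse_ha_states`, `all_standby`, and `SAMPLE` are hypothetical, chosen for illustration, and are not part of neutron.

```python
# Sample text copied from the table in the bug description.
SAMPLE = """\
+--------------------------------------+---------------------------+----------------+-------+----------+
| id                                   | host                      | admin_state_up | alive | ha_state |
+--------------------------------------+---------------------------+----------------+-------+----------+
| 3b93ec23-48fa-4847-bbb2-f8903e9865f9 | networker-1.redhat.local  | True           | :-)   | standby  |
| 41b8d1a8-4695-445a-916a-d12db523eb91 | controller-0.redhat.local | True           | :-)   | standby  |
| 4533bd88-d2d1-4320-9e39-6fcb2a5cc236 | networker-0.redhat.local  | True           | :-)   | standby  |
+--------------------------------------+---------------------------+----------------+-------+----------+
"""


def parse_ha_states(table_text):
    """Return {host: ha_state} parsed from an ASCII table like SAMPLE."""
    states = {}
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts, and deprecation notices
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" row holds the column names
            continue
        row = dict(zip(header, cells))
        states[row["host"]] = row["ha_state"]
    return states


def all_standby(states):
    """True when at least one agent is listed and every one reports 'standby'."""
    return bool(states) and all(state == "standby" for state in states.values())
```

Calling `all_standby(parse_ha_states(output))` on the captured CLI output after each router creation would flag the routers that hit the race, without waiting for a FIP connectivity check to time out.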