Reviewed:  https://review.opendev.org/c/openstack/neutron/+/839671
Committed: https://opendev.org/openstack/neutron/commit/5288593fafe6636fc14b8873465866d20de26935
Submitter: "Zuul (22348)"
Branch:    master

commit 5288593fafe6636fc14b8873465866d20de26935
Author: Damian Dabrowski <[email protected]>
Date:   Thu Apr 28 02:54:25 2022 +0200

    [L3-HA] Disable automatic link-local address assignment for HA routers

    In order to get both [1] and [2] fixed, we set
    `net.ipv6.conf.all.addr_gen_mode=1` in HA router namespace to prevent
    auto-assigning link-local address(lla) to the interfaces.
    We don't need lla auto-assignment as keepalived manages them.
    With this change, we will have link-local addresses only on active
    router, which will prevent 'dadfailed' and MLD packets will not be
    sent from standby router.

    Previously we also reverted [3] to always keep qg-* interface up on
    both active&standby router's instance, no matter if keepalived is
    started or not.
    Without link-local address assigned, backup router's instance won't
    send any packets, so I see no reason to keep qg-* interface down.

    [1] https://bugs.launchpad.net/neutron/+bug/1952907
    [2] https://bugs.launchpad.net/neutron/+bug/1859832
    [3] https://review.opendev.org/c/openstack/neutron/+/834162

    Closes-Bug: #1952907
    Related-Bug: #1859832
    Depends-On: https://review.opendev.org/c/openstack/neutron/+/834162
    Change-Id: I306f14aa6b7e8bb69a81f441be337bc1a584d3b2

** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1952907

Title:
  Gratuitous ARPs are not sent during master transition

Status in neutron:
  Fix Released

Bug description:
  * High level description:

  When a router transitions to MASTER state, keepalived should send
  GARPs, but it fails because the qg-* interface is down (it comes up
  about 1 second later, so this may be a race condition).
  Keepalived should also send more GARPs after 60 seconds
  (garp_master_delay), but it doesn't (probably because the first ones
  fail, but I'm not 100% sure).

  When I add a random port to this router to trigger a keepalived
  reload, all GARPs are sent properly (because the netns is already
  configured and the qg-* interface stays up the whole time).

  * Pre-conditions:

  Operating System: Ubuntu 20.04
  Keepalived version: 2.0.19
  Affected neutron releases:
  - my AIO env: Xena (master/106fa3e6d3f0b1c32ef28fe9dd6b125b9317e9cf # HEAD as of 29.09.2021)
  - my prod env: Victoria
  - (most likely all versions after this change https://review.opendev.org/c/openstack/neutron/+/707406)

  * Step-by-step reproduction:

  Simply perform a failover on an HA router. The same result can also
  be achieved by removing all l3 agents from the router and then
  adding one back:

  # openstack router create neutron-bug --ha
  # openstack router set --external-gateway public neutron-bug
  # neutron l3-agent-list-hosting-router neutron-bug
  # (for all l3 agents): neutron l3-agent-router-remove L3_AGENT_ID neutron-bug
  # (for a single l3 agent): neutron l3-agent-router-add L3_AGENT_ID neutron-bug
  (GARPs are not sent)
  # openstack router add port neutron-bug test-port
  (GARPs are sent properly)

  * Expected output:

  Gratuitous ARPs should be sent from the router's namespace during
  MASTER transition.

  * Actual output:

  Gratuitous ARPs are not sent. Keepalived complains about:
  Error 100 (Network is down) sending gratuitous ARP on qg-4a2f0239-5c for 172.29.249.194

  The qg-* interface comes up about 1 second after keepalived tries to
  send GARPs.

  * Root cause

  Currently neutron keeps the qg- interface down for BACKUP agents:
  https://review.opendev.org/c/openstack/neutron/+/707406
  Keepalived's MASTER transition takes place before
  keepalived-state-change notifies neutron-l3-agent about the state
  change. As a result, neutron-l3-agent brings the qg- interface up
  only after keepalived's MASTER transition, so keepalived can't send
  GARPs during the transition because the qg- interface is still down.

  * Proposed solutions

  1. Revert https://review.opendev.org/c/openstack/neutron/+/707406 and always keep qg- interfaces up
  I'm not sure, but maybe we don't need the above change anymore because it was fixed in keepalived:
  https://github.com/acassen/keepalived/commit/b10bbfc2a2b216487cea5a586c55765275e41253

  2. Send delayed GARPs by keepalived_state_change.py
  Change proposal: https://review.opendev.org/c/openstack/neutron/+/821433

  3. Send GARPs also for FIPs (like it's done for non-HA routers by ./agent/l3/legacy_router.py)
  Change proposal: https://review.opendev.org/c/openstack/neutron/+/821434

  P.S. Since solutions 2 and 3 only send GARPs, we may also need to
  fix IPv6's NDP. Besides ARPs, keepalived also fails to send
  unsolicited neighbor advertisements. I'm not sure about it though;
  I don't know much about IPv6.

  * Attachments:

  Keepalived logs: https://paste.openstack.org/raw/811372/
  Interfaces inside router's netns + tcpdump from master transition: https://paste.openstack.org/raw/811373/
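
Solution 2 in the bug description amounts to announcing the address only once the qg- interface is actually up, instead of at the moment of the MASTER transition. The following is a minimal sketch of that idea, not neutron's or keepalived's code. The names send_garp_when_up, link_is_up and send_garp are hypothetical, and the two callbacks stand in for whatever real link check and GARP sender would be used.

```python
import time
from typing import Callable


def send_garp_when_up(
    link_is_up: Callable[[], bool],
    send_garp: Callable[[], None],
    timeout: float = 5.0,
    interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the link is up, then send one gratuitous ARP.

    link_is_up and send_garp are hypothetical callbacks supplied by the
    caller. Returns True if the GARP was sent, False if the link stayed
    down until the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        if link_is_up():
            # The interface is usable, so the announcement cannot fail
            # with "Network is down" for this reason.
            send_garp()
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With the roughly 1 second delay reported above, a timeout of a few seconds would likely be enough, but that value is a guess and would need to be checked against a real deployment.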


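The committed fix sets net.ipv6.conf.all.addr_gen_mode=1 inside the HA router namespace. As a rough illustration of what that means at the command level, the sketch below only builds the equivalent `ip netns exec ... sysctl -w ...` argument list. It is not the code from the change: addr_gen_mode_cmd is a hypothetical helper and "qrouter-EXAMPLE" is a placeholder namespace name.

```python
from typing import List

# Kernel value 1 for addr_gen_mode: do not generate a link-local address.
ADDR_GEN_MODE_NONE = 1


def addr_gen_mode_cmd(namespace: str, mode: int = ADDR_GEN_MODE_NONE) -> List[str]:
    """Build the argv that sets addr_gen_mode inside a network namespace.

    Hypothetical helper for illustration; it only constructs the command
    and does not execute anything.
    """
    return [
        "ip", "netns", "exec", namespace,
        "sysctl", "-w", f"net.ipv6.conf.all.addr_gen_mode={mode}",
    ]


if __name__ == "__main__":
    # "qrouter-EXAMPLE" is a placeholder, not a real namespace.
    print(" ".join(addr_gen_mode_cmd("qrouter-EXAMPLE")))
```

Actually running the command needs root privileges, for example via subprocess.run(cmd, check=True). Keeping the builder separate from execution makes it easy to inspect the exact command first.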