Public bug reported:

This issue impacts current master, stable/rocky and stable/queens.

The first symptom is that many tests from legacy-tempest-dsvm-networking-bgpvpn-bagpipe have been failing since the merge of [1] in neutron code (August 14th).

Background:

networking-bagpipe code for BGPVPN has a "router fallback" mechanism: when a network is both connected to a Router and associated to a BGPVPN, the traffic sent by a VM to its gateway is redirected to br-mpls to attempt BGPVPN route matching, before eventually being sent, as a fallback, to the neutron netns router if no VPN route was matched in br-mpls.

For this mechanism to work, a rule is in place in table 91 to override the NORMAL action (which would result in flood/learn) for the traffic destined to the gateway MAC address, with a higher priority rule that sends the traffic to br-tun instead (br-tun is where the redirection to br-mpls takes place):

 cookie=0x8b0cf47ac991c941, duration=5371.870s, table=91, n_packets=217, n_bytes=21266, priority=2,reg6=0x18,dl_dst=fa:16:3e:c5:89:72 actions=mod_vlan_vid:24,output:"patch-tun"
 cookie=0x89f8a81c314f2696, duration=71265.896s, table=91, n_packets=338, n_bytes=27094, priority=1 actions=NORMAL

(above, fa:16:3e:c5:89:72 is the gateway MAC address for the network with vlan_id 24)

Analysis of the issue:

Change [1] replaced some rules that were resubmitting to table 91 with a NORMAL action, so that only the first packets of a connection (from a conntrack standpoint) reach table 91. This prevents the redirection of traffic to br-tun,br-mpls.

The tricky thing is that the issue does not always occur: when there is no entry in the MAC learning table (ovs-appctl fdb/show br-int) for the gateway MAC, the traffic is flooded and eventually reaches br-tun,br-mpls. This explains why some tests, but not all tests, fail. (Note also that the tests where no Router is used in the destination network do not seem to fail.)

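To make the analysis above easier to follow, here is a toy model in Python of the two behaviours involved: the priority-based lookup in table 91, and the flood-or-unicast decision of the NORMAL action. It is only a sketch and not Open vSwitch or networking-bagpipe code. The helper names (lookup, normal_action, reaches_patch_tun), the port name "qr-router" and the reduction of a packet to a small dict are assumptions made for illustration; the two rules and the gateway MAC come from the flow dump above.

```python
"""Toy model of the table 91 lookup and of the NORMAL action on br-int.

Illustration only: all helper names and the "qr-router" port are hypothetical.
"""

GATEWAY_MAC = "fa:16:3e:c5:89:72"  # gateway MAC taken from the flow dump

# The two table 91 rules from the dump, reduced to the fields that matter here.
TABLE_91 = [
    {"priority": 2,
     "match": {"reg6": 0x18, "dl_dst": GATEWAY_MAC},
     "actions": "mod_vlan_vid:24,output:patch-tun"},
    {"priority": 1, "match": {}, "actions": "NORMAL"},
]


def lookup(table, packet):
    """Return the actions of the highest-priority rule matching the packet."""
    for rule in sorted(table, key=lambda r: r["priority"], reverse=True):
        if all(packet.get(k) == v for k, v in rule["match"].items()):
            return rule["actions"]
    return "drop"


def normal_action(fdb, packet, ports):
    """Simplified NORMAL: unicast to the learned port, otherwise flood."""
    learned = fdb.get(packet["dl_dst"])
    return [learned] if learned else list(ports)


def reaches_patch_tun(packet, fdb, ports, via_table_91):
    """Tell whether the packet is sent to patch-tun (and so on to br-mpls).

    via_table_91=True models the old rules (resubmit to table 91);
    via_table_91=False models packets handled directly by NORMAL.
    """
    if via_table_91:
        actions = lookup(TABLE_91, packet)
        if actions != "NORMAL":
            return "patch-tun" in actions
    return "patch-tun" in normal_action(fdb, packet, ports)


if __name__ == "__main__":
    pkt = {"reg6": 0x18, "dl_dst": GATEWAY_MAC}
    ports = ["patch-tun", "qr-router"]  # candidate output ports (hypothetical)
    for fdb in ({}, {GATEWAY_MAC: "qr-router"}):
        for via in (True, False):
            print(f"fdb={fdb} via_table_91={via} ->",
                  reaches_patch_tun(pkt, fdb, ports, via))
```

In this model, a packet that goes through table 91 always reaches patch-tun, whatever the state of the MAC learning table. A packet handled directly by NORMAL reaches patch-tun only while the gateway MAC has not been learned, which is consistent with only some of the tests failing.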
[1] https://review.openstack.org/#/q/Ib6ced838a7ec6d5c459a8475318556001c31bdf

** Affects: networking-bagpipe
     Importance: High
         Status: Confirmed

** Affects: neutron
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1789878

Title:
  bgpvpn router fallback broken by change in neutron openvswitch firewall

Status in BaGPipe:
  Confirmed
Status in neutron:
  New

