Reviewed:  https://review.opendev.org/c/openstack/nova/+/602432
Committed: https://opendev.org/openstack/nova/commit/a62dd42c0dbb6b2ab128e558e127d76962738446
Submitter: "Zuul (22348)"
Branch:    master

commit a62dd42c0dbb6b2ab128e558e127d76962738446
Author: Stephen Finucane <[email protected]>
Date:   Fri Apr 30 12:51:35 2021 +0100

libvirt: Delegate OVS plug to os-vif

os-vif 1.15.0 added the ability to create an OVS port during plugging by
specifying the 'create_port' attribute in the 'port_profile' field. By
delegating port creation to os-vif, we can rely on its 'isolate_vif'
config option [1] that will temporarily configure the VLAN to 4095
(0xfff), which is reserved for implementation use [2] and is used by
neutron as a dead VLAN [3]. By doing this, we ensure VIFs are plugged
securely, preventing guests from accessing other tenants' networks
before the neutron OVS agent can wire up the port.

This change requires a little dance as part of the live migration flow.
Since we can't be certain the destination host has a version of os-vif
that supports this feature, we need to use a sentinel to indicate when
it does. Typically we would do so with a field in
'LibvirtLiveMigrateData', such as the 'src_supports_numa_live_migration'
and 'dst_supports_numa_live_migration' fields used to indicate support
for NUMA-aware live migration. However, doing this prevents us
backporting this important fix since o.vo changes are not backportable.
Instead, we (somewhat evilly) rely on the free-form nature of the
'VIFMigrateData.profile_json' string field, which stores JSON blobs and
is included in 'LibvirtLiveMigrateData' via the 'vifs' attribute, to
transport this sentinel. This is a hack but is necessary to work around
the lack of a free-form "capabilities" style dict that would allow us to
do backportable fixes to live migration features.

Note that this change has the knock-on effect of modifying the XML
generated for OVS ports: when hybrid plug is false, the interface will
now be of type 'ethernet' rather than 'bridge' as before. This explains
the larger than expected test damage but should not affect users.

[1] https://opendev.org/openstack/os-vif/src/tag/2.4.0/vif_plug_ovs/ovs.py#L90-L93
[2] https://en.wikipedia.org/wiki/IEEE_802.1Q#Frame_format
[3] https://answers.launchpad.net/neutron/+question/231806

Change-Id: I11fb5d3ada7f27b39c183157ea73c8b72b4e672e
Depends-On: Id12486b3127ab4ac8ad9ef2b3641da1b79a25a50
Closes-Bug: #1734320
Closes-Bug: #1815989

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1815989

Title:
  OVS drops RARP packets by QEMU upon live-migration causes up to 40s
  ping pause in Rocky

Status in neutron:
  In Progress
Status in OpenStack Compute (nova):
  Fix Released
Status in os-vif:
  Invalid

Bug description:
  This issue is well known, and there were previous attempts to fix it,
  like this one:

  https://bugs.launchpad.net/neutron/+bug/1414559

  The issue still exists in Rocky and has become worse. In Rocky, nova
  compute, nova libvirt and the neutron ovs agent all run inside
  containers.

  So far the only simple fix I have is to increase the number of RARP
  packets QEMU sends after live-migration from 5 to 10. For
  completeness, the nova change (not merged) proposed in the
  above-mentioned activity does not work.

  I am creating this ticket hoping to get up-to-date (for Rocky and
  onwards) expert advice on how to fix this in nova-neutron.

  For the record, below are the timestamps in my test between the
  neutron ovs agent "activating" the VM port and the RARP packets seen
  by tcpdump on the compute. 10 RARP packets are sent by (recompiled)
  QEMU, 7 are seen by tcpdump, and the 2nd last packet barely made it
  through.

  openvswitch-agent.log:

  2019-02-14 19:00:13.568 73453 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-26129036-b514-4fa0-a39f-a6b21de17bb9 - - - - -] Port 57d0c265-d971-404d-922d-963c8263e6eb updated. Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [], 'admin_state_up': True, 'network_id': '1bf4b8e0-9299-485b-80b0-52e18e7b9b42', 'segmentation_id': 648, 'fixed_ips': [ {'subnet_id': 'b7c09e83-f16f-4d4e-a31a-e33a922c0bac', 'ip_address': '10.0.1.4'} ], 'device_owner': u'compute:nova', 'physical_network': u'physnet0', 'mac_address': 'fa:16:3e:de:af:47', 'device': u'57d0c265-d971-404d-922d-963c8263e6eb', 'port_security_enabled': True, 'port_id': '57d0c265-d971-404d-922d-963c8263e6eb', 'network_type': u'vlan', 'security_groups': [u'5f2175d7-c2c1-49fd-9d05-3a8de3846b9c']}
  2019-02-14 19:00:13.568 73453 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-26129036-b514-4fa0-a39f-a6b21de17bb9 - - - - -] Assigning 4 as local vlan for net-id=1bf4b8e0-9299-485b-80b0-52e18e7b9b42

  tcpdump for rarp packets:

  [root@overcloud-ovscompute-overcloud-0 nova]# tcpdump -i any rarp -nev
  tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
  19:00:10.788220 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:11.138216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:11.588216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:12.138217 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:12.788216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:13.538216 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
  19:00:14.388320 B fa:16:3e:de:af:47 ethertype Reverse ARP (0x8035), length 62: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is fa:16:3e:de:af:47 tell fa:16:3e:de:af:47, length 46
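
The timestamps quoted in the bug description can be lined up with a few lines of Python. This is only a rough sketch of the arithmetic: it assumes the agent log and the tcpdump capture were taken against the same host clock, and it says nothing about which packets OVS actually forwarded. The function names are made up for this illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpts in the bug description.
PORT_UPDATED = "19:00:13.568"  # ovs agent "Port ... updated" line
RARP_SEEN = [
    "19:00:10.788220",
    "19:00:11.138216",
    "19:00:11.588216",
    "19:00:12.138217",
    "19:00:12.788216",
    "19:00:13.538216",
    "19:00:14.388320",
]


def parse(ts):
    """Parse an HH:MM:SS.fraction timestamp (the date part is irrelevant here)."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def offsets_ms(events, reference):
    """Offset of each event from the reference, in ms (negative = earlier)."""
    ref = parse(reference)
    return [round((parse(e) - ref).total_seconds() * 1000, 3) for e in events]


def gaps_ms(events):
    """Interval between consecutive events, in ms."""
    times = [parse(e) for e in events]
    return [round((b - a).total_seconds() * 1000, 3)
            for a, b in zip(times, times[1:])]


offsets = offsets_ms(RARP_SEEN, PORT_UPDATED)
gaps = gaps_ms(RARP_SEEN)
after_update = [o for o in offsets if o > 0]

print("offset from port update (ms):", offsets)
print("gap between packets (ms):   ", gaps)
print("captured after port update: ", len(after_update), "of", len(offsets))
```

With these inputs, only the last captured packet carries a timestamp later than the agent's "updated" line; the one before it is about 30 ms earlier, which matches the reporter's remark that it barely made it through.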

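
The sentinel approach described in the commit message above (carrying a capability flag inside the free-form 'VIFMigrateData.profile_json' string) can be sketched roughly as follows. This is not nova's code: the key name and both helper functions are hypothetical and chosen for illustration only, and plain JSON strings stand in for the real objects. The point is that a peer which never sets the key is treated as not supporting the feature.

```python
import json

# Hypothetical key name, used for this sketch only.
SENTINEL_KEY = "supports_os_vif_delegation"


def advertise_support(profile_json):
    """Return the JSON profile string with the sentinel added.

    Existing keys in the profile are preserved.
    """
    profile = json.loads(profile_json) if profile_json else {}
    profile[SENTINEL_KEY] = True
    return json.dumps(profile)


def peer_supports_delegation(profile_json):
    """True only if the peer explicitly set the sentinel to true."""
    if not profile_json:
        return False
    return json.loads(profile_json).get(SENTINEL_KEY, False) is True


# A destination that knows about the feature tags the profile...
tagged = advertise_support('{"example": 1}')
# ...while an older destination leaves the profile untouched.
untagged = '{"example": 1}'

print(peer_supports_delegation(tagged))
print(peer_supports_delegation(untagged))
```

Because the flag travels inside an existing string field, no versioned-object schema change is needed, which is the backportability argument the commit message makes. The cost is that the flag is invisible to schema validation, hence "hack".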
