Reviewed: https://review.opendev.org/667177
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b
Submitter: Zuul
Branch: master

commit 7a7a223602ca5aa0aca8f65a6ab143f1d8f8ec1b
Author: Artom Lifshitz <[email protected]>
Date: Wed Mar 20 10:38:12 2019 -0400

Revert resize: wait for events according to hybrid plug

Since 4817165fc5938a553fafa1a69c6086f9ebe311af, when reverting a resized instance back to the source host, the libvirt driver waits for vif-plugged events when spawning the instance. When called from finish_revert_resize() in the source compute manager, libvirt's finish_revert_migration() does not pass vifs_already_plugged to _create_domain_and_network(), making the latter use the default False value.

When the source compute manager calls network_api.migrate_instance_finish() in finish_revert_resize(), this updates the port binding back to the source host. If Neutron is configured to use OVS hybrid plug, it will send the vif-plugged event immediately after completing this request. This happens before the virt driver's finish_revert_migration() method is called. This causes the wait in the libvirt driver to time out because the event is received before Nova starts waiting for it.

The neutron ovs l2 agent sends vif-plugged events when two conditions are met. First the port must be bound to the host managed by the l2 agent and second, the agent must have completed configuring the port on ovs. This involves assigning the port a local VLAN for tenant isolation, applying security group rules if required and applying QoS policies or other agent extensions like service function chaining.

During the boot process, we bind the port first to the host then plug the interface into ovs which triggers the l2 agent to configure it resulting in the emission of the vif-plugged event.

In the revert case, as noted above, since the vif is already plugged on the source node when hybrid-plug is used, binding the port to the source node fulfils the second condition to send the vif-plugged event. Events sent immediately after port binding update are hereafter known as "bind-time" events.

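[Editor's illustration, not part of the commit message] The ordering problem described above can be sketched with a toy model. Everything here is hypothetical: EventDispatcher, prepare(), deliver() and revert() are illustrative names, not Nova or Neutron APIs, and the sketch assumes that an event arriving with no registered waiter is simply dropped.

```python
import threading


class EventDispatcher:
    """Toy stand-in for a service that hands external events to waiters.

    Assumption: an event with no registered waiter is dropped.
    """

    def __init__(self):
        self._waiters = {}
        self.dropped = []

    def prepare(self, event_name):
        # Register interest in an event and return an object to wait on.
        waiter = threading.Event()
        self._waiters[event_name] = waiter
        return waiter

    def deliver(self, event_name):
        # Hand the event to its waiter, or drop it if nobody registered.
        waiter = self._waiters.pop(event_name, None)
        if waiter is None:
            self.dropped.append(event_name)
            return False
        waiter.set()
        return True


def revert(dispatcher, register_before_bind, timeout=0.05):
    """Simulate a revert in which the event fires at bind time.

    Returns True if the wait completed, False if it timed out.
    """
    name = "network-vif-plugged:port-1"  # hypothetical event key
    waiter = dispatcher.prepare(name) if register_before_bind else None

    # Updating the port binding: in the bind-time case the event is
    # emitted immediately, before the VIF plugging step is reached.
    dispatcher.deliver(name)

    if waiter is None:
        # Registering only at plug time is too late, the event is gone.
        waiter = dispatcher.prepare(name)
    return waiter.wait(timeout)


if __name__ == "__main__":
    print("wait registered at plug time:", revert(EventDispatcher(), False))
    print("wait registered at bind time:", revert(EventDispatcher(), True))
```

In this model the wait only succeeds when the waiter is registered before the binding update, which is the reason a bind-time event has to be waited for around the binding call rather than around VIF plugging.
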
For ports that do not use OVS hybrid plug, Neutron will continue to send vif-plugged events only when Nova actually plugs the VIF. These types of events are hereafter known as "plug-time" events. OVS hybrid plug is a per agent setting, so for a particular host, bind-time events are an all-or-nothing thing for the ovs backend: either all VIF_TYPE=ovs ports have them, or no ovs ports have them. In general, a host will only have one network backend. The only exception to this is SR-IOV. SR-IOV is commonly deployed on the same host as other network backends such as OVS or linuxbridge. SR-IOV ports with VNIC_TYPE=direct-physical will always have only bind-time events. If an instance mixes OVS ports with hybrid-plug=False with direct physical ports, it will have both kinds of events.

For same host resize reverts we do not update the binding host as the host does not change, as such for same host resize we do not receive bind time events. For same host revert we therefore do not wait for bind time events in the compute manager.

This patch adds functions to the NetworkInfo model that return what kinds of events each VIF has. These are then used in the migration revert logic to decide when to wait for external events: in the compute manager, when binding the port, for bind-time events, and/or in libvirt, when plugging the VIFs, for plug-time events.

Closes-bug: #1832028
Closes-Bug: #1833902
Co-Authored-By: Sean Mooney <[email protected]>
Change-Id: I51673e58fc8d5f051df911630f6d7a928d123a5b

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1832028

Title:
  revert resize: vif-plugged external event sent too soon if Neutron is using OVS hybrid plug

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  In Progress

Bug description:
  Description
  ===========
  This is all only when Neutron is using OVS with hybrid plugging.

  When reverting a resized instance back to its original source host, Nova will timeout waiting for the vif-plugged external event, and never finish the revert. This happens because the event is sent by Neutron as soon as Nova updates the port binding to point back to the original source. This happens before the virt driver gets ready to listen for external events, so the event arrives, just too soon, and Nova times out.

  Steps to reproduce
  ==================
  1. Resize an instance
  2. When it's in VERIFY_RESIZE, revert it

  Expected result
  ===============
  Instance reverts correctly.

  Actual result
  =============
  Instance goes to ERROR.

  Environment
  ===========
  OVS with hybrid plug. Reported in OSP14/Rocky [1], reproduced on master [2] [3].

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1678681
  [2] https://review.opendev.org/#/c/660782/
  [3] https://review.opendev.org/#/c/653498/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1832028/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

