Reviewed:  https://review.openstack.org/50388
Committed: http://github.com/openstack/neutron/commit/99440a63af5a2c4c2e139036c42db5c64e9495b2
Submitter: Jenkins
Branch:    milestone-proposed

commit 99440a63af5a2c4c2e139036c42db5c64e9495b2
Author: Ralf Haferkamp <[email protected]>
Date:   Thu Aug 29 20:50:55 2013 +0200

    Avoid race with udev during ovs agent startup

    After taking down the veth link between the physical bridge and the
    integration bridge, call udevadm settle to wait for any udev events to
    be completely processed by the operating system before recreating the
    veth pair.

    Some distributions (e.g. openSUSE) have udev rules installed by default
    that call e.g. ifdown <interface> during the remove event. If that is
    processed after the ovs agent has already brought up the veth pair
    again, the veth pair's link will be down after the agent has completed
    startup, and networking will be broken for all VM instances.

    Change-Id: I95520ea96a9804c5261a0c994bbca137535cc37c
    Closes-Bug: #1218556
    (cherry picked from commit 8d88ee7411d43f148b45d0a145fe32a75765a3ac)

** Changed in: neutron
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1218556

Title:
  veth pair connecting between physical and integration bridge down
  after ovs agent restart

Status in OpenStack Neutron (virtual network service):
  Fix Released

Bug description:
  Sometimes after restarting the openvswitch-agent, the veth pair that
  connects the physical bridge with the integration bridge doesn't come
  up correctly (which of course disconnects any running VM instance from
  the network).

  # /etc/init.d/openstack-neutron-openvswitch-agent restart
  # ip addr show
  [..]
  83: phy-br-eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
      link/ether 3a:6c:d6:a4:1c:89 brd ff:ff:ff:ff:ff:ff
  84: int-br-eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
      link/ether a2:12:2a:e5:b8:e4 brd ff:ff:ff:ff:ff:ff
  [..]

  I was able to reproduce this problem on openSUSE 12.3 and SLES 11.
  Ubuntu seems to be unaffected by this.

  Doing a manual "ip link set up dev <device>" on both ends of the veth
  pair fixes the problem (until another restart might bring it back).

  I think I was able to track this down to a race condition between udev
  (and its network rules) and the ip commands that the openvswitch-agent
  runs during startup. Among other things, the agent does this during
  startup:

  ip link delete int-br-fixed
  ip link add int-br-fixed type veth peer name phy-br-fixed
  ip link set int-br-fixed up
  ip link set phy-br-fixed up

  The ip link delete and ip link add commands cause several udev events
  to be fired. However, on my system the processing of the udev rules
  takes so long that the "remove" events are not completely processed
  before the ip link add command is started. As a result, the interface
  is down after the above commands have completed.

  A possible fix for this is to call "udevadm settle" after the ip link
  delete call. I will upload a draft patch for review shortly.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1218556/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
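
For illustration, the ordering proposed in the bug description (delete the veth device, wait for udev to drain its event queue, then recreate the pair and bring it up) could be sketched roughly as below. This is not the code from the committed patch. The function name `recreate_veth`, the injectable `run` parameter, and the 10-second settle timeout are assumptions made for this sketch; only the `ip link` commands and the `udevadm settle` call come from the report.

```python
import subprocess


def recreate_veth(int_name, phys_name, run=subprocess.check_call,
                  settle_timeout=10):
    """Recreate a veth pair, waiting for udev between delete and add.

    Hypothetical helper for illustration only. `run` is called once per
    command with an argv list; it defaults to subprocess.check_call, which
    raises CalledProcessError on a non-zero exit status (for example if
    the device to delete does not exist).
    """
    commands = [
        # Remove the old link; this fires udev "remove" events.
        ["ip", "link", "delete", int_name],
        # Block until udev has processed all queued events (or time out),
        # so that a late "ifdown" from a remove rule cannot hit the new pair.
        ["udevadm", "settle", "--timeout=%d" % settle_timeout],
        # Recreate the pair and bring both ends up.
        ["ip", "link", "add", int_name, "type", "veth",
         "peer", "name", phys_name],
        ["ip", "link", "set", int_name, "up"],
        ["ip", "link", "set", phys_name, "up"],
    ]
    for cmd in commands:
        run(cmd)
    return commands


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    recreate_veth("int-br-fixed", "phy-br-fixed", run=print)
```

Passing `run` as a parameter keeps the sketch testable without root privileges; a real agent would execute the commands through its own privileged helper.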

