Public bug reported:

If a live migration of an instance with an SRIOV port fails during the '_pre_live_migration' hook, the instance loses access to the network and duplicated port bindings are left behind in the database.
The instance regains connectivity on the source host after a reboot (I don't know whether there is another way to restore connectivity). As a side effect of this behavior, the pre-live migration cleanup hook also fails with:

PCI device 0000:3b:10.0 is in use by driver QEMU

[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by creating a directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for:
  libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001

Full stack trace[2]

[Expected]
VM connectivity is restored even if it gets a brief disconnection

[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power cycled

[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/

** Affects: neutron
     Importance: Undecided
         Status: New

You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1944619

Title:
  Instances with SRIOV ports loose access after failed live migrations

Status in neutron:
  New
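The log check from the reproduction steps can be sketched as a small filter. This is only an illustration: the helper name find_pci_in_use and the regular expression are assumptions built from the two error strings quoted in this report, not something taken from nova, neutron or libvirt, and the location of the log file depends on the deployment.

```python
import re

# Hypothetical pattern, derived only from the error text quoted in the report:
#   "PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001"
# The ", domain ..." suffix is optional because the cleanup-hook message
# quoted in the report does not include it.
PCI_IN_USE = re.compile(
    r"PCI device (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"is in use by driver (?P<driver>\w+)"
    r"(?:, domain (?P<domain>[\w-]+))?"
)


def find_pci_in_use(lines):
    """Return a (pci_address, driver, domain) tuple for each matching line.

    `lines` is any iterable of strings, for example an open log file.
    `domain` is None when the message does not name a libvirt domain.
    """
    hits = []
    for line in lines:
        match = PCI_IN_USE.search(line)
        if match:
            hits.append(
                (match.group("pci"), match.group("driver"), match.group("domain"))
            )
    return hits
```

Passing an open compute log to find_pci_in_use lists the PCI addresses that libvirt still considers attached to a domain, which can then be compared with the VF assigned to the instance's SRIOV port.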

