Public bug reported:

Description
===========
In Ussuri, when a compute node providing vGPUs (Nvidia GRID in my case) is rebooted, the mdevs for the vGPUs are not recreated, and a traceback from libvirt.libvirtError is thrown.

https://paste.ubuntu.com/p/4t4NvTHGd8/

As far as I understand, this should have been fixed in https://review.opendev.org/#/c/715489/ but it seems to fail even before it tries to recreate the mdev.

Expected result
===============
Upon host reboot, the mdevs should be recreated and the VMs should be restarted.

Actual result
=============
nova-compute throws the aforementioned error, the mdevs are not recreated, and the VMs are left in an unrecoverable state.

Environment
===========
# dnf list installed | grep nova
openstack-nova-common.noarch  1:21.1.0-2.el8  @centos-openstack-ussuri
openstack-nova-compute.noarch  1:21.1.0-2.el8  @centos-openstack-ussuri
python3-nova.noarch  1:21.1.0-2.el8  @centos-openstack-ussuri
python3-novaclient.noarch  1:17.0.0-1.el8  @centos-openstack-ussuri

# dnf list installed | grep libvirt
libvirt-bash-completion.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-client.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-config-nwfilter.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-interface.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-network.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-nodedev.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-nwfilter.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-qemu.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-secret.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-core.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-disk.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-gluster.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-iscsi.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-iscsi-direct.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-logical.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-mpath.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-rbd.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-driver-storage-scsi.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-daemon-kvm.x86_64  6.0.0-25.2.el8  @advanced-virtualization
libvirt-libs.x86_64  6.0.0-25.2.el8  @advanced-virtualization
python3-libvirt.x86_64  6.0.0-1.el8  @advanced-virtualization

** Affects: nova
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1900800

Title:
  VGPUs is not recreated on host reboot

Status in OpenStack Compute (nova):
  New

