Public bug reported:

Description
===========
If nova-compute is configured with libvirt/images_type = rbd, then instances booted off images with hw_qemu_guest_agent=yes do not invoke the qemu-ga guest-fsfreeze-freeze and guest-fsfreeze-thaw commands, and thus do not guarantee consistent snapshots. They also appear to silently ignore os_require_quiesce=yes.

Steps to reproduce
==================
The steps to verify whether or not the FIFREEZE and FITHAW ioctls are received by a guest are described in:

http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008648.html
https://xahteiwi.eu/resources/hints-and-kinks/ftrace-qemu-ga/

Expected result
===============
When you perform these described actions on an instance running on a compute node that does *not* set libvirt/images_type = rbd, then the FIFREEZE and FITHAW events are received as expected when the snapshot is created. This occurs irrespective of whether the instance is using boot-from-image, or boot-from-volume.

Actual result
=============
When you perform these described actions on an instance running on a compute node that *does* set libvirt/images_type = rbd, *and* the instance is set to boot from an image, then no qemu-ga events are received during snapshots at all.

The reason appears to be this direct_snapshot() call:
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2058

This is defined in
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/imagebackend.py#L1055
and it uses RBD functionality only. Importantly, it never interacts with qemu-ga, so it appears to not worry at all about freezing the filesystem.

This problem was apparently introduced in
https://opendev.org/openstack/nova/commit/824c3706a3ea691781f4fcc4453881517a9e1c55.

However, the qemu-guest-agent calls *are* received correctly if the instance is configured to boot from volume.

Environment
===========
1. OpenStack release: Rocky (but this issue is present in current master).
2. Hypervisor: libvirt/KVM
3. Storage type: Ceph RBD
4. Networking: Neutron/ML2/OVS

Additional information
======================
A detailed discussion of the issue is available at:
https://lists.ceph.io/hyperkitty/list/[email protected]/thread/3YQCRO4JP56EDJN5KX5DWW5N2CSBHRHZ/?sort=date

** Affects: nova
     Importance: Undecided
         Status: New

** Summary changed:

- With libvirt/images_type = rbd, instances ignore hw_qemu_guest_agent=yes
+ With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1841160

Title:
  With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes

Status in OpenStack Compute (nova):
  New
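The combinations described under "Expected result" and "Actual result" can be summarised as a small decision model. The sketch below is not nova code: FakeGuestAgent, model_snapshot() and the returned path names are hypothetical and exist only to illustrate the report. It assumes the direct RBD path is taken whenever images_type is rbd and the instance boots from an image, which is what the report observes, and it does not model os_require_quiesce.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeGuestAgent:
    """Hypothetical stand-in for qemu-ga; records the commands it receives."""
    commands: List[str] = field(default_factory=list)

    def freeze(self) -> None:
        self.commands.append("guest-fsfreeze-freeze")

    def thaw(self) -> None:
        self.commands.append("guest-fsfreeze-thaw")


def model_snapshot(images_type: str, boot_from_volume: bool,
                   image_props: Dict[str, str],
                   agent: FakeGuestAgent) -> str:
    """Model the reported behaviour and return the snapshot path taken."""
    # Reported behaviour: an image-backed instance on an rbd image backend
    # goes through the direct RBD snapshot, which never contacts the agent.
    if images_type == "rbd" and not boot_from_volume:
        return "direct"

    # Every other combination in the report freezes and thaws the guest
    # filesystems around the snapshot when the guest agent is enabled.
    if image_props.get("hw_qemu_guest_agent") == "yes":
        agent.freeze()
        try:
            pass  # the storage-level snapshot would happen here
        finally:
            agent.thaw()
        return "quiesced"

    return "unquiesced"


if __name__ == "__main__":
    props = {"hw_qemu_guest_agent": "yes"}
    for images_type in ("qcow2", "rbd"):
        for boot_from_volume in (False, True):
            agent = FakeGuestAgent()
            path = model_snapshot(images_type, boot_from_volume, props, agent)
            print(images_type, boot_from_volume, path, agent.commands)
```

Only the rbd, boot-from-image combination leaves the recorded command list empty, which corresponds to the missing FIFREEZE and FITHAW events in the guest.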

