Public bug reported:
Description
-----------
When the network connection to rbd (RADOS Block Device) storage fails,
`get_power_state` blocks when it queries the power state of a virtual
machine. The power state is needed to identify online VMs and migrate them.
However, the periodic monitoring job that runs `domstats` hangs while
accessing the disconnected storage and keeps libvirt's rpc-worker threads
occupied for extended periods. On hosts with multiple virtual machines, calls
to the power-state interface are delayed as well and cannot be executed
immediately.
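The starvation described above can be illustrated with a toy model. This is only a sketch and not libvirt code: two `ThreadPoolExecutor` pools stand in for the regular and priority RPC workers, and the names `simulate`, `hung_domstats`, and `quick_state` are hypothetical. It shows that a quick call queued behind hung calls on the shared pool does not complete, while the same call on a separate pool returns at once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def simulate(num_workers=2, hung_calls=2, wait=0.5):
    """Toy model: return (result via shared pool, result via priority pool)."""
    storage_back = threading.Event()  # stays unset while "storage" is down

    def hung_domstats():
        # Stands in for a stats call stuck on unreachable storage.
        storage_back.wait()
        return "stats"

    def quick_state():
        # Stands in for a cheap power-state query.
        return "running"

    shared = ThreadPoolExecutor(max_workers=num_workers)
    prio = ThreadPoolExecutor(max_workers=1)
    try:
        for _ in range(hung_calls):
            shared.submit(hung_domstats)
        queued = shared.submit(quick_state)
        try:
            shared_result = queued.result(timeout=wait)
        except FutureTimeout:
            shared_result = None  # every shared worker is still blocked
        prio_result = prio.submit(quick_state).result(timeout=wait)
        return shared_result, prio_result
    finally:
        storage_back.set()  # release the hung calls so the pools can exit
        shared.shutdown(wait=True)
        prio.shutdown(wait=True)


if __name__ == "__main__":
    print(simulate())
```

With as many hung calls as shared workers, the shared-pool query times out and is reported as `None`, while the priority-pool query still returns.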
Steps to reproduce
------------------
1. Disconnect the network for the rbd storage.
2. Schedule `domstats` to run every 10 seconds.
Expected result
---------------
Nova should query the power state through a higher-priority interface within
libvirt, for example `domain.state()`, possibly in conjunction with a priority
RPC mechanism such as `prio-rpc`. Critical operations, including power-state
queries and any necessary migrations, would then be prioritized and could
still be executed promptly even under resource-constrained conditions, such as
when the regular rpc-worker threads are occupied.
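A minimal sketch of the suggested query path, assuming a libvirt-python domain object whose `state()` method returns a `[state, reason]` pair. The helper names `power_state` and `domains_to_migrate` and the `FakeDomain` stand-in are hypothetical and are not part of nova or libvirt. Whether `state()` is actually served by priority workers depends on the libvirt version and is an assumption here.

```python
# Values of libvirt's virDomainState enum, copied here so the sketch runs
# without libvirt-python installed.
DOMAIN_STATE_NAMES = {
    0: "nostate",
    1: "running",
    2: "blocked",
    3: "paused",
    4: "shutdown",
    5: "shutoff",
    6: "crashed",
    7: "pmsuspended",
}


def power_state(domain):
    """Return a readable power state for one domain via domain.state()."""
    # virDomain.state() returns a [state, reason] pair.
    state, _reason = domain.state()
    return DOMAIN_STATE_NAMES.get(state, "unknown")


def domains_to_migrate(domains):
    """Keep only the domains that report 'running' (the online VMs)."""
    return [d for d in domains if power_state(d) == "running"]


class FakeDomain:
    """Hypothetical stand-in for libvirt.virDomain, for demonstration only."""

    def __init__(self, name, state):
        self._name = name
        self._state = state

    def name(self):
        return self._name

    def state(self):
        return [self._state, 0]


if __name__ == "__main__":
    fleet = [FakeDomain("vm-a", 1), FakeDomain("vm-b", 5)]
    print([d.name() for d in domains_to_migrate(fleet)])
```

Against a real hypervisor, the domain objects would come from a libvirt connection (for example `libvirt.open(...)` followed by `lookupByName(...)`) instead of `FakeDomain`.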
** Affects: nova
Importance: Undecided
Assignee: Yalei Li (chetaiyong)
Status: New
** Changed in: nova
Assignee: (unassigned) => Yalei Li (chetaiyong)
https://bugs.launchpad.net/bugs/2048848
Title:
get_power_state blocked
Status in OpenStack Compute (nova):
New