** Also affects: nova/rocky
Importance: Undecided
Status: New
** Changed in: nova/rocky
Status: New => In Progress
** Changed in: nova/rocky
Importance: Undecided => Medium
** Changed in: nova/rocky
Assignee: (unassigned) => Stephen Finucane (stephenfinucane)
--
https://bugs.launchpad.net/bugs/1810977
Title:
Oversubscription broken for instances with NUMA topologies
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) rocky series:
In Progress
Bug description:
As described in [1], the fix for [2] appears to have inadvertently
broken memory oversubscription for instances that have a NUMA topology
but no hugepages.
Steps to reproduce:
1. Create a flavor that consumes more than 50% of the available memory
on your host(s) and specifies an explicit NUMA topology. For example, on
my all-in-one deployment, where the host has 32GB RAM, we request a 20GB
instance:
$ openstack flavor create --vcpus 2 --disk 0 --ram 20480 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=2
2. Boot an instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test
3. Boot another instance using this flavor:
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2
Expected result:
The second instance should boot.
Actual result:
The second instance fails to boot. We see the following error messages
in the logs:
nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}
If we revert the patch [3] that addressed bug [2], the correct
behaviour returns and the second instance boots. Reverting, though,
also loses whatever benefits that change gave us.
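The figures in the scheduler log can be checked by hand. The sketch below is
only an illustration of the arithmetic, not nova's actual code: the helper
names are hypothetical, the allocation ratio of 1.5 is an assumed value for
ram_allocation_ratio, and it assumes the first instance placed one of its
two guest NUMA cells on the host cell reported in the log.

```python
# All values in MB, taken from the flavor and the log message above.
HOST_CELL_TOTAL_MB = 15916   # "total" reported for one host NUMA cell
FLAVOR_RAM_MB = 20480        # --ram 20480
GUEST_NUMA_NODES = 2         # hw:numa_nodes=2

# Assumption: an illustrative value for ram_allocation_ratio.
RAM_ALLOCATION_RATIO = 1.5


def per_cell_request(flavor_ram_mb, numa_nodes):
    """Memory each guest NUMA cell asks for when RAM is split evenly."""
    return flavor_ram_mb // numa_nodes


def fits_strict(total_mb, used_mb, required_mb):
    """Compare the request against free memory only (no oversubscription)."""
    return required_mb <= total_mb - used_mb


def fits_oversubscribed(total_mb, used_mb, required_mb, ratio):
    """Compare total usage against memory scaled by the allocation ratio."""
    return used_mb + required_mb <= total_mb * ratio


required = per_cell_request(FLAVOR_RAM_MB, GUEST_NUMA_NODES)
used_after_first = required  # assumed: one guest cell already on this host cell
available = HOST_CELL_TOTAL_MB - used_after_first

strict_ok = fits_strict(HOST_CELL_TOTAL_MB, used_after_first, required)
oversub_ok = fits_oversubscribed(
    HOST_CELL_TOTAL_MB, used_after_first, required, RAM_ALLOCATION_RATIO
)

print("required:", required, "available:", available)
print("fits without oversubscription:", strict_ok)
print("fits with oversubscription:", oversub_ok)
```

Under these assumptions the request (10240) exceeds the free memory (5676),
which matches the log, while the same request would fit if the cell's total
were scaled by the allocation ratio.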
[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168