Public bug reported:

Nova's resource tracker is expected to publish negative values to the scheduler when resources are overcommitted. Nova's scheduler expects this:

https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215

In change https://review.openstack.org/#/c/306670, these values were filtered to never drop below zero, which is incorrect. That change was making a complex alteration for ironic and cells, specifically to avoid resources from ironic nodes showing up as negative when they were unavailable. That was a cosmetic fix (which I believe has been corrected for ironic only in this patch: https://review.openstack.org/#/c/230487/).

Regardless, the scheduler does the same calculation to determine available resources on the node. If the node reports 0 when the scheduler calculates -100 for a given resource, the scheduler will assume the node still has room (due to oversubscription) and will send builds there that are destined to fail.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1698383

Title:
  Resource tracker regressed reporting negative memory

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1698383/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
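
To make the failure mode in the report concrete, here is a minimal sketch of why clamping the reported free value at zero misleads an oversubscribing scheduler. It is not Nova's code: the function names, the 1.5 allocation ratio, and the numbers are all assumptions chosen for illustration, and the headroom formula is only a simplified stand-in for the kind of check a RAM filter performs.

```python
# Illustrative only: hypothetical names and numbers, not Nova's implementation.

def usable_ram_mb(total_mb, reported_free_mb, allocation_ratio):
    """Headroom a scheduler-side check would derive from reported values."""
    limit_mb = total_mb * allocation_ratio   # oversubscription ceiling
    used_mb = total_mb - reported_free_mb    # usage inferred from "free"
    return limit_mb - used_mb


def clamp_free(free_mb):
    """The regression described in the report: never report below zero."""
    return max(free_mb, 0)


total_mb = 1000
allocation_ratio = 1.5                       # assumed ratio for the example
actual_used_mb = 1100                        # node is already overcommitted
true_free_mb = total_mb - actual_used_mb     # -100, the value that should be published

# Headroom when the negative value is published as-is.
honest_headroom = usable_ram_mb(total_mb, true_free_mb, allocation_ratio)

# Headroom when the node reports 0 instead of -100.
clamped_headroom = usable_ram_mb(total_mb, clamp_free(true_free_mb), allocation_ratio)

request_mb = 450
fits_with_honest_report = request_mb <= honest_headroom
fits_with_clamped_report = request_mb <= clamped_headroom

print(honest_headroom, clamped_headroom)
print(fits_with_honest_report, fits_with_clamped_report)
```

With these assumed numbers the clamped report overstates headroom by exactly the 100 MB that was hidden, so a 450 MB request passes the check even though the node cannot honor it.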

