Public bug reported:

The problem is inside this function:
https://github.com/openstack/nova/blob/master/nova/scheduler/filters/ram_filter.py#L33

Probably related to https://bugs.launchpad.net/nova/+bug/1635367

RAMFilter calculations do not take VM RAM subscription into account. This causes the scheduler to try spawning VMs on hosts which are fully oversubscribed but still have some free physical RAM (this is possible due to KSM, for example).

Consider this scenario:

ram_allocation_ratio = 1.5

A compute host has 10GB of physical RAM and 15 1GB VMs already spawned on it. At the same time, there is still 2GB of free physical RAM on the host, as seen in "free -m" and in nova hypervisor-show.

A new VM is scheduled and RAMFilter is executed:

requested_ram = spec_obj.memory_mb = 1GB
free_ram_mb = host_state.free_ram_mb = 2GB  # actual free RAM on the host, which does not properly reflect VM subscription
total_usable_ram_mb = host_state.total_usable_ram_mb = 10GB  # host has 10GB RAM total

Then the main check which is performed is:

memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio = 15GB
used_ram_mb = total_usable_ram_mb - free_ram_mb = 10 - 2 = 8GB
usable_ram = memory_mb_limit - used_ram_mb = 15 - 8 = 7GB  # incorrect assumption that host has 7GB usable RAM left

Unless I have misunderstood something, the logic here is broken. At first I tried to put together a quick fix, but then realized that the VM subscription RAM value (the sum of RAM of all VMs scheduled on a host) is not present in this code, so a proper calculation cannot be done. It may be available inside the host_state object; I have not checked yet.

** Affects: nova
     Importance: Undecided
         Status: New

** Tags: sche

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
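The arithmetic in the scenario above can be reproduced with a short standalone sketch. This is not the nova source: both function names are hypothetical, values are in MB, and the second function only illustrates the subscription-based check the report argues for, assuming the sum of RAM of all VMs on the host were available.

```python
# Standalone reproduction of the numbers in the report (values in MB).
# Not taken from nova; function names are hypothetical.


def usable_ram_as_reported(total_usable_ram_mb, free_ram_mb, ram_allocation_ratio):
    """Usable RAM as computed in the report: based on free physical RAM."""
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    return memory_mb_limit - used_ram_mb


def usable_ram_by_subscription(total_usable_ram_mb, subscribed_ram_mb, ram_allocation_ratio):
    """Assumed alternative: based on the RAM already promised to VMs."""
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    return memory_mb_limit - subscribed_ram_mb


total = 10 * 1024        # host has 10GB of physical RAM
free = 2 * 1024          # 2GB still physically free (e.g. thanks to KSM)
ratio = 1.5              # ram_allocation_ratio
requested = 1 * 1024     # new 1GB VM
subscribed = 15 * 1024   # 15 existing 1GB VMs

reported = usable_ram_as_reported(total, free, ratio)
by_subscription = usable_ram_by_subscription(total, subscribed, ratio)

print(reported, reported >= requested)                # host passes the filter
print(by_subscription, by_subscription >= requested)  # host would be rejected
```

With these inputs the free-RAM calculation leaves 7168 MB (7GB) of apparent headroom, while the subscription-based one leaves none, which is the discrepancy described in the report.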
https://bugs.launchpad.net/bugs/1736224

Title:
  ram_allocation_ratio > 1 causes RAMFIlter to incorrectly decide on ability so spawn instance

Status in OpenStack Compute (nova):
  New

