While I totally understand the use case, I think this is a new feature
for performance reasons and not a bug. Closing it as Wishlist but of
course you can work on it if you wish ;)

** Changed in: nova
   Importance: Undecided => Wishlist

** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1887377

Title:
  nova does not load balance assignment of resources on a host based on
  availability of PCI devices, hugepages or pCPUs.

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Nova has supported hugepages, CPU pinning and PCI NUMA affinity for a
  very long time. Since their introduction the advice has always been to
  create flavors that mimic your typical hardware topology, i.e. if all
  your compute hosts have 2 NUMA nodes then you should create flavors
  that request 2 NUMA nodes. For a long time operators have ignored this
  advice and continued to create single NUMA node flavors, citing that
  after 5+ years of hardware vendors working with VNF vendors to make
  their products NUMA aware, VNFs often still do not optimize properly
  for a multi NUMA environment.

  As a result many operators still deploy single NUMA VMs, although that
  is becoming less common over time. When you deploy a VM with a single
  NUMA node today we more or less iterate over the host NUMA nodes in
  order and assign the VM to the first NUMA node where it fits. On a
  host without any PCI devices whitelisted for OpenStack management this
  behaviour results in NUMA nodes being filled linearly from NUMA 0 to
  NUMA n. That means if a host had 100G of hugepages on both NUMA node 0
  and NUMA node 1 and you scheduled 101 1G single NUMA VMs to the host,
  100 VMs would spawn on NUMA node 0 and 1 VM would spawn on NUMA node 1.

  That means that the first 100 VMs would all contend for CPU resources
  on the first NUMA node while the last VM had the entire second NUMA
  node to itself.
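
  To make the first-fit behaviour described above concrete, here is a
  minimal, hypothetical sketch; the function and field names are purely
  illustrative and are not nova's actual data structures:

    # first fit: place the vm on the lowest-numbered numa node that
    # still has enough free hugepages.
    def pick_numa_node(host_nodes, requested_gb):
        for node in host_nodes:  # iterated in order: numa 0, 1, ..., n
            if node["free_hugepages_gb"] >= requested_gb:
                node["free_hugepages_gb"] -= requested_gb
                return node["id"]
        return None  # the vm does not fit on this host

    host = [{"id": 0, "free_hugepages_gb": 100},
            {"id": 1, "free_hugepages_gb": 100}]
    placements = [pick_numa_node(host, 1) for _ in range(101)]
    # the first 100 vms land on numa node 0, only the 101st on node 1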

  The correct behaviour would be for nova to assign the VMs round robin,
  attempting to keep the resource availability balanced. This will
  maximise performance for individual VMs while pessimising the
  scheduling of large VMs on a host.

  To this end a new NUMA balancing config option (unset, pack or spread)
  should be added and we should sort NUMA nodes in descending (spread)
  or ascending (pack) order based on pMEM, pCPUs, mempages and PCI
  devices, in that sequence.
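
  A rough sketch of that sort, assuming a simple per-node summary of
  free resources (the option values and field names here are only
  illustrative, not an existing nova interface):

    # sort the host numa nodes before the existing fitting logic runs.
    # "spread" tries the emptiest node first, "pack" the fullest.
    def sort_numa_nodes(nodes, strategy=None):
        if strategy is None:  # unset: keep today's in-order behaviour
            return nodes
        key = lambda n: (n["free_pmem"], n["free_pcpus"],
                         n["free_mempages"], n["free_pci_devices"])
        return sorted(nodes, key=key, reverse=(strategy == "spread"))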

  In a future release, when NUMA is modelled in placement, this sorting
  will need to be done in a weigher that sorts the allocation candidates
  based on the same pack/spread criteria.

  I am filing this as a bug and not a feature because it will have a
  significant impact on existing deployments that either expected
  https://specs.openstack.org/openstack/nova-specs/specs/pike/implemented/reserve-numa-with-pci.html
  to implement this logic already, or that do not follow our existing
  guidance on creating flavors that align to the host topology.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1887377/+subscriptions
