Reviewed:  https://review.opendev.org/662522
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=c29f382f69ed0bfe2c8782913e2a882344ff461f
Submitter: Zuul
Branch:    master

commit c29f382f69ed0bfe2c8782913e2a882344ff461f
Author: Stephen Finucane <[email protected]>
Date:   Fri Jun 7 16:57:31 2019 +0100

    Recalculate 'RequestSpec.numa_topology' on resize
    
    When resizing, it's possible to change the NUMA topology of an instance,
    or remove it entirely, due to different extra specs in the new flavor.
    Unfortunately we cache the instance's NUMA topology object in
    'RequestSpec.numa_topology' and don't update it when resizing. This
    means if a given host doesn't have enough free CPUs or mempages of the
    size requested by the *old* flavor, that host can be rejected by the
    filter.
    
    Correct this by regenerating the 'RequestSpec.numa_topology' field as
    part of the resize operation, ensuring that we revert to the old field
    value in the case of a resize-revert.
    
    Change-Id: I0ca50665b86b9fdb4618192d4d6a3bcaa6ea2291
    Signed-off-by: Stephen Finucane <[email protected]>
    Co-Authored-By: He Jie Xu <[email protected]>
    Closes-bug: #1805767
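The idea of the fix can be sketched in simplified, self-contained form. The names below (SimpleRequestSpec, numa_topology_from_flavor, prepare_resize, revert_resize) are hypothetical stand-ins for Nova's RequestSpec and nova.virt.hardware.numa_get_constraints(); this is an illustration of the recompute-and-restore pattern described in the commit message, not Nova's actual code:

```python
# Simplified sketch of the fix; all names here are hypothetical stand-ins
# for Nova's RequestSpec and nova.virt.hardware.numa_get_constraints().

def numa_topology_from_flavor(flavor):
    """Parse 'hw:numa_nodes' from the flavor's extra specs (simplified)."""
    nodes = flavor.get('extra_specs', {}).get('hw:numa_nodes')
    if nodes is None:
        return None  # flavor requests no explicit NUMA topology
    return {'cells': int(nodes)}

class SimpleRequestSpec:
    def __init__(self, flavor):
        self.flavor = flavor
        # Cached at boot time -- the source of the bug: this field was
        # never refreshed when the flavor changed on resize.
        self.numa_topology = numa_topology_from_flavor(flavor)

def prepare_resize(spec, new_flavor):
    """Recompute the cached topology from the new flavor, returning the
    old value so a resize-revert can restore it."""
    old_topology = spec.numa_topology
    spec.flavor = new_flavor
    spec.numa_topology = numa_topology_from_flavor(new_flavor)
    return old_topology

def revert_resize(spec, old_flavor, old_topology):
    """Put the original flavor and cached topology back on revert."""
    spec.flavor = old_flavor
    spec.numa_topology = old_topology

n2 = {'name': 'n2', 'extra_specs': {'hw:numa_nodes': '2'}}
n3 = {'name': 'n3', 'extra_specs': {'hw:numa_nodes': '3'}}

spec = SimpleRequestSpec(n2)
saved = prepare_resize(spec, n3)
assert spec.numa_topology == {'cells': 3}  # scheduler now sees 3 cells
revert_resize(spec, n2, saved)
assert spec.numa_topology == {'cells': 2}  # revert restores the old value
```

With the pre-fix behavior, the equivalent of prepare_resize() never ran, so the scheduler kept filtering hosts against the old flavor's topology.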


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1805767

Title:
  The new NUMA topology in the new flavor's extra specs isn't parsed on
  resize

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Environment:
  A host with two NUMA nodes.

  Flavor n2 requests two instance NUMA nodes.
  Flavor n3 requests three instance NUMA nodes.

  Steps to reproduce:
  Boot an instance with flavor n2; it is scheduled to the host.
  Resize the instance to flavor n3.

  
  The scheduler logs:
  Nov 28 18:27:16 jfz1r03h15 nova-scheduler[47260]: DEBUG nova.virt.hardware 
[None req-953d07bf-8ead-4f21-bd64-1ab12244eec1 admin admin] Attempting to fit 
instance cell 
InstanceNUMACell(cpu_pinning_raw=None,cpu_policy=None,cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0]),cpuset_reserved=None,id=0,memory=256,pagesize=None)
 on host_cell 
NUMACell(cpu_usage=1,cpuset=set([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53]),id=0,memory=128835,memory_usage=256,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pinned_cpus=set([]),siblings=[set([43,7]),set([16,52]),set([2,38]),set([8,44]),set([50,14]),set([0,36]),set([51,15]),set([1,37]),set([10,46]),set([11,47]),set([42,6]),set([41,5]),set([9,45]),set([3,39]),set([48,12]),set([49,13]),set([17,53]),set([40,4])])
 {{(pid=48606) _numa_fit_instance_cell 
/opt/stack/nova/nova/virt/hardware.py:1019}}

  
  Nov 28 18:27:16 jfz1r03h15 nova-scheduler[47260]: DEBUG nova.virt.hardware 
[None req-953d07bf-8ead-4f21-bd64-1ab12244eec1 admin admin] Attempting to fit 
instance cell 
InstanceNUMACell(cpu_pinning_raw=None,cpu_policy=None,cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([1]),cpuset_reserved=None,id=1,memory=256,pagesize=None)
 on host_cell 
NUMACell(cpu_usage=1,cpuset=set([18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71]),id=1,memory=129009,memory_usage=256,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pinned_cpus=set([]),siblings=[set([59,23]),set([65,29]),set([18,54]),set([34,70]),set([24,60]),set([33,69]),set([58,22]),set([67,31]),set([66,30]),set([26,62]),set([35,71]),set([57,21]),set([25,61]),set([19,55]),set([64,28]),set([32,68]),set([27,63]),set([56,20])])
 {{(pid=48606) _numa_fit_instance_cell 
/opt/stack/nova/nova/virt/hardware.py:1019}}

  
  As shown above, the scheduler only sees two instance NUMA cells, which
means the new flavor's extra specs weren't parsed.

  The nova-compute log:
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception_wrapper.py", 
line 79, in wrapped
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     function_name, call_dict, binary, tb)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     self.force_reraise()
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception_wrapper.py", 
line 69, in wrapped
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     return f(self, context, *args, **kw)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 187, in decorated_function
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     "Error: %s", e, instance=instance)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     self.force_reraise()
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 157, in decorated_function
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/utils.py", line 
1157, in decorated_function
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 215, in decorated_function
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     self.force_reraise()
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 203, in decorated_function
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 4234, in prep_resize
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     filter_properties, host_list)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 4298, in _reschedule_resize_or_reraise
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     six.reraise(*exc_info)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 4224, in prep_resize
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     node, migration, clean_shutdown)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", 
line 4184, in _prep_resize
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     limits=limits) as claim:
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 
328, in inner
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     return f(*args, **kwargs)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/opt/stack/nova/nova/compute/resource_tracker.py", line 258, in resize_claim
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     limits=limits)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File 
"/opt/stack/nova/nova/compute/resource_tracker.py", line 327, in _move_claim
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     limits=limits)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/claims.py", line 
275, in __init__
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     overhead=overhead, limits=limits)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/claims.py", line 
95, in __init__
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     self._claim_test(resources, limits)
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/claims.py", line 
162, in _claim_test
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server     "; ".join(reasons))
  Nov 28 18:27:18 jfz1r03h15 nova-compute[52929]: ERROR 
oslo_messaging.rpc.server ComputeResourcesUnavailable: Insufficient compute 
resources: Requested instance NUMA topology cannot fit the given host NUMA 
topology.

  The host only has two NUMA nodes, but the request to resize to three
  instance NUMA nodes was still scheduled to that host, and the move
  claim then rejected the request.
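The failing check can be illustrated with a toy version of the NUMA fit test. This is a deliberate simplification of the logic in Nova's _numa_fit_instance_cell() (the function named in the scheduler logs above), not the real implementation: each instance NUMA cell must land on a distinct host cell with enough free resources, so three instance cells can never fit on a two-node host.

```python
def fits(host_cells, instance_cells):
    """Toy NUMA fit test: each instance cell needs its own host cell with
    enough CPUs and memory (a simplification of Nova's real check)."""
    if len(instance_cells) > len(host_cells):
        return False  # more instance cells than host NUMA nodes
    # Greedy one-to-one placement is enough for this illustration.
    free = list(host_cells)
    for cell in instance_cells:
        for host in free:
            if host['cpus'] >= cell['cpus'] and host['memory'] >= cell['memory']:
                free.remove(host)
                break
        else:
            return False  # no host cell left that can hold this cell
    return True

# Numbers loosely based on the logs above: two host nodes, small cells.
host = [{'cpus': 36, 'memory': 128835}, {'cpus': 36, 'memory': 129009}]
old = [{'cpus': 1, 'memory': 256}] * 2  # flavor n2: two cells -> fits
new = [{'cpus': 1, 'memory': 256}] * 3  # flavor n3: three cells -> rejected

assert fits(host, old) is True
assert fits(host, new) is False
```

Because the scheduler evaluated the stale two-cell topology, the host passed the filter; only the move claim on the compute node, which used the new flavor, applied the three-cell check and failed.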

  Expected behavior:
  The requested instance NUMA topology should be updated from the new
  flavor, so that the scheduler makes the right placement decision up
  front.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1805767/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
