[Yahoo-eng-team] [Bug 1946753] [NEW] scheduler doesn't update weights

Krzysztof Hajduga Tue, 12 Oct 2021 06:41:12 -0700

Public bug reported:

This was observed on the Train release, but as far as I can see the
relevant logic has not changed since then.



When a number of instances are scheduled at the same time, the weights are not 
updated for subsequent instances.
It seems that _consume_selected_host() has no effect: when the scheduler starts 
scheduling another instance, it begins by getting the host states directly from 
the compute nodes.
The problem is that the host state on the compute node is only updated once the 
instance starts building, which in many cases seems to be too late. As a 
consequence, the compute nodes considered for the next instance are weighed 
with inaccurate weights.
The result is that the distribution of VMs across compute nodes is not as 
expected.
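
The behaviour described above can be sketched as a small simulation. This is 
only an illustration of the suspected race, not nova's code: HostState, 
ComputeNodeStore, schedule() and the second host vcmp2 are hypothetical names 
made up for the example. The assumption is that every scheduling request starts 
from a fresh copy of the stored host state, and that the stored state only 
changes once a build has started. The number after the timestamp in the logs 
below (20 and 22) looks like a process ID, which would mean the two requests 
were handled by different scheduler workers; that is also an assumption.

```python
import copy
from dataclasses import dataclass


@dataclass
class HostState:
    """Toy stand-in for a scheduler host state (not nova's real class)."""
    name: str
    free_ram_mb: int
    num_instances: int = 0


class ComputeNodeStore:
    """Hypothetical shared record, updated only when a build starts."""

    def __init__(self, hosts):
        self._hosts = {h.name: h for h in hosts}

    def snapshot(self):
        # Each scheduling request starts from a fresh copy of the stored state.
        return [copy.copy(h) for h in self._hosts.values()]

    def report_build_started(self, host_name, ram_mb):
        # Stands in for the compute node reporting its reduced resources.
        host = self._hosts[host_name]
        host.free_ram_mb -= ram_mb
        host.num_instances += 1


def schedule(store, ram_mb):
    """Pick the host with the most free RAM, then consume it locally."""
    hosts = store.snapshot()
    chosen = max(hosts, key=lambda h: h.free_ram_mb)
    seen_ram = chosen.free_ram_mb
    # Local consumption: this changes the snapshot only, not the store.
    chosen.free_ram_mb -= ram_mb
    chosen.num_instances += 1
    return chosen.name, seen_ram, chosen.free_ram_mb


store = ComputeNodeStore([HostState("vcmp1", 7328), HostState("vcmp2", 7000)])

first = schedule(store, 512)
second = schedule(store, 512)  # runs before any build has started
print(first)   # ('vcmp1', 7328, 6816)
print(second)  # ('vcmp1', 7328, 6816) -- same stale value as the first request

store.report_build_started("vcmp1", 512)
third = schedule(store, 512)   # only now is the reduced RAM visible
print(third)   # ('vcmp2', 7000, 6488)
```

In this sketch the second request sees 7328MB again because the local 
consumption of the first request never reaches the shared record in time, which 
matches the pattern in the log excerpts.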

I managed to reproduce the problem even when creating just two
instances at the same time.

In one test with 50 instances, I observed 17 instances being scheduled
with the same weight values as the first one.


Below are log excerpts from nova-scheduler.log, with comments, to illustrate 
what I mean.
This example focuses on RAMWeigher.
In this case two instances were created at the same time with the openstack 
CLI.

The first instance is being scheduled:
2021-10-11 15:58:18.484 20 DEBUG nova.scheduler.manager 
[req-c068a693-7f03-4a75-b5b0-54f5e34f8340 2ee7a9b8a93c4cb0a12cd2cfab8ecd04 
d3e8e3c73abd4b0fa1d4fc354ee0c3a7 - default default] Starting to schedule for 
instances: ['d95ba6be-7a19-4d70-9280-27a367f7b102'] select_destinations 
/usr/lib/python3.6/site-packages/nova/scheduler/manager.py:133

Selected host for the first instance, with the weights used for that selection:
2021-10-11 15:58:18.853 20 DEBUG nova.scheduler.filter_scheduler 
[req-c068a693-7f03-4a75-b5b0-54f5e34f8340 2ee7a9b8a93c4cb0a12cd2cfab8ecd04 
d3e8e3c73abd4b0fa1d4fc354ee0c3a7 - default default] [instance: 
d95ba6be-7a19-4d70-9280-27a367f7b102] Selected host: (vcmp1, vcmp1) ram: 7328MB 
disk: 38912MB io_ops: 0 instances: 0 _consume_selected_host 
/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:354

Selected host for the first instance, with the weights reduced by the amounts 
allocated for that instance. 
These weights should be used when scheduling the next instance, in particular 
ram: 6816MB for RAMWeigher. 
This log line is the result of an extra LOG.debug(...) added to the code 
(notice the different line number, 357, at the end):
2021-10-11 15:58:18.856 20 DEBUG nova.scheduler.filter_scheduler 
[req-c068a693-7f03-4a75-b5b0-54f5e34f8340 2ee7a9b8a93c4cb0a12cd2cfab8ecd04 
d3e8e3c73abd4b0fa1d4fc354ee0c3a7 - default default] [instance: 
d95ba6be-7a19-4d70-9280-27a367f7b102] Selected host after consume_from_request: 
(vcmp1, vcmp1) ram: 6816MB disk: 37888MB io_ops: 1 instances: 1 
_consume_selected_host 
/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:357

The second instance is being scheduled:
2021-10-11 15:58:19.487 22 DEBUG nova.scheduler.manager 
[req-7c2dff56-2a94-491e-baab-6080524aa592 2ee7a9b8a93c4cb0a12cd2cfab8ecd04 
d3e8e3c73abd4b0fa1d4fc354ee0c3a7 - default default] Starting to schedule for 
instances: ['92e67f44-898b-4a07-a841-b2ffd296d089'] select_destinations 
/usr/lib/python3.6/site-packages/nova/scheduler/manager.py:133

Selected host for the second instance, with the weights used for that 
selection. 
It can be seen that the value used by RAMWeigher is 7328MB, the same as for the 
first instance. 
It should be 6816MB instead, as it was right after _consume_selected_host() was 
executed for the first instance:
2021-10-11 15:58:19.772 22 DEBUG nova.scheduler.filter_scheduler 
[req-7c2dff56-2a94-491e-baab-6080524aa592 2ee7a9b8a93c4cb0a12cd2cfab8ecd04 
d3e8e3c73abd4b0fa1d4fc354ee0c3a7 - default default] [instance: 
92e67f44-898b-4a07-a841-b2ffd296d089] Selected host: (vcmp1, vcmp1) ram: 7328MB 
disk: 38912MB io_ops: 0 instances: 0 _consume_selected_host 
/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:354
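
To make the comparison explicit, the figures in the three "Selected host" lines 
above can be pulled out and subtracted. This is just a small helper sketch for 
reading the excerpts, not part of nova: parse_host_state() and the regular 
expression are made up for this example, and the strings are the unwrapped 
tails of the log lines quoted above.

```python
import re

# Matches the resource summary at the end of a "Selected host" log message.
PATTERN = re.compile(
    r"ram: (\d+)MB disk: (\d+)MB io_ops: (\d+) instances: (\d+)"
)


def parse_host_state(message):
    """Extract the numeric fields from a 'Selected host' log message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("no host state found in: %r" % message)
    ram, disk, io_ops, instances = (int(value) for value in match.groups())
    return {
        "ram_mb": ram,
        "disk_mb": disk,
        "io_ops": io_ops,
        "instances": instances,
    }


first_before = parse_host_state(
    "Selected host: (vcmp1, vcmp1) ram: 7328MB disk: 38912MB "
    "io_ops: 0 instances: 0"
)
first_after = parse_host_state(
    "Selected host after consume_from_request: (vcmp1, vcmp1) "
    "ram: 6816MB disk: 37888MB io_ops: 1 instances: 1"
)
second_before = parse_host_state(
    "Selected host: (vcmp1, vcmp1) ram: 7328MB disk: 38912MB "
    "io_ops: 0 instances: 0"
)

# Resources consumed locally by the first request.
consumed = {
    key: first_before[key] - first_after[key] for key in ("ram_mb", "disk_mb")
}
print(consumed)                       # {'ram_mb': 512, 'disk_mb': 1024}

# The second request starts from exactly the values the first one started from.
print(second_before == first_before)  # True
print(second_before == first_after)   # False
```

So the first request consumed 512MB of RAM and 1024MB of disk in its own view 
of vcmp1, and none of that is reflected in what the second request sees.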

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: nova-scheduler

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1946753
