[Yahoo-eng-team] [Bug 1488986] [NEW] nova scheduler for race condition

hougangliu Wed, 26 Aug 2015 06:57:10 -0700

Public bug reported:

a) The nova compute service updates the compute-node info by running 
update_available_resource every CONF.update_resources_interval (60s by default). 
b) For every scheduler request:
1. select_destinations is called and gets all HostStates (if the compute-node 
record is newer than the local HostState info, based on updated_at, the 
HostState is updated with the compute info from the DB).
2. The scheduler checks, host by host, whether the host's resources can meet 
the instance requirements, updating the HostState resources iteratively; if a 
host fits, it sends a build_and_run_instance cast RPC to the corresponding 
compute node.
3. The compute service accepts the AMQP message, consumes the instance 
requirements, and writes the new compute info into the DB.
4. The compute service tries to spawn the instance; if that fails, it rolls 
back step 3.
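
Steps 1 and 2 above can be sketched roughly as follows. This is a simplified 
model, not nova's actual code: the HostState class here, its update_from_db 
and consume methods, the schedule function, and the single free_ram_mb 
resource are all hypothetical stand-ins, and it assumes the local view is 
stamped with the current time whenever it consumes resources.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical scheduler-side view of one compute node."""
    name: str
    free_ram_mb: int
    updated_at: float

    def update_from_db(self, db_free_ram_mb, db_updated_at):
        # Step 1: take the DB values only if the DB record is newer.
        if db_updated_at > self.updated_at:
            self.free_ram_mb = db_free_ram_mb
            self.updated_at = db_updated_at

    def consume(self, ram_mb, now):
        # Step 2: deduct locally so the next instance sees less capacity.
        # Assumption: the local view is stamped with the current time here.
        self.free_ram_mb -= ram_mb
        self.updated_at = now


def schedule(hosts, requests_mb, now):
    """Pick the first host that fits each request, consuming as we go."""
    placements = []
    for ram_mb in requests_mb:
        chosen = next((h for h in hosts if h.free_ram_mb >= ram_mb), None)
        if chosen is not None:
            chosen.consume(ram_mb, now)
        placements.append(chosen.name if chosen else None)
    return placements
```

In this model, a DB record with an older timestamp is ignored, while a newer 
one replaces the local values, including any consumption recorded locally.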


My question:
If the user sets CONF.update_resources_interval to 1s, the compute node 
service writes compute info into the DB every second. 
Consider this case: the user sends multiple nova boot requests. The first boot 
request reaches step 2 while the compute node service runs the periodic task 
update_available_resource. The second boot request then reaches step 1 before 
the first request has reached step 3. Because the periodic update made the DB 
record newer than the local HostState, the second request gets a set of 
HostStates that does not include the first instance's consumption, and the 
scheduler service picks a host for it without considering that consumption. 
The same thing repeats for the following requests.

Can this race condition occur?
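
To make the interleaving concrete, here is a small deterministic simulation of 
the scenario. It is only a sketch under stated assumptions: the dict-based 
records, the refresh and try_schedule helpers, the integer timestamps, and the 
RAM sizes are all hypothetical, and it assumes the periodic task rewrites the 
record before the first instance has been claimed in step 3. It models the 
question; it does not show what nova's real code does.

```python
def refresh(local, db):
    # Step 1: the DB record wins whenever its timestamp is newer.
    if db["updated_at"] > local["updated_at"]:
        local.update(db)


def try_schedule(local, db, ram_mb, now):
    refresh(local, db)
    if local["free_ram_mb"] < ram_mb:
        return False
    # Step 2: consume locally and stamp the local view.
    local["free_ram_mb"] -= ram_mb
    local["updated_at"] = now
    return True


# One host with 1024 MB free; the scheduler has no fresh view of it yet.
db = {"free_ram_mb": 1024, "updated_at": 0}
local = {"free_ram_mb": 0, "updated_at": -1}

# Request 1 passes steps 1 and 2; the cast to the compute node is in flight.
first = try_schedule(local, db, 768, now=1)

# Assumed timing: the periodic task runs before step 3, so it rewrites the
# record with a fresh timestamp but without the first instance's claim.
db = {"free_ram_mb": 1024, "updated_at": 2}

# Request 2 starts at step 1 and overwrites the local consumption.
second = try_schedule(local, db, 768, now=3)

print(first, second, local["free_ram_mb"])
```

Under these assumptions both 768 MB requests are accepted for a host that only 
has 1024 MB free.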

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488986

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488986/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
