What do you mean by "adding hosts continuously"? Do you mean issuing those 
commands all at once?
It looks to me that 
https://github.com/openstack/nova/blob/0258cecaca88d4a305e99c5a17e2230361ef1235/nova/compute/api.py#L5050-L5062
 could be racy if multiple API workers simultaneously fetch the 
aggregate information and then try to update it.
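For illustration only, the problematic pattern is a classic unsynchronized read-modify-write (lost update). This is a minimal stdlib sketch with hypothetical names (aggregate, add_host_unsynchronized), not nova's actual code; a barrier forces two "workers" to read before either writes:

```python
import threading

# Shared "aggregate" record, standing in for the row that each API
# worker reads and writes back. Hypothetical structure for illustration.
aggregate = {"hosts": []}

def add_host_unsynchronized(host, barrier):
    # 1. Fetch the current host list (each worker gets its own copy).
    snapshot = list(aggregate["hosts"])
    barrier.wait()  # force both workers to read before either writes
    # 2. Modify the copy and write it back, clobbering the other
    #    worker's concurrent update.
    snapshot.append(host)
    aggregate["hosts"] = snapshot

barrier = threading.Barrier(2)
workers = [threading.Thread(target=add_host_unsynchronized,
                            args=(h, barrier))
           for h in ("Computer0102", "Computer0103")]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Only one host survives: whichever write landed last overwrote the other.
print(aggregate["hosts"])
```

Both additions "succeed" from the caller's point of view, yet only one host ends up in the aggregate, which matches the symptom reported below.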

We could make that more resilient by adding a distributed locking
mechanism, but since the aggregates API is admin-only (and adding a host
is not done often - compared to an end-user API call, for example), I
leave open the question of whether the complications of such a solution
would outweigh the benefits.
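As a sketch of the serialization idea only: holding a lock around the whole read-modify-write cycle makes the updates safe. Note the caveat that a plain in-process lock (as below) only helps within a single worker process; nova API workers are separate processes, so an actual fix would need a distributed lock or a compare-and-swap at the database layer. Names here (add_host_locked, agg_lock) are hypothetical:

```python
import threading

aggregate = {"hosts": []}
agg_lock = threading.Lock()

def add_host_locked(host):
    # Serialize the entire fetch-modify-store cycle under the lock,
    # so no two updates can interleave and lose each other's changes.
    with agg_lock:
        hosts = list(aggregate["hosts"])
        hosts.append(host)
        aggregate["hosts"] = hosts

threads = [threading.Thread(target=add_host_locked, args=(h,))
           for h in ("Computer0102", "Computer0103", "Computer0116")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All three hosts survive regardless of thread scheduling.
print(sorted(aggregate["hosts"]))
# ['Computer0102', 'Computer0103', 'Computer0116']
```

Whether that extra machinery is worth it for an admin-only, low-frequency API is exactly the open question above.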

** Tags added: sched

** Tags removed: sched
** Tags added: availability-zones openstack-version.pike

** Changed in: nova
       Status: New => Opinion

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1747622

Title:
  Aggregate info in nova-scheduler loses some hosts when hosts are added
  to an aggregate continuously

Status in OpenStack Compute (nova):
  Opinion

Bug description:
  Description
  ===========
  If hosts are added to an availability_zone in quick succession, the
  nova-scheduler's aggs_by_id and host_aggregates_map may lose some host
  aggregate data. Instances created in this availability_zone are then
  never scheduled to the lost hosts.

  Steps to reproduce
  ==================
  1. Create an availability zone:
  nova aggregate-create test3 test3

  2. Add hosts to this availability zone in quick succession:
  nova aggregate-add-host 51 Computer0102
  nova aggregate-add-host 51 Computer0103
  nova aggregate-add-host 51 Computer0116

  3. Create instances in this availability zone.

  Expected result
  ===============
  Instances can be scheduled to Computer0102, Computer0103 and
  Computer0116.

  Actual result
  =============
  Instances are never scheduled to Computer0103.

  Environment
  ===========
  1. Exact version of OpenStack you are running:
  Pike

  2. Which hypervisor did you use?
  Libvirt + KVM

  Logs & Configs
  ==============
  I added some logging to nova-scheduler's host_manager and found that
  aggregate information from nova-api is lost when hosts are added in
  quick succession.

  [root@Controller01 ~]# cat /var/log/nova/nova-scheduler.log | grep hanrong | grep update_aggregates
  2018-02-06 11:02:43.412 38000 INFO nova.scheduler.host_manager 
[req-69cb0a45-96f9-4693-91e8-46aeaec4ff54 9974cc9acecb40f3827c3b27e803e87c 
04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: 
[Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=[],id=51,metadata={},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.187 38000 INFO nova.scheduler.host_manager 
[req-b0582d0d-59fd-4a58-85ab-ab13116bec40 9974cc9acecb40f3827c3b27e803e87c 
04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: 
[Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.239 38000 INFO nova.scheduler.host_manager 
[req-eae376aa-f725-4b87-8740-df58c0bb25de 9974cc9acecb40f3827c3b27e803e87c 
04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: 
[Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0103'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]
  2018-02-06 11:02:52.247 38000 INFO nova.scheduler.host_manager 
[req-22a5740f-6560-4603-8904-509b39335a76 9974cc9acecb40f3827c3b27e803e87c 
04c742a5ce41488494f5b0d587a9bd32 - default default] hanrong update_aggregates: 
[Aggregate(created_at=2018-02-06T03:02:43Z,deleted=False,deleted_at=None,hosts=['Computer0102','Computer0116'],id=51,metadata={availability_zone='test3'},name='test3',updated_at=None,uuid=5a43dd9c-aa85-4c29-a4ce-bbf58af1c150)]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1747622/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp