Public bug reported:

Description
===========
According to https://bugs.launchpad.net/nova/+bug/1756179, deleting a 
nova-compute service currently also deletes its compute_node records, resource 
provider records and host mapping records in the DB. I found that if the 
service is deleted while nova-compute is still active, the compute_node and 
resource_provider records are deleted without any problem, but nova-compute 
continues to report the old resource_provider uuid. So when we restart 
nova-compute to recover the service, it raises ResourceProviderCreationFailed.


Steps to reproduce
==================
1. Check the environment and the resource_providers table.
# nova service-list | grep 'nova-compute'
| 3d9092b0-e164-4094-8672-1c855971218d | nova-compute     | devstack-q | nova | enabled | up    |
MariaDB [placement]> select uuid,name from resource_providers;
+--------------------------------------+------------+
| uuid                                 | name       |
+--------------------------------------+------------+
| edfff022-c19f-4720-85f9-fd947ae36b07 | devstack-q |
+--------------------------------------+------------+

2. Delete the compute service while the nova-compute process is running, then 
check the resource_providers table.
# nova service-delete 3d9092b0-e164-4094-8672-1c855971218d
MariaDB [placement]> select * from resource_providers;
Empty set (0.00 sec)

3. Wait a minute, then restart the nova-compute process.
# systemctl restart devstack@n-cpu


Expected result
===============
nova-compute works properly and reports its resource provider with a new uuid.


Actual result
=============
nova-compute hits a 409 when it tries to create a resource provider with a new 
uuid, and reports 'No resource provider with uuid 
52943fd2-d700-416f-9e16-7fe4744979b3 found'.


I found that while nova-compute is running, it re-creates the resource provider 
with the old uuid once that record is gone. So the resource_provider uuid 
currently in the DB is still 'edfff022-c19f-4720-85f9-fd947ae36b07'. After the 
restart, nova-compute tries to create a new resource provider with the name 
'devstack-q'. Unfortunately, the name column in the resource_providers table is 
unique, so the create fails.
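
The sequence above can be replayed against a toy in-memory model. This is only 
a sketch to illustrate the ordering of events: ToyPlacement, Conflict and 
replay() are hypothetical names, not nova or placement code, and the only rule 
taken from the report is that both uuid and name must be unique.

```python
class Conflict(Exception):
    """Stands in for the HTTP 409 that placement returns (assumption)."""


class ToyPlacement:
    """Hypothetical in-memory model of the resource_providers table."""

    def __init__(self):
        self._by_uuid = {}  # uuid -> name

    def create_provider(self, uuid, name):
        if uuid in self._by_uuid:
            raise Conflict(f"uuid {uuid} already exists")
        # The name column is unique, as described in the report.
        for existing_uuid, existing_name in self._by_uuid.items():
            if existing_name == name:
                raise Conflict(
                    f"name {name!r} already in use by provider {existing_uuid}")
        self._by_uuid[uuid] = name

    def delete_provider(self, uuid):
        self._by_uuid.pop(uuid, None)


def replay():
    """Replay the reported sequence; return the conflict message, if any."""
    placement = ToyPlacement()
    old_uuid = "edfff022-c19f-4720-85f9-fd947ae36b07"
    new_uuid = "52943fd2-d700-416f-9e16-7fe4744979b3"
    host = "devstack-q"

    placement.create_provider(old_uuid, host)  # initial state
    placement.delete_provider(old_uuid)        # nova service-delete
    placement.create_provider(old_uuid, host)  # running nova-compute re-creates it
    try:
        placement.create_provider(new_uuid, host)  # restarted nova-compute
    except Conflict as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(replay())
```

In this model the last create fails only because the still-running process put 
the old record back between the delete and the restart.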

So I think we should check that the compute_id exists first, and only then 
update the resource_provider_tree. If it does not exist, raise 
ComputeHostNotFound instead of reporting.
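
A minimal sketch of the proposed check, assuming a set of known compute node 
uuids and a reporting callback. The function name, its parameters and the local 
ComputeHostNotFound class are stand-ins for illustration; this is not the 
actual nova resource tracker code.

```python
class ComputeHostNotFound(Exception):
    """Local stand-in for nova's exception of the same name."""


def report_if_compute_node_exists(known_node_uuids, node_uuid, host, report):
    """Report to placement only if the compute node record still exists.

    known_node_uuids: uuids of compute_node records currently in the DB
                      (hypothetical input; a real check would query the DB).
    report:           callable(node_uuid, host) that updates the provider tree.
    """
    if node_uuid not in known_node_uuids:
        # The service was deleted while this process was running, so do not
        # re-create the resource provider with the stale uuid.
        raise ComputeHostNotFound(f"Compute host {host} could not be found.")
    report(node_uuid, host)
```

With a check like this, a still-running nova-compute whose records were deleted 
would stop re-creating the old resource provider, so a later restart would not 
collide on the unique name.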

** Affects: nova
     Importance: Undecided
     Assignee: xulei (605423512-j)
         Status: New


** Tags: placement

** Changed in: nova
     Assignee: (unassigned) => xulei (605423512-j)

** Tags added: placement

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1817833

Title:
  Check compute_id existence when nova-compute reports  info to
  placement

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1817833/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
