Public bug reported:
Description:

The resource tracker's update_available_resource is removing the compute node resource provider from the cached provider_tree if a call to update the provider's aggregates in placement ends in a 409:

  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1221

This differs from the error handling in the other calls, for example for traits and inventories, where we do not perform any cache removals:

  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L997-L1008

The race happens when an instance is created on the same provider via:

  https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2631
  https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L240

When a 409 occurs and the provider is removed from the cache, we see the unintended behaviour below. Nova attempts to delete the resource provider because it is no longer in the cache:

  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1485

We have also seen API logs of Nova attempting, and sometimes succeeding, to set the provider's aggregates to []; however, this is harder to pin down:

  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1504
  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1183

This bug is filed in conjunction with <>, which provides a similar improvement to prevent the resource tracker from trying to update the provider when it doesn't need to.

Steps to reproduce:

These are quite difficult to give, since this is a race condition. They involve creating a new server at the same time as the resource tracker triggering update_available_resource and hitting that server's provider in its loop.

Fix:

Remove the cache-removal code here:

  https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1217-L1224

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2117697

Title:
  Race condition between resource tracker and server create

Status in OpenStack Compute (nova):
  New
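For illustration, below is a minimal, self-contained Python sketch of the two error-handling patterns described above. All names here (ReportClientSketch, ConflictError, _put, _provider_cache) are stand-ins chosen for the example, not the actual Nova report client code:

    # Minimal sketch of the divergent 409 handling described above.
    # All names are illustrative stand-ins, not the real Nova code.


    class Response(object):
        def __init__(self, status_code):
            self.status_code = status_code


    class ConflictError(Exception):
        """Stand-in for a provider-update conflict exception."""


    class ReportClientSketch(object):
        def __init__(self):
            # Stand-in for the cached provider tree.
            self._provider_cache = {'rp-1': {'aggregates': set(), 'traits': set()}}

        def _put(self, url, body):
            # Pretend placement answers every update with a generation conflict.
            return Response(409)

        def set_aggregates_for_provider(self, rp_uuid, aggregates):
            resp = self._put('/resource_providers/%s/aggregates' % rp_uuid,
                             aggregates)
            if resp.status_code == 409:
                # Problematic pattern: the provider is evicted from the local
                # cache on a generation conflict, even though it still exists
                # in placement and still hosts instances.
                self._provider_cache.pop(rp_uuid, None)
                raise ConflictError(rp_uuid)

        def set_traits_for_provider(self, rp_uuid, traits):
            resp = self._put('/resource_providers/%s/traits' % rp_uuid, traits)
            if resp.status_code == 409:
                # Pattern used for traits/inventories: surface the conflict so
                # the caller can refresh and retry, but leave the cache intact.
                raise ConflictError(rp_uuid)

Once the provider is gone from the cache, later code paths that consult the cache no longer see it, which is how the unexpected provider delete and empty-aggregate calls linked above can be issued; with the second pattern the cache still matches reality and the conflict is resolved by a refresh and retry.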
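As a rough illustration of why the window is hard to hit deterministically, here is a small, self-contained Python sketch of the interleaving. FakePlacement, resource_tracker_pass and server_create are toy stand-ins, not Nova code; a concurrent "create" bumps the provider generation inside the tracker's read-then-update window, producing the 409 and, with the current behaviour, the cache eviction:

    # Toy illustration of the timing window behind the race; only the
    # interleaving matters, none of these names are real Nova code.
    import threading
    import time


    class FakePlacement(object):
        def __init__(self):
            self.generation = 0
            self.lock = threading.Lock()

        def bump_generation(self):
            # e.g. the server-create path updating the same provider.
            with self.lock:
                self.generation += 1

        def put_aggregates(self, seen_generation):
            # Placement rejects updates made against a stale generation.
            with self.lock:
                if seen_generation != self.generation:
                    return 409
                self.generation += 1
                return 200


    def resource_tracker_pass(placement, cache, results):
        seen = placement.generation          # read the provider state
        time.sleep(0.1)                      # window in which a create can land
        status = placement.put_aggregates(seen)
        if status == 409:
            cache.pop('rp-1', None)          # current (problematic) behaviour
        results.append(status)


    def server_create(placement):
        time.sleep(0.05)                     # lands inside the tracker's window
        placement.bump_generation()


    placement = FakePlacement()
    cache = {'rp-1': object()}
    results = []
    t1 = threading.Thread(target=resource_tracker_pass,
                          args=(placement, cache, results))
    t2 = threading.Thread(target=server_create, args=(placement,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(results, 'rp-1' in cache)   # [409] False -> provider evicted

In a real deployment the window is the time between the resource tracker reading the provider and issuing the aggregates update, so reproduction depends on a server create landing inside exactly that window.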