Reviewed:  https://review.openstack.org/524263
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=112cd9cd1f31e091920e2f55fb213f78152dfd37
Submitter: Zuul
Branch:    master

commit 112cd9cd1f31e091920e2f55fb213f78152dfd37
Author: Eric Fried <[email protected]>
Date:   Thu Nov 30 11:39:11 2017 -0600

    Proper error handling by _ensure_resource_provider

    Previously, if _ensure_resource_provider encountered any error from the
    placement REST API, it would (sometimes log a message and) return None.
    Furthermore, a name conflict while creating the provider was treated
    the same as a UUID conflict, which would actually result in None being
    returned.

    With this change set, the error paths that previously returned None now
    raise one of the new ResourceProviderRetrievalFailed or
    ResourceProviderCreationFailed exceptions; and the name conflict path
    is detected and treated as an error condition.

    Note: This change set only touches the SchedulerReportClient side of
    these error conditions - it makes no attempt to add error handling to
    its callers. Case in point, the API samples tests needed fixing
    because they were previously running into the name conflict error
    condition, but not noticing. As currently implemented, the new
    exceptions will percolate up to
    ComputeManager.update_available_resource_for_node like any others
    coming from SchedulerReportClient, where they will be logged and
    ignored.

    Change-Id: I0c4ca6a81f213277fe7219cb905a805712f81e36
    Closes-Bug: #1735430

** Changed in: nova
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1735430

Title:
  Report client doesn't handle RP create conflict (409) properly

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  In Progress

Bug description:
  POST /resource_providers can fail with conflict (HTTP status 409) for
  (at least) two reasons: A provider with the specified UUID exists;
  *or* a provider with the specified *name* already exists.

  In SchedulerReportClient, _ensure_resource_provider uses helper method
  _create_resource_provider, whose logic goes like this:

  POST /resource_providers { 'uuid': <uuid>, 'name': <name> }
  if 201:
      cool, return the result
  if 409:
      LOG("Another thread created a provider with this *UUID*")
      GET /resource_providers/<uuid>
      if 200:
          cool, return the result
      if 404 or any other error:
          return None
  if any other error:
      return None

  PROBLEM: If a provider exists with the desired *name* (but a different
  UUID), this code will always return None (via that 404 path).

  PROBLEM: Nobody up the stack is checking the return for None.

  What this effectively means is that _ensure_resource_provider...
  doesn't.

  IMO we should raise an exception in these error paths, forcing
  consuming code to handle them explicitly. But at the very least, any
  code consuming _ensure_resource_provider needs to validate that it
  succeeds.
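
The raise-instead-of-return-None behaviour proposed above can be sketched roughly as follows. This is a minimal illustration and not the actual nova code: FakePlacement, FakeResponse and the free function ensure_resource_provider are hypothetical stand-ins, the two exception classes are local placeholders that only borrow the names from the commit message, and inferring a name conflict from a 404 on the follow-up GET is an assumption of this sketch (the real fix may detect it differently).

```python
class ResourceProviderCreationFailed(Exception):
    """Local placeholder borrowing the name from the commit message."""


class ResourceProviderRetrievalFailed(Exception):
    """Local placeholder borrowing the name from the commit message."""


class FakeResponse:
    """Hypothetical minimal response: a status code plus a JSON-like body."""

    def __init__(self, status_code, body=None):
        self.status_code = status_code
        self.body = body


class FakePlacement:
    """Hypothetical in-memory stand-in for the placement REST API."""

    def __init__(self):
        self.providers = {}  # maps uuid -> name

    def post(self, uuid, name):
        # Mimics POST /resource_providers: 409 on a UUID *or* a name clash.
        if uuid in self.providers or name in self.providers.values():
            return FakeResponse(409)
        self.providers[uuid] = name
        return FakeResponse(201, {'uuid': uuid, 'name': name})

    def get(self, uuid):
        # Mimics GET /resource_providers/<uuid>.
        if uuid in self.providers:
            return FakeResponse(
                200, {'uuid': uuid, 'name': self.providers[uuid]})
        return FakeResponse(404)


def ensure_resource_provider(placement, uuid, name):
    """Return the provider record or raise; never return None."""
    resp = placement.post(uuid, name)
    if resp.status_code == 201:
        return resp.body
    if resp.status_code != 409:
        raise ResourceProviderCreationFailed(
            'create failed for %s: HTTP %d' % (name, resp.status_code))

    # 409: either the UUID or the name is already taken. Look up by UUID.
    found = placement.get(uuid)
    if found.status_code == 200:
        # UUID conflict: somebody else created this provider, so reuse it.
        return found.body
    if found.status_code == 404:
        # Assumption of this sketch: no provider with this UUID means the
        # 409 came from a *name* conflict, which is an error condition.
        raise ResourceProviderCreationFailed(
            'name %s is already used by another provider' % name)
    raise ResourceProviderRetrievalFailed(
        'lookup failed for %s: HTTP %d' % (uuid, found.status_code))
```

With this shape a caller can no longer silently proceed with None; it either gets a provider record back or has to catch one of the two exceptions (or let it propagate to wherever errors are logged).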

