Reviewed: https://review.opendev.org/700186
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Submitter: Zuul
Branch: master

commit 0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Author: Matt Riedemann <[email protected]>
Date: Fri Dec 20 10:03:23 2019 -0500

    Handle cell failures in get_compute_nodes_by_host_or_node

    get_compute_nodes_by_host_or_node uses the scatter_gather_cells
    function but was not handling the case that a failure result
    was returned, which could be the called function raising some
    exception or the cell timing out. This causes issues when the
    caller of get_compute_nodes_by_host_or_node expects to get a
    ComputeNodeList back and can do something like len(nodes) on it
    which fails when the result is not iterable.

    To be clear, if a cell is down there are going to be problems
    which likely result in a NoValidHost error during scheduling, but
    this avoids an ugly TypeError traceback in the scheduler logs.

    Change-Id: Ia54b5adf0a125ae1f9b86887a07dd1d79821dd54
    Closes-Bug: #1857139

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1857139

Title:
  TypeError: object of type 'object' has no len() from
  resources_from_request_spec when cells are down

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  Confirmed

Bug description:
  Seen here:

  https://zuul.opendev.org/t/openstack/build/c187e207bc1c48a0a7fa49ef9798b696/log/logs/screen-n-sch.txt.gz#2529

  cell1 is down so the call to scatter_gather_cells in
  get_compute_nodes_by_host_or_node yields a result but it's not a
  ComputeNodeList, it's the did_not_respond_sentinel object:

  https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/host_manager.py#L705

  https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/context.py#L454

  which results in an error here:

  https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/utils.py#L612

  The HostManager.get_compute_nodes_by_host_or_node method should filter
  out fail/timeout results from the scatter_gather_cells results. We'll
  get a NoValidHost either way but this is better than the traceback
  with the TypeError in it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1857139/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
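
For readers who want to see the failure mode and the kind of filtering the bug description asks for, here is a minimal standalone sketch. It does not import nova and is not the merged patch. The names DID_NOT_RESPOND, is_cell_failure and first_usable_nodes are hypothetical stand-ins, plain lists stand in for ComputeNodeList, and the dict keyed by cell name is an assumed shape for the scatter/gather results.

```python
# Hypothetical stand-in for the sentinel returned for a cell that timed out.
DID_NOT_RESPOND = object()


def is_cell_failure(result):
    """Return True if a per-cell result is a timeout sentinel or an exception."""
    return result is DID_NOT_RESPOND or isinstance(result, Exception)


def first_usable_nodes(results_by_cell):
    """Return the first non-empty, non-failure result, or an empty list.

    results_by_cell is assumed to map a cell identifier to whatever that
    cell returned: a list of nodes, the sentinel, or a raised exception.
    """
    return next(
        (nodes for nodes in results_by_cell.values()
         if not is_cell_failure(nodes) and nodes),
        [],
    )


# Without filtering, calling len() on the sentinel raises the TypeError
# described in the bug title.
try:
    len(DID_NOT_RESPOND)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# With filtering, a down cell simply contributes nothing.
all_down = first_usable_nodes({"cell1": DID_NOT_RESPOND})
mixed = first_usable_nodes({
    "cell0": [],
    "cell1": DID_NOT_RESPOND,
    "cell2": ValueError("db unreachable"),
    "cell3": ["compute-1"],
})
```

With every cell down the caller gets an empty list, so len() works and scheduling can go on to fail with NoValidHost rather than a TypeError traceback.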

