Reviewed:  https://review.opendev.org/689861
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3f9411071d4c1a04ab0b68fd635597bf6959c0ca
Submitter: Zuul
Branch:    master
commit 3f9411071d4c1a04ab0b68fd635597bf6959c0ca
Author: Sean Mooney <[email protected]>
Date:   Mon Oct 21 16:17:17 2019 +0000

    Disable NUMATopologyFilter on rebuild

    This change leverages the new NUMA constraint checking added in
    I0322d872bdff68936033a6f5a54e8296a6fb3434 to allow the
    NUMATopologyFilter to be skipped on rebuild. Because the new rebuild
    behavior enforces that no changes to the NUMA constraints are allowed
    on rebuild, we no longer need to execute the NUMATopologyFilter.

    Previously the NUMATopologyFilter would process a rebuild request as
    if it were a request to spawn a new instance, since the
    numa_fit_instance_to_host function is not rebuild aware. As such,
    prior to this change a rebuild would only succeed if the host had
    enough additional capacity for a second instance on the same host
    meeting the requirements of the new image and the existing flavor.
    This behavior was incorrect on two counts, because a rebuild uses a
    noop claim. First, the resource usage cannot change, so it was
    incorrect to require additional capacity to rebuild an instance.
    Second, it was incorrect not to assert that the resource usage
    remained the same. I0322d872bdff68936033a6f5a54e8296a6fb3434
    addressed guarding the rebuild against altering the resource usage,
    and this change allows the in-place rebuild. This change uncovered a
    latent bug that will be addressed in a follow-up change; the
    functional tests have been updated to note the incorrect behavior.

    Change-Id: I48bccc4b9adcac3c7a3e42769c11fdeb8f6fd132
    Closes-Bug: #1804502
    Implements: blueprint inplace-rebuild-of-numa-instances

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1804502

Title:
  Rebuild server with NUMATopologyFilter enabled fails (in some cases)

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
  A server rebuild will fail in the nova scheduler on the
  NUMATopologyFilter if the compute does not have enough spare capacity,
  even though the running server is clearly already accounted for in
  that calculation. To resolve the issue, the NUMATopologyFilter needs
  to skip its fitting check when the scheduling request is due to a
  rebuild. As it stands, such a rebuild fails with a "No valid host was
  found" error. (This concerns rebuild, not resize; the two should not
  be confused.)

  Steps to reproduce
  ==================
  1. Create a flavor whose metadata pins it to a specific compute (use a
     host aggregate with the same key:value metadata) and make sure the
     flavor contains NUMA-topology-related metadata:
     hw:cpu_cores='1', hw:cpu_policy='dedicated', hw:cpu_sockets='6',
     hw:cpu_thread_policy='prefer', hw:cpu_threads='1',
     hw:mem_page_size='large', location='area51'
     (these extra specs are sketched as a Python dict after the "Actual
     result" section below)
  2. Create a server on that compute (preferably using a heat stack).
  3. (Try to) rebuild the server using a stack update.
  4. Issue reproduced.

  Expected result
  ===============
  Server in an active, running state (if the image was replaced in the
  rebuild command, then with a reference to the new image in the server
  details).

  Actual result
  =============
  Server in error state with a "no valid host found" error.

  Message: No valid host was found. There are not enough hosts
  available.
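As an illustration of step 1 above, here is a minimal Python sketch of the
flavor extra specs involved. The values are copied from this report; the
needs_numa_topology() helper and its heuristic are assumptions added for
clarity, not nova code.

# Illustrative only: the flavor extra specs from the reproduction steps as a
# plain dict. The hw:* keys are what give the guest a NUMA topology and make
# the NUMATopologyFilter relevant; 'location' is only aggregate metadata
# matched by AggregateInstanceExtraSpecsFilter.
flavor_extra_specs = {
    'hw:cpu_policy': 'dedicated',      # pin guest vCPUs to dedicated host pCPUs
    'hw:cpu_thread_policy': 'prefer',  # prefer thread siblings when pinning
    'hw:cpu_sockets': '6',             # guest CPU topology hints
    'hw:cpu_cores': '1',
    'hw:cpu_threads': '1',
    'hw:mem_page_size': 'large',       # back guest memory with huge pages
    'location': 'area51',              # host-aggregate key used to pick the compute
}


def needs_numa_topology(extra_specs):
    """Rough, assumed heuristic: CPU pinning or an explicit page size implies NUMA."""
    return (extra_specs.get('hw:cpu_policy') == 'dedicated'
            or 'hw:mem_page_size' in extra_specs)


assert needs_numa_topology(flavor_extra_specs)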
  Code: 500

  Details:
    File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 966, in rebuild_instance
      return_alternates=False)
    File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 723, in _schedule_instances
      return_alternates=return_alternates)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 907, in wrapped
      return func(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 53, in select_destinations
      instance_uuids, return_objects, return_alternates)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
      return getattr(self.instance, __name)(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations
      instance_uuids, return_objects, return_alternates)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 158, in select_destinations
      return cctxt.call(ctxt, 'select_destinations', **msg_args)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
      retry=self.retry)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
      retry=retry)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
      call_monitor_timeout, retry=retry)
    File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 575, in _send
      raise result

  Environment
  ===========
  Detected in the Rocky release.
  KVM hypervisor
  Ceph storage
  Neutron networks

  Logs & Configs
  ==============
  In nova.conf:
  enabled_filters=AggregateInstanceExtraSpecsFilter,RetryFilter,AvailabilityZoneFilter,NUMATopologyFilter,PciPassthroughFilter,RamFilter,ComputeFilter,ImagePropertiesFilter,CoreFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,DiskFilter,ComputeCapabilitiesFilter,AggregateRamFilter,SameHostFilter,DifferentHostFilter

  logs: tbd

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1804502/+subscriptions
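Finally, relating to the commit message at the top of this mail: a minimal,
self-contained Python sketch of the kind of short-circuit it describes, i.e. a
scheduler filter passing a host without re-fitting the guest when the request
comes from a rebuild. This is not the actual nova implementation; the
'_nova_check_type' scheduler hint and the request_is_rebuild() helper are
assumed names used only for illustration.

# Illustrative sketch only, not the actual nova code. It shows a scheduler
# filter returning True without re-fitting the guest when the scheduling
# request was triggered by a rebuild.

class FakeRequestSpec(object):
    def __init__(self, scheduler_hints=None):
        self.scheduler_hints = scheduler_hints or {}


def request_is_rebuild(spec_obj):
    """Return True if this request spec describes a rebuild of an existing server."""
    return spec_obj.scheduler_hints.get('_nova_check_type') == ['rebuild']


class NUMATopologyFilterSketch(object):
    def host_passes(self, host_state, spec_obj):
        if request_is_rebuild(spec_obj):
            # A rebuild keeps the same flavor and the same host and uses a
            # noop claim, so requiring room for a "second" copy of the guest
            # (as the pre-fix filter effectively did) would be wrong.
            return True
        # For a new boot or a migration the filter would try to fit the
        # requested NUMA topology into the host's free cells here (nova's
        # numa_fit_instance_to_host); stubbed out in this sketch.
        return self._numa_fits(host_state, spec_obj)

    def _numa_fits(self, host_state, spec_obj):
        return True  # placeholder for the real fitting logic


# A rebuild-triggered request passes without any fitting attempt:
rebuild_spec = FakeRequestSpec({'_nova_check_type': ['rebuild']})
assert NUMATopologyFilterSketch().host_passes(host_state=None, spec_obj=rebuild_spec)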

