The related issue is that the scheduler was not filtering out deleted compute node records when pulling them from the cell DB:

https://github.com/openstack/nova/blob/d87852ae6a1987b6faa3cb5851f9758b47ef4636/nova/objects/compute_node.py#L443

Because ^ that query doesn't filter out deleted records.

Granted, if the resource provider record in placement was cleaned up
properly, we wouldn't have gotten that far anyway, but it's still an
issue.

** Changed in: nova
       Status: Invalid => Triaged

** Changed in: nova
   Importance: Undecided => Medium

** Summary changed:

- Deleting a service with nova-compute binary doesn't remove compute node
+ Scheduler doesn't filter out deleted compute node records based on placement RP UUIDs

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Changed in: nova/ocata
       Status: New => Triaged

** Changed in: nova/pike
       Status: New => Triaged

** Changed in: nova/queens
       Status: New => Triaged

** Changed in: nova/rocky
       Status: New => Incomplete

** Changed in: nova/rocky
       Status: Incomplete => Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793533

Title:
  Scheduler doesn't filter out deleted compute node records based on
  placement RP UUIDs

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Status in OpenStack Compute (nova) rocky series:
  Triaged

Bug description:
  If you are taking a nova-compute service out of service permanently,
  the logical steps would be:

  1) Take down the service
  2) Delete it from the service list (nova service-delete <uuid>)

  However, this does not delete the compute node record which stays
  forever, leading to the scheduler to always complain about it as well:

  2018-09-20 13:15:45.312 131035 WARNING nova.scheduler.host_manager
  [req-c4a7c383-c606-48a7-b870-cc143710114a
  234412d3482f4707877ca696e105bf5b acb15d2ffaae4eda98580c7b874d7f89 -
  default default] No compute service record found for host
  <snip>.vexxhost.net

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L716-L720

  We should be deleting the compute node if a nova-compute binary is
  deleted, or that section should automatically clean up while warning
  (because service records can be rebuilt anyways?)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793533/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
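
For illustration only, here is a minimal sketch of the kind of filter the comment above is asking for. It is not nova's code: the table layout, the helper names (get_compute_nodes_by_uuids, build_demo_db) and the sample rows are all hypothetical, and it uses the stdlib sqlite3 module rather than nova's DB API. It assumes the usual soft-delete convention where a "deleted" column is 0 for live rows and non-zero once a row has been soft-deleted.

```python
import sqlite3


def get_compute_nodes_by_uuids(conn, uuids, include_deleted=False):
    """Return (uuid, host) rows for the given resource provider UUIDs.

    Hypothetical helper, not part of nova. By default soft-deleted rows
    (deleted != 0) are excluded, which is the filter the bug asks for.
    """
    if not uuids:
        return []
    placeholders = ",".join("?" for _ in uuids)
    sql = f"SELECT uuid, host FROM compute_nodes WHERE uuid IN ({placeholders})"
    if not include_deleted:
        # Assumed convention: 0 means "live", anything else means soft-deleted.
        sql += " AND deleted = 0"
    sql += " ORDER BY uuid"
    return conn.execute(sql, list(uuids)).fetchall()


def build_demo_db():
    """Create an in-memory table with one live and one soft-deleted node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "id INTEGER PRIMARY KEY, uuid TEXT, host TEXT, deleted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO compute_nodes (id, uuid, host, deleted) VALUES (?, ?, ?, ?)",
        [
            (1, "rp-uuid-1", "compute-a", 0),  # live record
            (2, "rp-uuid-2", "compute-b", 2),  # soft-deleted record
        ],
    )
    return conn
```

With the demo data, calling get_compute_nodes_by_uuids(conn, ["rp-uuid-1", "rp-uuid-2"]) returns only the compute-a row, while passing include_deleted=True returns both rows, which mirrors the unfiltered behaviour described in this bug.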

