Public bug reported:

Steps to reproduce
==================

Precondition: A fresh OpenStack deployment is needed. The database tables
nova.compute_nodes and nova_api.host_mappings must be empty; in other words,
no baremetal nodes have been added to the ironic database yet. It must be an
HA deployment with at least two ironic-conductors running on different
servers.

Steps:

1. Create a baremetal node: "openstack baremetal node create ..."
2. Change the node's state to manageable.
3. After some time, "nova hypervisor-list" should list a hypervisor with the
   same UUID as the baremetal node.
3.1 The database should look like this:

MariaDB [(none)]> select uuid, host, mapped from nova.compute_nodes;
+--------------------------------------+-------------+--------+
| uuid                                 | host        | mapped |
+--------------------------------------+-------------+--------+
| d394aa91-3544-417c-acab-916a22e5a5b5 | ironic.aio1 |      1 |
+--------------------------------------+-------------+--------+

MariaDB [(none)]> select * from nova_api.host_mappings;
+---------------------+------------+----+---------+-------------+
| created_at          | updated_at | id | cell_id | host        |
+---------------------+------------+----+---------+-------------+
| 2019-04-22 09:14:23 | NULL       | 22 |       7 | ironic.aio1 |
+---------------------+------------+----+---------+-------------+

4. Call "nova hypervisor-show <hypervisor UUID>" to find out which server
   ironic-conductor is running on. Log in to that server and stop
   ironic-conductor. This is needed to force the hashring to rebuild its
   state. Wait for about five minutes.
5. Check the output of "nova hypervisor-list". The hypervisor is absent.

Result
==================

Look inside the database (see below). ironic.aio3 took over the baremetal
node, so nova changed the 'host' field of the compute node
(d394aa91-3544-417c-acab-916a22e5a5b5) to 'ironic.aio3'. Because mapped = 1,
'nova-manage cell_v2 discover_hosts' (run periodically, see
https://bugs.launchpad.net/nova/+bug/1715646) does not try to create a host
mapping.
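
The inconsistency described in this report can be checked mechanically: a
compute node is affected when its 'host' value has no row in
nova_api.host_mappings. The following is a minimal sketch in Python, assuming
the rows of both tables have already been fetched as plain dicts. The
function name find_unmapped_compute_nodes and the dict layout are
hypothetical and are not part of nova.

```python
def find_unmapped_compute_nodes(compute_nodes, host_mappings):
    """Return the compute node rows whose host has no host mapping row."""
    mapped_hosts = {row["host"] for row in host_mappings}
    return [node for node in compute_nodes if node["host"] not in mapped_hosts]


# Rows as reported after the hashring rebuild (values taken from this report).
compute_nodes = [
    {
        "uuid": "d394aa91-3544-417c-acab-916a22e5a5b5",
        "host": "ironic.aio3",
        "mapped": 1,
    },
]
host_mappings = [{"id": 22, "cell_id": 7, "host": "ironic.aio1"}]

for node in find_unmapped_compute_nodes(compute_nodes, host_mappings):
    print("%s on %s has no host mapping" % (node["uuid"], node["host"]))
```

With the rows above, the single compute node is reported, because its host
is 'ironic.aio3' while the only mapping is for 'ironic.aio1'.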

MariaDB [(none)]> select uuid, host, mapped from nova.compute_nodes;
+--------------------------------------+-------------+--------+
| uuid                                 | host        | mapped |
+--------------------------------------+-------------+--------+
| d394aa91-3544-417c-acab-916a22e5a5b5 | ironic.aio3 |      1 |
+--------------------------------------+-------------+--------+

MariaDB [(none)]> select * from nova_api.host_mappings;
+---------------------+------------+----+---------+-------------+
| created_at          | updated_at | id | cell_id | host        |
+---------------------+------------+----+---------+-------------+
| 2019-04-22 09:14:23 | NULL       | 22 |       7 | ironic.aio1 |
+---------------------+------------+----+---------+-------------+

2019-04-22 19:54:00.813 8 WARNING nova.compute.resource_tracker [req-1ded2c35-d0e4-4719-a15d-3a83594bab1c - - - - -] No compute node record for ironic.aio3:5f9c2619-30bb-40d2-8b62-8923f04d90f2: ComputeHostNotFound_Remote: Compute host ironic.aio3 could not be found.
2019-04-22 19:54:00.831 8 INFO nova.compute.resource_tracker [req-1ded2c35-d0e4-4719-a15d-3a83594bab1c - - - - -] ComputeNode 5f9c2619-30bb-40d2-8b62-8923f04d90f2 moving from ironic.aio1 to ironic.aio3
2019-04-22 19:54:00.891 8 DEBUG nova.virt.ironic.driver [req-1ded2c35-d0e4-4719-a15d-3a83594bab1c - - - - -] Using cache for node 5f9c2619-30bb-40d2-8b62-8923f04d90f2, age: 0.0979330539703 _node_from_cache /usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py:860

The missing record in the host_mappings table causes nova to print the
"Unable to find service" DEBUG message (see below). The compute node becomes
'invisible'.

See the source code of
nova/api/openstack/compute/hypervisors.py:HypervisorsController._get_hypervisors

    def _get_hypervisors(self, req, detail=False, limit=None, marker=None,
                         links=False):
        """Get hypervisors for the given request.

        :param req: nova.api.openstack.wsgi.Request for the GET request
        ...
        hypervisors_list = []
        for hyp in compute_nodes:
            try:
                instances = None
                if with_servers:
                    instances = self.host_api.instance_get_all_by_host(
                        context, hyp.host)
                service = self.host_api.service_get_by_compute_host(
                    context, hyp.host)
                hypervisors_list.append(
                    self._view_hypervisor(
                        hyp, service, detail, req, servers=instances))
            except (exception.ComputeHostNotFound,
                    exception.HostMappingNotFound):
                # The compute service could be deleted which doesn't delete
                # the compute node record, that has to be manually removed
                # from the database so we just ignore it when listing nodes.
                LOG.debug('Unable to find service for compute node %s. The '
                          'service may be deleted and compute nodes need to '
                          'be manually cleaned up.', hyp.host)

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1825876

Title:
  Ironic hypervisor disappears once hashring got rebuilt

Status in OpenStack Compute (nova):
  New

