Reviewed:  https://review.openstack.org/139770
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=0354640587cde740aa0299c722f019ae1c01e05d
Submitter: Jenkins
Branch:    master

commit 0354640587cde740aa0299c722f019ae1c01e05d
Author: Adam Gandelman <[email protected]>
Date:   Fri Dec 5 16:49:12 2014 -0800

    Move ironic ssh key creation early in preparation

    SSH creds should be in place before nodes are enrolled. If not,
    ironic cannot sync power state causing nova to skip nodes in its
    resource tracker.

    Change-Id: I6b98ae57ce33783f69e2cf9ba357807d384b3012
    Closes-bug: #1398128

** Changed in: devstack
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1398128

Title:
  ironic tempest tests periodically failing: No valid host was found

Status in devstack - openstack dev environments:
  Fix Released
Status in OpenStack Bare Metal Provisioning Service (Ironic):
  Invalid
Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  This was noticed on the stable/juno ironic sideways grenade jobs, but
  is also confirmed to be happening on the
  check-tempest-dsvm-ironic-parallel-nv job, which runs a similarly
  configured tempest run against Ironic:

  http://logs.openstack.org/84/137684/1/check/check-grenade-dsvm-ironic-sideways/6d118bc/

  A number of the early compute tests will fail to spawn an instance,
  getting a scheduling error on the client side:

  BuildErrorException: Server %(server_id)s failed to build and is in ERROR status
  Details: Server eb81ee40-ceba-484d-b665-92ec3bf4fedd failed to build and is in ERROR status
  Details: {u'message': u'No valid host was found. ', u'created': u'2014-11-27T17:44:05Z', u'code': 500}

  Looking through the nova logs, the request never even makes it to the
  nova-scheduler.
  The last error is reported in conductor:

  2014-11-27 17:44:01.005 WARNING nova.scheduler.driver [req-a3c046e5-66db-4bca-a6f8-2263763e49a6 SecurityGroupsTestJSON-2119055496 SecurityGroupsTestJSON-1381566740] [instance: 9008811a-f400-42ae-98d5-caf828fa34dc] NoValidHost exception with message: 'No valid host was found.'

  Looking at the timestamps of the requests, the first instance is
  requested at 17:44:00:

  2014-11-27 17:44:00.944 24730 DEBUG tempest.common.rest_client [req-a3c046e5-66db-4bca-a6f8-2263763e49a6 None] Request (SecurityGroupsTestJSON:test_server_security_groups): 202 POST http://127.0.0.1:8774/v2/adf4838f0d15462da4601a5d853eafbf/servers 0.515s

  However, on the nova-compute side, the resource tracker is not
  updated to include the enlisted Ironic nodes until much later. The
  first time the tracker contains any of the ironic resources is at
  17:44:06:

  2014-11-27 17:44:06.224 21645 AUDIT nova.compute.resource_tracker [-] Total physical ram (MB): 512, total allocated virtual ram (MB): 0

  So there's a race between the resource tracker's initial inclusion of
  available resources and Tempest running the first set of tests that
  require an instance. This can be worked around in a couple of ways:

  * Adjust the periodic task interval on nova-compute to update much
    more frequently, though this will only narrow the window.

  * Have tempest run an admin 'nova hypervisor-stats' call on the client
    side and wait for resources before running any instances (in the
    case of baremetal only).

  * Adjust devstack's nova cpu deployment to spin until hypervisor-stats
    reflect the ironic node parameters.

To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack/+bug/1398128/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
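
The "wait for resources" workarounds listed in the bug description
amount to a poll-until-ready loop. The following is only a minimal
sketch of that idea, not what devstack or tempest actually do. The
function name wait_for_hypervisor_resources and the get_stats callable
are hypothetical; get_stats is assumed to return a dict shaped like the
output of 'nova hypervisor-stats', with "count" and "memory_mb" keys.

```python
import time


def wait_for_hypervisor_resources(get_stats, expected_count, expected_ram_mb,
                                  timeout=120.0, interval=2.0,
                                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_stats() until the expected resources show up.

    get_stats is a hypothetical caller-supplied callable; it is assumed
    to return a dict with "count" (number of hypervisors) and
    "memory_mb" (total RAM), as reported by hypervisor statistics.
    Raises TimeoutError if the resources never appear within timeout.
    """
    deadline = clock() + timeout
    while True:
        stats = get_stats()
        # Ready once the tracker reports at least the enrolled nodes.
        if (stats.get("count", 0) >= expected_count
                and stats.get("memory_mb", 0) >= expected_ram_mb):
            return stats
        if clock() >= deadline:
            raise TimeoutError(
                "hypervisor stats never reached count=%d, memory_mb=%d"
                % (expected_count, expected_ram_mb))
        # Not ready yet: wait before asking again.
        sleep(interval)
```

For the run in the logs above, a caller would wait for at least one
hypervisor and 512 MB of RAM (the value the resource tracker eventually
reports) before starting the first instance. How get_stats obtains its
data, for example by wrapping the CLI or the compute API, is left to the
caller.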

