Public bug reported:

In various CI jobs, some tests fail because the instance ends up in ERROR state. After some checking, it seems to me that the issue is in the scheduler, where I see errors like:

  Oct 27 12:29:31.361318 ubuntu-bionic-ovh-bhs1-0012520618 nova-scheduler[19272]: WARNING nova.context [None req-9e056bb6-787f-49fe-8896-41285d7418b0 tempest-ServersTestManualDisk-1766481780 tempest-ServersTestManualDisk-1766481780] Timed out waiting for response from cell 43118bd8-e32a-4aa4-b93a-37969e41dba6

or

  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: WARNING nova.context [None req-37a8fab9-7d64-4ef4-9464-f70e5ed35d53 tempest-ServersTestJSON-1383648109 tempest-ServersTestJSON-1383648109] Timed out waiting for response from cell: CellTimeout: Timeout waiting for response from cell
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context Traceback (most recent call last):
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/opt/stack/new/nova/nova/context.py", line 443, in scatter_gather_cells
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     cell_uuid, result = queue.get()
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 322, in get
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return waiter.wait()
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 141, in wait
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return get_hub().switch()
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 298, in switch
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context     return self.greenlet.switch()
  Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context CellTimeout: Timeout waiting for response from cell

Looking at logstash, it seems that this happens quite often on various jobs:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Timed%20out%20waiting%20for%20response%20from%20cell%5C%22

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1850291

Title:
  CI jobs fails due to instances in ERROR state

Status in OpenStack Compute (nova):
  New
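
For readers unfamiliar with the pattern behind the traceback above: the failing frame is a blocking queue.get() inside scatter_gather_cells, which gives up with CellTimeout when a cell does not answer in time. The following is only a minimal sketch of a scatter-gather with a timeout, written with the Python standard library (threading and queue) rather than eventlet. It is not nova's implementation; scatter_gather, fake_query, DID_NOT_RESPOND and the cell names are hypothetical and exist purely for illustration.

```python
import queue
import threading
import time

# Hypothetical sentinel marking a cell that did not answer in time.
DID_NOT_RESPOND = object()


def scatter_gather(cells, fn, timeout):
    """Run fn(cell) for every cell in parallel and gather the results.

    Cells that have not answered within `timeout` seconds (measured for the
    whole gather, not per cell) are reported with DID_NOT_RESPOND instead of
    blocking the caller forever. Cell names must be unique and hashable.
    """
    results_queue = queue.Queue()

    def worker(cell):
        # If fn raises, nothing is queued and the cell is treated as timed out.
        results_queue.put((cell, fn(cell)))

    for cell in cells:
        threading.Thread(target=worker, args=(cell,), daemon=True).start()

    results = {}
    deadline = time.monotonic() + timeout
    while len(results) < len(cells):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            cell, result = results_queue.get(timeout=remaining)
        except queue.Empty:
            break
        results[cell] = result

    # Anything still missing is marked as not having responded.
    for cell in cells:
        results.setdefault(cell, DID_NOT_RESPOND)
    return results


def fake_query(cell):
    # Stand-in for a per-cell query; "slow-cell" simulates an overloaded cell.
    if cell == "slow-cell":
        time.sleep(2.0)
    return "hosts-of-" + cell


if __name__ == "__main__":
    out = scatter_gather(["fast-cell", "slow-cell"], fake_query, timeout=0.5)
    for cell, result in out.items():
        status = "timed out" if result is DID_NOT_RESPOND else result
        print(cell, "->", status)
```

In this sketch a slow cell degrades to a sentinel value that the caller has to check for, so one unresponsive cell does not block results from the others. Whether the timeouts in the CI jobs come from a slow database, an overloaded test node, or something else cannot be told from the logs quoted above.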

