OK looking at the stacktrace I see it's not the '_destroy_build_request' call that's blowing up on reschedule, it's the up-call to get the availability zone for the next chosen host from the list of alternates:
https://github.com/openstack/nova/blob/39b05ee9e34ae7e7c1854439f887588ec157bc69/nova/conductor/manager.py#L647 And if an AZ is not requested during server create, we are free to move the instance to another AZ during reschedule. So it seems we've fallen into a dreaded up-call hole here that needs to be tracked: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html #operations-requiring-upcalls ** Summary changed: - CantStartEngineError in cell conductor during rebuild + CantStartEngineError in cell conductor during reschedule - get_host_availability_zone up-call ** Changed in: nova Status: New => Triaged ** Changed in: nova Importance: Undecided => Medium ** Tags added: cells ** Also affects: nova/pike Importance: Undecided Status: New ** Also affects: nova/queens Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1781286 Title: CantStartEngineError in cell conductor during reschedule - get_host_availability_zone up-call Status in OpenStack Compute (nova): Triaged Status in OpenStack Compute (nova) pike series: New Status in OpenStack Compute (nova) queens series: New Bug description: In a stable/queens devstack environment with multiple PowerVM compute nodes, everytime I see this in [email protected] logs: Jul 11 15:48:57 myhostname nova-conductor[3796]: DEBUG nova.conductor.manager [None req-af22375c-f920-4747-bd2f-0de80ee69465 admin admin] Rescheduling: True {{(pid=4108) build_instances /opt/stack/nova/nova/conductor/manager.py:571}} it is shortly thereafter followed by: Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server [None req-af22375c-f920-4747-bd2f-0de80ee69465 admin admin] Exception during message handling: CantStartEngineError: No sql_connection parameter is established Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server Traceback (most recent call last): Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/opt/stack/nova/nova/conductor/manager.py", line 652, in build_instances Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server host.service_host)) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/opt/stack/nova/nova/availability_zones.py", line 95, in get_host_availability_zone Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server key='availability_zone') Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server result = fn(cls, context, *args, **kwargs) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/opt/stack/nova/nova/objects/aggregate.py", line 541, in get_by_host Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server _get_by_host_from_db(context, host, key=key)] Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 987, in wrapper Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server with self._transaction_scope(context): Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__ Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server return self.gen.next() Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1037, in _transaction_scope Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server context=context) as resource: Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__ Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server return self.gen.next() Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 640, in _session Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server bind=self.connection, mode=self.mode) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 404, in _create_session Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server self._start() Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 491, in _start Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server engine_args, maker_args) Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 513, in _setup_for_connection Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server "No sql_connection parameter is established") Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server CantStartEngineError: No sql_connection parameter is established Jul 11 15:48:57 myhostname nova-conductor[3796]: ERROR oslo_messaging.rpc.server The nova_cell1.conf does have [database]connection set: [database] connection = mysql+pymysql://root:[email protected]/nova_cell1?charset=utf8 This may be related to https://bugs.launchpad.net/nova/+bug/1736946 , though that is supposedly fixed in stable/queens and the trace is different, hence the new defect. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1781286/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : [email protected] Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp

