Reviewed:  https://review.opendev.org/666857
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a1a735bc6efa40d8277c9fc5339f3b74f968b58e
Submitter: Zuul
Branch:    master

commit a1a735bc6efa40d8277c9fc5339f3b74f968b58e
Author: Balazs Gibizer <[email protected]>
Date:   Fri Jun 21 16:48:14 2019 +0200

    Error out interrupted builds

    If the compute service is restarted while build requests are
    executing the instance_claim or waiting for the
    COMPUTE_RESOURCE_SEMAPHORE then those instances will be stuck forever
    in BUILDING state. If the instance already finished instance_claim
    then instance.host is set and when the compute restarts the instance
    is put to ERROR state.

    This patch changes compute service startup to put instances into
    ERROR state if they a) are in the BUILDING state, and b) have
    allocations on the compute resource provider, but c) do not have
    instance.host set to that compute.

    Change-Id: I856a3032c83fc2f605d8c9b6e5aa3bcfa415f96a
    Closes-Bug: #1833581

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1833581

Title:
  instance stuck in BUILD state if nova-compute is restarted

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed
Status in OpenStack Compute (nova) train series:
  Confirmed

Bug description:
  Description
  ===========

  An instance stays stuck in the BUILD state indefinitely if the
  nova-compute service is restarted while the build is in progress.
  Even after instance_build_timeout expires, the instance is not put
  into the ERROR state.

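The startup check described in the commit message above (conditions a, b
and c) can be illustrated with a minimal sketch. This is not the nova
implementation: the Instance class, the error_out_interrupted_builds
function and the allocated_uuids parameter are hypothetical names chosen
for illustration, and the set of instance UUIDs holding allocations on the
compute's resource provider is assumed to have been fetched already.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

BUILDING = "building"
ERROR = "error"


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record (not nova's object)."""
    uuid: str
    vm_state: str
    host: Optional[str]


def error_out_interrupted_builds(
    instances: Iterable[Instance],
    allocated_uuids: Set[str],
    this_host: str,
) -> List[str]:
    """Mark interrupted builds as ERROR and return their UUIDs.

    allocated_uuids is assumed to contain the UUIDs of all consumers that
    have allocations on this compute's resource provider.
    """
    errored = []
    for inst in instances:
        interrupted = (
            inst.vm_state == BUILDING           # a) still building
            and inst.uuid in allocated_uuids    # b) has allocations here
            and inst.host != this_host          # c) host never set to us
        )
        if interrupted:
            inst.vm_state = ERROR
            errored.append(inst.uuid)
    return errored


if __name__ == "__main__":
    # vm3 from the report: building, host is NULL, assumed to hold allocations.
    vm3 = Instance("9ee76601-4a61-4682-86f1-743dac2b05e6", BUILDING, None)
    print(error_out_interrupted_builds([vm3], {vm3.uuid}, "ubuntu"))
```

Condition b) is what ties an instance with no host to a particular
compute: without the allocation check, a restarting compute could not tell
which host-less BUILDING instances were meant for it.
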
  Steps to reproduce
  ==================

  1) Start 10 VMs in parallel to increase the chance of hitting the bug:

  $ for NUM in `seq 1 1 10`; do openstack server create --flavor c1 --image cirros-0.4.0-x86_64-disk --availability-zone nova:ubuntu vm$NUM & done

  2) When the first instance reaches the BUILD state, restart the
  nova-compute service:

  $ sudo systemctl restart [email protected]

  3) Observe the instance states after the compute service is up again.

  Expected result
  ===============

  Instances are either in ACTIVE or in ERROR state.

  Actual result
  =============

  Some instances are stuck in BUILD state.

  Environment
  ===========

  All-in-one devstack build from recent nova master
  61558f274842b149044a14bbe7537b9f278035fd

  Logs & Configs
  ==============

  stack@ubuntu:~$ openstack server list
  +--------------------------------------+------+--------+------------------------------------+--------------------------+-----------+
  | ID                                   | Name | Status | Networks                           | Image                    | Flavor    |
  +--------------------------------------+------+--------+------------------------------------+--------------------------+-----------+
  | 9ee76601-4a61-4682-86f1-743dac2b05e6 | vm3  | BUILD  |                                    | cirros-0.4.0-x86_64-disk | cirros256 |
  | e459beae-ccb5-4781-b938-2dff68e33bf7 | vm9  | ACTIVE | public=2001:db8::181, 172.24.4.44  | cirros-0.4.0-x86_64-disk | cirros256 |
  | 562f44db-cd51-4516-bce9-598bd29c6310 | vm10 | ERROR  | public=2001:db8::3a1, 172.24.4.196 | cirros-0.4.0-x86_64-disk | cirros256 |
  | 73f1e2c6-78a1-44c5-b178-7adcf9bf58a0 | vm5  | ERROR  | public=2001:db8::21, 172.24.4.177  | cirros-0.4.0-x86_64-disk | cirros256 |
  | 1b01acfc-b798-48f9-b808-6cfd0d5cd3fb | vm6  | ERROR  | public=2001:db8::3e1, 172.24.4.20  | cirros-0.4.0-x86_64-disk | cirros256 |
  | c709e3bf-9c71-4f64-bad3-e9e07e911f62 | vm7  | ERROR  | public=2001:db8::231, 172.24.4.46  | cirros-0.4.0-x86_64-disk | cirros256 |
  | 538d2534-98f1-4e11-9bbb-b4e74bab8c65 | vm4  | ERROR  | public=2001:db8::3e9, 172.24.4.157 | cirros-0.4.0-x86_64-disk | cirros256 |
  | ed74eb32-00fe-4f24-9379-c57c04ce9af1 | vm2  | ERROR  | public=2001:db8::f5, 172.24.4.53   | cirros-0.4.0-x86_64-disk | cirros256 |
  | 582b5356-4f3d-42ed-937e-966580303af0 | vm8  | ERROR  | public=2001:db8::92, 172.24.4.16   | cirros-0.4.0-x86_64-disk | cirros256 |
  | ae36ffca-e4d6-4353-8e7e-41db500a5e0d | vm1  | ERROR  | public=2001:db8::1cf, 172.24.4.203 | cirros-0.4.0-x86_64-disk | cirros256 |
  +--------------------------------------+------+--------+------------------------------------+--------------------------+-----------+

  stack@ubuntu:~$ openstack server show 9ee76601-4a61-4682-86f1-743dac2b05e6
  +-------------------------------------+-----------------------------------------------------------------+
  | Field                               | Value                                                           |
  +-------------------------------------+-----------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                          |
  | OS-EXT-AZ:availability_zone         | nova                                                            |
  | OS-EXT-SRV-ATTR:host                | None                                                            |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | None                                                            |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000004c                                               |
  | OS-EXT-STS:power_state              | NOSTATE                                                         |
  | OS-EXT-STS:task_state               | None                                                            |
  | OS-EXT-STS:vm_state                 | building                                                        |
  | OS-SRV-USG:launched_at              | None                                                            |
  | OS-SRV-USG:terminated_at            | None                                                            |
  | accessIPv4                          |                                                                 |
  | accessIPv6                          |                                                                 |
  | addresses                           |                                                                 |
  | config_drive                        |                                                                 |
  | created                             | 2019-06-19T02:30:16Z                                            |
  | flavor                              | cirros256 (c1)                                                  |
  | hostId                              |                                                                 |
  | id                                  | 9ee76601-4a61-4682-86f1-743dac2b05e6                            |
  | image                               | cirros-0.4.0-x86_64-disk (8b88f518-ab48-4859-8e8c-6988911ce9bd) |
  | key_name                            | None                                                            |
  | name                                | vm3                                                             |
  | progress                            | 0                                                               |
  | project_id                          | 2fc0b14ea1e041998f420ec85a89314d                                |
  | properties                          |                                                                 |
  | status                              | BUILD                                                           |
  | updated                             | 2019-06-19T02:30:18Z                                            |
  | user_id                             | 262d29f5f0c3445abbde89723b5f01ee                                |
  | volumes_attached                    |                                                                 |
  +-------------------------------------+-----------------------------------------------------------------+

  stack@ubuntu:~$
  mysql> select uuid, host from instances where instances.uuid='9ee76601-4a61-4682-86f1-743dac2b05e6';
  +--------------------------------------+------+
  | uuid                                 | host |
  +--------------------------------------+------+
  | 9ee76601-4a61-4682-86f1-743dac2b05e6 | NULL |
  +--------------------------------------+------+
  1 row in set (0.00 sec)

  Logs for 9ee76601-4a61-4682-86f1-743dac2b05e6:
  http://paste.openstack.org/show/753228/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1833581/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

