If you see a compute service flapping because of a network issue, you can
force it down:
https://docs.openstack.org/api-ref/compute/?expanded=update-forced-down-detail#update-forced-down
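
For illustration, here is a minimal sketch of what such a force-down request
could look like, built with the Python standard library only. It assumes the
legacy form of the call (PUT /os-services/force-down, available from compute
API microversion 2.11 up to 2.52; newer microversions address the service by
its UUID instead). The endpoint URL, token and host name are placeholders, and
the helper name build_force_down_request is hypothetical, not part of any
OpenStack client.

```python
import json
import urllib.request


def build_force_down_request(compute_endpoint, token, host, forced_down=True):
    """Build (but do not send) a force-down request for nova-compute on `host`.

    Hypothetical helper: `compute_endpoint` and `token` are placeholders that
    would come from your own Keystone catalog and authentication step.
    """
    body = {"host": host, "binary": "nova-compute", "forced_down": forced_down}
    return urllib.request.Request(
        url=compute_endpoint.rstrip("/") + "/os-services/force-down",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
            # The forced_down flag needs compute API microversion 2.11 or later.
            "X-OpenStack-Nova-API-Version": "2.11",
        },
        method="PUT",
    )


# Example values only; nothing is sent until urllib.request.urlopen(req) is called.
req = build_force_down_request("http://controller:8774/v2.1", "TOKEN", "compute-01")
```

Sending the same request with forced_down=False clears the flag once the host
has been checked and is healthy again.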

Once the compute service is down (either because it was forced down or
because the service group API reports it as down), you can indeed
evacuate the instance, and you would then have two different instances:
one on the original host and one on the new host.

That said, given the original host is down, you would restart the compute
service once the host is back up, right? If so, on startup the service then
checks for evacuated instances and deletes them from the original host:
https://github.com/openstack/nova/blob/a1f006d799d2294234d381395a9ae9c22a2d80b9/nova/compute/manager.py#L1531
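
As a rough illustration of that startup check (not nova's actual code), the
idea can be sketched as follows: look up the evacuation records whose source
is this host, and treat any guest still present locally that matches one of
them as a leftover to be destroyed. The Migration class and the
guests_to_destroy function are hypothetical simplifications; the real logic
also considers the state of each record and cleans up more than the guest
itself.

```python
from dataclasses import dataclass


@dataclass
class Migration:
    """Simplified stand-in for an evacuation record (hypothetical)."""
    instance_uuid: str
    source_compute: str
    migration_type: str


def guests_to_destroy(this_host, local_guest_uuids, migrations):
    """Return the local guests that were evacuated away from `this_host`."""
    evacuated = {
        m.instance_uuid
        for m in migrations
        if m.migration_type == "evacuation" and m.source_compute == this_host
    }
    # Only guests that are still defined locally need to be cleaned up.
    return sorted(uuid for uuid in local_guest_uuids if uuid in evacuated)


# Example with made-up data: vm-a was evacuated from compute-01, vm-b was not.
records = [
    Migration("vm-a", "compute-01", "evacuation"),
    Migration("vm-c", "compute-02", "evacuation"),
]
leftovers = guests_to_destroy("compute-01", ["vm-a", "vm-b"], records)
```

Because this check runs when the service starts, the leftover guest stays on
the old host until nova-compute is restarted there.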


** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1968555

Title:
  evacuate after network issue will cause vm running on two host

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Environment
  ===========
  OpenStack Queens + libvirt 4.5.0 + qemu 2.12 running on CentOS 7, with Ceph
RBD storage

  Description
  ===========
  If the management network of a compute host fails, nova-compute may be
reported as down while openstack-nova-compute.service is still running on
that host. If you now evacuate a VM from that host, the evacuation succeeds
and the VM runs on both the old host and the new host, even after the
management network of the old host recovers. This may cause errors in the VM.

  Steps to reproduce
  ==================
  1. Manually bring down the management network port of the compute host, e.g.
ifconfig eth0 down
  2. Once openstack compute service list shows nova-compute on that host as
down, evacuate one VM from that host:
  nova evacuate <vm's uuid>
  3. After the evacuation succeeds, the VM is running on two hosts.
  4. Manually bring the management network port of the old compute host back
up, e.g. ifconfig eth0 up. The VM is still running on this host; it is not
destroyed automatically unless you restart openstack-nova-compute.service on
that host.

  Expected result
  ===============
  Maybe we can add a periodic task to auto destroy this vm?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1968555/+subscriptions


