@nova team: https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#maximum-in-2023-1-antelope-and-2023-2-bobcat



If the compute service is down on the source node and a user tries to stop
an instance, the instance gets stuck in powering-off, so evacuation fails
with the message: Cannot ‘evacuate’ instance <instance-id> while it is in
task_state powering-off. It is now possible for evacuation to ignore the VM
task state. For more details, see bug 1978983.
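
As a rough illustration of the behaviour described above, here is a minimal Python sketch of a guard that rejects evacuation while a task is in flight, with a flag that models ignoring the task state. The names `check_can_evacuate`, `InstanceInvalidState` and `ignore_task_state` are hypothetical and are not Nova's actual API; only the error text is taken from the message quoted above.

```python
class InstanceInvalidState(Exception):
    """Raised when an action is not allowed in the instance's current state."""


def check_can_evacuate(instance_id, task_state, ignore_task_state=False):
    """Hypothetical guard, not Nova code.

    With ignore_task_state=False, any pending task (for example a stale
    'powering-off') blocks evacuation. With ignore_task_state=True, the
    task state is not considered at all.
    """
    if task_state is not None and not ignore_task_state:
        raise InstanceInvalidState(
            f"Cannot 'evacuate' instance {instance_id} "
            f"while it is in task_state {task_state}"
        )
    return True
```

With the flag left at its default, an instance stuck in `powering-off` is rejected; with the flag set, the same call is allowed through.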


** Also affects: nova
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => In Progress

** Summary changed:

- [Caracal][Offline][Masakari] -  Instance-HA partially working
+ [Caracal][Offline][Masakari/Nova] -  Instance-HA partially working

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2119126

Title:
  [Caracal][Offline][Masakari/Nova] -  Instance-HA partially working

Status in masakari:
  In Progress
Status in OpenStack Compute (nova):
  In Progress

Bug description:
  ++++++++++++
  ENV Details:
  ++++++++++++

  OSA Version: OFFLINE Caracal 2024.1
  OS: Ubuntu-22.04
  Tool: OpenStack-Ansible 
  Virtual setup 
  ++++++
  Issue:
  ++++++

  * Masakari is installed and running. 
  * Created an instance and set the `HA_Enabled=True` property.
  * The instance is running on the source node `cmpt001`; the destination node is `offline20241`.
  * On the source node, triggered the instance-HA operation as follows (let me know if there is a better way):

  ```
  #  Source compute 

  root@cmpt001:~# virsh list --all
   Id   Name                State
  -----------------------------------
   4    instance-0000001e   running
  root@cmpt001:~# 
  ```

  ```
  systemctl stop nova-compute.service
  systemctl stop pacemaker.service
  kill -9 $(ps -eaf | grep  instance-0000001e | awk '{print $2}' | head -n 1)
  systemctl stop corosync.service
  ```

  ```
  root@cmpt001:~# virsh list --all
   Id   Name                State
  ------------------------------------
   -    instance-0000001e   shut off
  root@cmpt001:~# 
  ```

  * Evacuation started; it took 15 seconds to migrate the instance.
  * The instance is evacuated; however, the instance on the destination node shows its status as ShutOff.

  # Source compute logs

  ```
  Jul 30 14:50:25 cmpt001.ct.lan masakari-instancemonitor[1344]: 2025-07-30 14:50:25.555 1344 INFO masakarimonitors.instancemonitor.libvirt_handler.callback [-] Libvirt Event: type=VM, hostname=cmpt001.ct.lan, uuid=969ca417-9111-42de-836c-eb883e52f131, time=2025-07-30 14:50:25.552570, event_id=LIFECYCLE, detail=STOPPED_FAILED)
  ```

  ```
  Jul 30 14:50:25 cmpt001.ct.lan masakari-instancemonitor[1344]: 2025-07-30 14:50:25.557 1344 INFO masakarimonitors.ha.masakari [-] Send a notification. {'notification': {'type': 'VM', 'hostname': 'cmpt001.ct.lan', 'generated_time': datetime.datetime(2025, 7, 30, 14, 50, 25, 552570), 'payload': {'event': 'LIFECYCLE', 'instance_uuid': '969ca417-9111-42de-836c-eb883e52f131', 'vir_domain_event': 'STOPPED_FAILED'}}}
  ```

  ```
  Jul 30 14:50:25 cmpt001.ct.lan masakari-instancemonitor[1344]: 2025-07-30 14:50:25.786 1344 INFO masakarimonitors.ha.masakari [-] Response: openstack.instance_ha.v1.notification.Notification(type=VM, hostname=cmpt001.ct.lan, generated_time=2025-07-30T14:50:25.552570, payload={'event': 'LIFECYCLE', 'instance_uuid': '969ca417-9111-42de-836c-eb883e52f131', 'vir_domain_event': 'STOPPED_FAILED'}, id=1, notification_uuid=3be2b8e5-ed78-4085-8898-54b3fd5a9f78, source_host_uuid=36cd2bdb-29e4-4cc4-9b10-a933e2608edc, status=new, created_at=2025-07-30T14:50:25.000000, updated_at=None, location=Munch({'cloud': '192.168.131.200', 'region_name': 'RegionOne', 'zone': None, 'project': Munch({'id': 'a5aebb0fbfc64ac49e3ea028e4f740dc', 'name': None, 'domain_id': None, 'domain_name': None})}))
  ```
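
For readers who want to inspect the notification body outside the logs, the following is a minimal Python sketch that rebuilds the dictionary shown in the "Send a notification" entry above. The helper name `build_vm_notification` is hypothetical and is not part of masakari-monitors; the keys and values are copied from the log.

```python
import datetime


def build_vm_notification(hostname, instance_uuid, vir_domain_event, generated_time):
    """Return a dict shaped like the VM notification body seen in the log.

    Hypothetical helper for illustration only.
    """
    return {
        "notification": {
            "type": "VM",
            "hostname": hostname,
            "generated_time": generated_time,
            "payload": {
                "event": "LIFECYCLE",
                "instance_uuid": instance_uuid,
                "vir_domain_event": vir_domain_event,
            },
        }
    }


# Values taken from the source compute log above.
body = build_vm_notification(
    hostname="cmpt001.ct.lan",
    instance_uuid="969ca417-9111-42de-836c-eb883e52f131",
    vir_domain_event="STOPPED_FAILED",
    generated_time=datetime.datetime(2025, 7, 30, 14, 50, 25, 552570),
)
print(body["notification"]["payload"]["vir_domain_event"])
```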

  # Destination compute + controller logs

  
  ```
  root@offline20241:~# openstack server show 969ca417-9111-42de-836c-eb883e52f131 --fit
  +-------------------------------------+--------------------------------------------------------------------------+
  | Field                               | Value                                                                    |
  +-------------------------------------+--------------------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                                   |
  | OS-EXT-AZ:availability_zone         | nova                                                                     |
  | OS-EXT-SRV-ATTR:host                | offline20241.ct.lan                                                      |
  | OS-EXT-SRV-ATTR:hostname            | nc-masak-002                                                             |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | offline20241.ct.lan                                                      |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000001e                                                        |
  | OS-EXT-SRV-ATTR:kernel_id           |                                                                          |
  | OS-EXT-SRV-ATTR:launch_index        | 0                                                                        |
  | OS-EXT-SRV-ATTR:ramdisk_id          |                                                                          |
  | OS-EXT-SRV-ATTR:reservation_id      | r-01be9wmi                                                               |
  | OS-EXT-SRV-ATTR:root_device_name    | /dev/vda                                                                 |
  | OS-EXT-SRV-ATTR:user_data           | None                                                                     |
  | OS-EXT-STS:power_state              | Running                                                                  |
  | OS-EXT-STS:task_state               | None                                                                     |
  | OS-EXT-STS:vm_state                 | active                                                                   |
  | OS-SRV-USG:launched_at              | 2025-07-30T14:52:30.000000                                               |
  | OS-SRV-USG:terminated_at            | None                                                                     |
  | accessIPv4                          |                                                                          |
  | accessIPv6                          |                                                                          |
  | addresses                           | provider141=192.168.141.91                                               |
  | config_drive                        |                                                                          |
  | created                             | 2025-07-30T14:46:27Z                                                     |
  | description                         | nc-masak-002                                                             |
  | flavor                              | description=, disk='0', ephemeral='0', , id='m1.tiny', is_disabled=,     |
  |                                     | is_public='True', location=, name='m1.tiny', original_name='m1.tiny',    |
  |                                     | ram='512', rxtx_factor=, swap='0', vcpus='1'                             |
  | hostId                              | 1cb881e0cf1cb53f01ef64ad3b04badf5e418abc5a42566593853d4a                 |
  | host_status                         | UP                                                                       |
  | id                                  | 969ca417-9111-42de-836c-eb883e52f131                                     |
  | image                               | N/A (booted from volume)                                                 |
  | key_name                            | None                                                                     |
  | locked                              | False                                                                    |
  | locked_reason                       | None                                                                     |
  | name                                | nc-masak-002                                                             |
  | progress                            | 0                                                                        |
  | project_id                          | 2f5a2a06638942cbaaeeb466b2e17e10                                         |
  | properties                          | HA_Enabled='True'                                                        |
  | security_groups                     | name='secgroup1'                                                         |
  | server_groups                       | []                                                                       |
  | status                              | ACTIVE                                                                   |
  | tags                                |                                                                          |
  | trusted_image_certificates          | None                                                                     |
  | updated                             | 2025-07-30T15:05:57Z                                                     |
  | user_id                             | cfc72fc0bc3d4ba09ef9756ea2fb6395                                         |
  | volumes_attached                    | delete_on_termination='False', id='b82238fb-fe80-41ac-b426-33de21eb6756' |
  +-------------------------------------+--------------------------------------------------------------------------+

  
  root@offline20241:~# virsh list --all
   Id   Name                State
  -----------------------------------
   16   instance-0000001e   running
  root@offline20241:~#
  ```

  
  ### On the source node, the VM was shown in virsh list as shut off. After
  rebooting the source compute node, the stale virsh entry was removed;
  however, the instance on the destination went to the shutoff state, both in
  server list and in virsh list.

  ```
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.670 2108454 DEBUG oslo_concurrency.lockutils [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] Acquiring lock 
"969ca417-9111-42de-836c-eb883e52f131" by 
"nova.compute.manager.ComputeManager._sync_power_states.<locals>._sync.<locals>.query_driver_power_state_and_sync"
 inner 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:402
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.671 2108454 DEBUG oslo_concurrency.lockutils [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] Lock 
"969ca417-9111-42de-836c-eb883e52f131" acquired by 
"nova.compute.manager.ComputeManager._sync_power_states.<locals>._sync.<locals>.query_driver_power_state_and_sync"
 :: waited 0.001s inner 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:407
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.714 2108454 INFO nova.compute.manager [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] [instance: 
969ca417-9111-42de-836c-eb883e52f131] During _sync_instance_power_state the DB 
power_state (0) does not match the vm_power_state from the hypervisor (1). 
Updating power_state in the DB to match the hypervisor.
  Jul 30 15:04:30 offline20241.ct.lan neutron-server[3870]: 2025-07-30 
15:04:30.795 3870 WARNING neutron.db.agents_db [None 
req-c0ba4060-e268-4940-9e06-6366966d1a48 - - - - - -] Agent healthcheck: found 
4 dead agents out of 9:
                Type       Last heartbeat host
          DHCP agent  2025-07-23 12:31:59 offline20241
            L3 agent  2025-07-23 12:32:16 offline20241
  Open vSwitch agent  2025-07-23 12:32:16 offline20241
      Metadata agent  2025-07-23 12:31:59 offline20241
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.817 2108454 WARNING nova.compute.manager [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] [instance: 
969ca417-9111-42de-836c-eb883e52f131] Instance is not stopped. Calling the stop 
API. Current vm_state: stopped, current task_state: None, original DB 
power_state: 0, current VM power_state: 1
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.818 2108454 DEBUG nova.compute.api [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] [instance: 
969ca417-9111-42de-836c-eb883e52f131] Going to try to stop instance force_stop 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/nova/compute/api.py:2768
  Jul 30 15:04:30 offline20241.ct.lan apache2[1957602]: 192.168.131.60 - - 
[30/Jul/2025:15:04:30 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 9818 "-" 
"openstacksdk/3.0.0 keystoneauth1/5.6.1 python-requests/2.31.0 CPython/3.10.12"
  Jul 30 15:04:30 offline20241.ct.lan haproxy[370326]: 192.168.131.200:55974 
[30/Jul/2025:15:04:30.373] keystone_service-front-2 
keystone_service-back/offline20241 0/0/0/550/550 201 9770 - - ---- 129/1/0/0/0 
0/0 "POST /v3/auth/tokens HTTP/1.1"
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.945 2108454 DEBUG oslo_concurrency.lockutils [None 
req-a919b840-363e-42b5-8ff3-e2ecd871c94b - - - - - -] Lock 
"969ca417-9111-42de-836c-eb883e52f131" "released" by 
"nova.compute.manager.ComputeManager._sync_power_states.<locals>._sync.<locals>.query_driver_power_state_and_sync"
 :: held 0.274s inner 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:421
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.975 2108454 DEBUG oslo_concurrency.lockutils [None 
req-f69d5ab0-a27a-4006-8473-b4a9f670753d - - - - - -] Acquiring lock 
"969ca417-9111-42de-836c-eb883e52f131" by 
"nova.compute.manager.ComputeManager.stop_instance.<locals>.do_stop_instance" 
inner 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:402
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.976 2108454 DEBUG oslo_concurrency.lockutils [None 
req-f69d5ab0-a27a-4006-8473-b4a9f670753d - - - - - -] Lock 
"969ca417-9111-42de-836c-eb883e52f131" acquired by 
"nova.compute.manager.ComputeManager.stop_instance.<locals>.do_stop_instance" 
:: waited 0.001s inner 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/oslo_concurrency/lockutils.py:407
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.977 2108454 DEBUG nova.compute.manager [None 
req-f69d5ab0-a27a-4006-8473-b4a9f670753d - - - - - -] [instance: 
969ca417-9111-42de-836c-eb883e52f131] Checking state _get_power_state 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/nova/compute/manager.py:1782
  Jul 30 15:04:30 offline20241.ct.lan nova-compute[2108454]: 2025-07-30 
15:04:30.983 2108454 DEBUG nova.compute.manager [None 
req-f69d5ab0-a27a-4006-8473-b4a9f670753d - - - - - -] [instance: 
969ca417-9111-42de-836c-eb883e52f131] Stopping instance; current vm_state: 
stopped, current task_state: powering-off, current DB power_state: 1, current 
VM power_state: 1 do_stop_instance 
/openstack/venvs/nova-29.2.3.dev1/lib/python3.10/site-packages/nova/compute/manager.py:3359

  ```
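
The destination log above shows the periodic power-state sync finding vm_state `stopped` while the hypervisor reports power_state 1, and then calling the stop API. The sketch below is a simplified, hypothetical model of that decision and is not Nova's code. The function name `sync_action` and its return values are made up for illustration, and the numeric constants other than 0 and 1 (which appear in the log) are my assumption about `nova.compute.power_state`.

```python
# Power states: 0 and 1 appear in the log; the others are assumed values.
NOSTATE, RUNNING, PAUSED, SHUTDOWN, CRASHED, SUSPENDED = 0, 1, 3, 4, 6, 7


def sync_action(vm_state, task_state, hypervisor_power_state):
    """Hypothetical model of what the periodic sync decides for one instance.

    Returns 'skip' when a task is pending, 'force_stop' when the database
    says the instance is stopped but the hypervisor says it is not, and
    'none' otherwise.
    """
    if task_state is not None:
        # A pending task means another operation owns the instance.
        return "skip"
    if vm_state == "stopped" and hypervisor_power_state not in (
        NOSTATE,
        SHUTDOWN,
        CRASHED,
    ):
        # Matches the "Instance is not stopped. Calling the stop API." line.
        return "force_stop"
    return "none"


# Values from the WARNING entry above: vm_state stopped, task_state None,
# current VM power_state 1.
print(sync_action("stopped", None, RUNNING))
```

Under this model, an instance whose database record says `stopped` is powered off again on the destination even though it is running in libvirt there, which is consistent with the shutoff state reported after the evacuation.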

To manage notifications about this bug go to:
https://bugs.launchpad.net/masakari/+bug/2119126/+subscriptions


