Public bug reported:

After rebooting all nodes in the cluster, all the instances that were running 
on the cluster are stuck in Status ACTIVE, Task state: powering-off, Power 
state: Crashed.
From the log it looks like the messages sent from the init_host method during 
nova-compute service start vanish, because the RPC server is started only 
afterwards.

The manager.init_host method sees an instance with vm_state == 
vm_states.ACTIVE and vm_power_state in (power_state.SHUTDOWN, 
power_state.CRASHED). I get the log message "Instance shutdown by itself. 
Calling the stop API. Current vm_state: active, current task_state: None, 
original DB power_state: 1, current VM power_state: 6".
It then calls the api.stop method, which invokes the api.force_stop method, and 
I see the following log message: "Going to try to stop instance force_stop". 
This method invokes a stop_instance method through RPC. But the RPC message 
never reaches the RPC server, which is started only after init_host is called 
in the service.start method.
Since I am using RabbitMQ, the message queues have not been initialized yet 
after the cluster reboot, so the call never reaches its destination.
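
The ordering problem described above can be illustrated with a small
self-contained model. This is only a sketch under stated assumptions:
ToyBroker, ToyComputeService and all of their methods are hypothetical names
made up for illustration, not nova or oslo.messaging code, and the broker is
assumed to drop any message published to a queue that has not been declared
yet. The rpc_first flag exists only to contrast the two orderings.

```python
class ToyBroker:
    """Hypothetical broker: messages to an undeclared queue are dropped."""

    def __init__(self):
        self.queues = {}
        self.dropped = []

    def declare_queue(self, name):
        self.queues.setdefault(name, [])

    def publish(self, name, message):
        if name in self.queues:
            self.queues[name].append(message)
        else:
            # Assumption: nothing is declared for this queue name yet,
            # so the message is silently lost.
            self.dropped.append(message)


class ToyComputeService:
    """Hypothetical stand-in for the start-up sequence in the report."""

    def __init__(self, broker, host, rpc_first=False):
        self.broker = broker
        self.queue = "compute." + host
        self.rpc_first = rpc_first
        self.task_state = None

    def init_host(self):
        # Mirrors the report: the instance is found crashed, the task state
        # is set, and a stop_instance message is sent over RPC.
        self.task_state = "powering-off"
        self.broker.publish(self.queue, "stop_instance")

    def start_rpc_server(self):
        # The queue only comes into existence once the server starts.
        self.broker.declare_queue(self.queue)

    def start(self):
        if self.rpc_first:
            self.start_rpc_server()
            self.init_host()
        else:
            # Ordering described in the report: init_host runs first.
            self.init_host()
            self.start_rpc_server()


def run(rpc_first):
    broker = ToyBroker()
    service = ToyComputeService(broker, "node1", rpc_first=rpc_first)
    service.start()
    return broker.queues[service.queue], broker.dropped, service.task_state
```

In this model, run(rpc_first=False) leaves the queue empty with the
stop_instance message dropped and the task state still set to powering-off,
while run(rpc_first=True) delivers the message to the queue.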

Afterwards, _sync_instance_power_state sees the powering-off task state
and never cleans up the instance state. I get the log message:
"During sync_power_state the instance has a pending task (powering-off).
Skip."
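
Why the instance never recovers can be shown with a minimal sketch of the
skip logic. The function below is a hypothetical simplification written for
illustration, not nova's _sync_instance_power_state; the dictionary layout,
the string values, and the reconciliation branch are all assumptions.

```python
def sync_power_state(instance):
    """Hypothetical simplification of the skip described in the report."""
    if instance["task_state"] is not None:
        # A pending task is assumed to be handled elsewhere, so the
        # periodic sync leaves the instance untouched.
        return "skipped"
    if instance["vm_state"] == "active" and instance["power_state"] == "crashed":
        # Assumption for illustration: without a pending task, the sync
        # would reconcile the recorded state with the real one.
        instance["vm_state"] = "stopped"
        return "stopped"
    return "unchanged"


# State left behind after the stop_instance message was lost.
stuck = {
    "vm_state": "active",
    "task_state": "powering-off",
    "power_state": "crashed",
}

# Every periodic run takes the same early exit, so nothing ever changes.
results = [sync_power_state(stuck) for _ in range(3)]
```

Since nothing clears task_state once the RPC message is lost, each run of the
sync returns early and the instance stays in powering-off.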

Nova version is 12.0.0.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1593186

Title:
  Nova instance stuck in powering-off when rebooting all nodes in
  cluster

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1593186/+subscriptions
