Public bug reported:

All block live migrations fail when I ask nova to work out the live
migration type itself by specifying {'block_migration': 'auto'} in the
request body. This happens because the block_migration and
migrate_data.block_migration flags end up with different values.
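
For reference, a request that takes this code path might be built as in the
sketch below. It assumes the os-migrateLive server action at compute API
microversion 2.25 or later, where block_migration may be "auto". The helper
name build_live_migrate_body is hypothetical, and host=None is assumed to
leave the choice of destination to the scheduler.

```python
import json


def build_live_migrate_body(host=None, block_migration="auto"):
    """Build an os-migrateLive action body (hypothetical helper).

    Assumes compute API microversion 2.25 or later, where
    block_migration accepts "auto" in addition to a boolean.
    """
    return {
        "os-migrateLive": {
            "host": host,
            "block_migration": block_migration,
        }
    }


body = build_live_migrate_body()
print(json.dumps(body, sort_keys=True))
```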

In the conductor live migrate task we call the checks on destination and
source, which build up migrate_data in the driver and send it back to the
conductor:

https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L156

Here we calculate block migration, which is fine:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5554

Then control returns to the conductor and we call the compute manager,
sending both flags - block_migration and migrate_data.block_migration -
but we never change the value of block_migration to match
migrate_data.block_migration:

https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L68
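
The divergence can be modelled with a small standalone sketch. This is not
nova code: MigrateData, check_destination and conductor_execute are
hypothetical stand-ins, and the assumption that an 'auto' request reaches
the conductor as block_migration=None is taken from the values reported in
this bug.

```python
class MigrateData(object):
    """Stand-in for the migrate_data object filled in by the driver checks."""

    def __init__(self):
        self.block_migration = None


def check_destination(migrate_data, shared_storage):
    # Stand-in for the driver check: without shared storage, the disks
    # have to be copied, so block migration is needed.
    migrate_data.block_migration = not shared_storage
    return migrate_data


def conductor_execute(block_migration, shared_storage):
    # The checks compute migrate_data.block_migration ...
    migrate_data = check_destination(MigrateData(), shared_storage)
    # ... but the original argument is passed on untouched, so the two
    # flags can disagree by the time they reach the compute manager.
    return block_migration, migrate_data


flag, data = conductor_execute(block_migration=None, shared_storage=False)
print(flag, data.block_migration)
```

Any code further down that tests the plain block_migration argument sees a
falsy value here, even though the checks decided that a block migration is
required.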

Down in the compute manager (and in the drivers) we use both flags, which
now have different values (here block_migration=None,
migrate_data.block_migration=True). For example, at this point
block_migration=None:

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5196

As a result, every block live migration fails with:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1160, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6095, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6063, in _live_migration_operation
    CONF.libvirt.live_migration_bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1825, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: Cannot access storage file '/opt/stack/data/nova/instances/572ad149-b7c5-4b77-85b5-34c1d2d37fcf/disk' (as uid:110, gid:116): No such file or directory

A quick workaround is to make sure at the compute manager level that
block_migration == migrate_data.block_migration, but really we should
clean up this mess and send only one flag, because passing two is
error-prone and hard to maintain.
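
A minimal sketch of what such a workaround could look like follows. It is
an illustration under stated assumptions, not the actual fix: the helper
name reconcile_block_migration is hypothetical, and it assumes that a value
computed by the checks should win over the value from the request.

```python
from types import SimpleNamespace


def reconcile_block_migration(block_migration, migrate_data):
    """Return the single value both flags should carry (hypothetical helper)."""
    # Prefer what the destination/source checks computed, if anything.
    computed = getattr(migrate_data, "block_migration", None)
    if computed is not None:
        return computed
    # Otherwise fall back to the request value; None is treated as False.
    return bool(block_migration)


# Values from this report: block_migration=None, migrate_data says True.
migrate_data = SimpleNamespace(block_migration=True)
block_migration = reconcile_block_migration(None, migrate_data)
print(block_migration)
```

Calling something like this once, before either flag is consulted, would
keep the compute manager and the driver in agreement until the duplicate
flag is removed altogether.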

** Affects: nova
     Importance: Critical
     Assignee: Pawel Koniszewski (pawel-koniszewski)
         Status: In Progress


** Tags: live-migration

** Changed in: nova
   Importance: Undecided => Critical

** Changed in: nova
       Status: New => In Progress

** Changed in: nova
     Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1552303

Title:
  Block live migrations are broken when nova calculates live migration
  type by itself

Status in OpenStack Compute (nova):
  In Progress
