Hi, Jay,

I confirmed this bug happens in my devstack environment using the LVM/iSCSI driver.
From my investigation of this issue, it looks like a bug in Libvirt.

When Nova calls detach_volume() at step (3), Libvirt tries to remove the block
device information from the instance's XML file, and then Nova calls
disconnect_volume() to log out of the iSCSI session.

In disconnect_volume(), Nova collects the block device lists from every VM's
XML file via _get_all_block_devices(), and if any block device is still
related to the iSCSI session, Nova does not log out of that session.
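
The skip check described above can be sketched roughly as follows (a
simplified, self-contained model with hypothetical names; the real check
lives in Nova's libvirt volume driver):

```python
def should_logout_iscsi_session(session_devices, all_block_devices):
    """Return True only when no domain XML still references a device
    belonging to this iSCSI session.

    session_devices:   device paths backed by the session being torn down
    all_block_devices: paths collected from every VM's XML, as by
                       _get_all_block_devices()
    """
    return not any(dev in all_block_devices for dev in session_devices)


session_devs = ["/dev/disk/by-path/iscsi-example-lun-0"]

# Normal case: libvirt already removed the <disk> entry from the XML,
# so nothing references the session and the logout proceeds.
print(should_logout_iscsi_session(session_devs, []))            # True

# Buggy case after reboot: the stale entry is still in the XML, so the
# logout is skipped and the iSCSI session leaks.
print(should_logout_iscsi_session(session_devs, session_devs))  # False
```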

Normally, Libvirt removes the block device info from the instance's XML
quickly, but in this case I found that the info still remained when
disconnect_volume() ran. I suspect that rebooting the VM via Libvirt
interferes with the removal of the block device info, so the iSCSI session
logout was skipped. As a result, the iSCSI session remained.

After that, if the user tries to attach a volume again, Cinder creates a new
CHAP user and password and passes them to Nova. Nova tries to use the new
credentials, but because the old iSCSI session still remains, the connection
fails and you can see many iscsid error logs in /var/log/messages.
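
The credential mismatch can be illustrated with a toy model (this is not
Nova or iscsid code, just a hypothetical simulation of the failure mode):
the initiator keeps a stale session opened with the old CHAP secret, and
once Cinder rotates the credentials for the new attach, the stale session's
login retries are rejected by the target.

```python
class ToyTarget:
    """Toy stand-in for an iSCSI target with CHAP authentication."""

    def __init__(self, iqn, chap_user, chap_secret):
        self.iqn = iqn
        self.chap = (chap_user, chap_secret)

    def rotate_chap(self, user, secret):
        # Cinder generates fresh CHAP credentials for the new attach.
        self.chap = (user, secret)

    def login(self, user, secret):
        # Login succeeds only with the target's current credentials.
        return (user, secret) == self.chap


target = ToyTarget("iqn.2010-10.org.openstack:volume-x", "u1", "s1")
stale_session = ("u1", "s1")       # session left over after the skipped logout
target.rotate_chap("u2", "s2")     # re-attach: Cinder rotates the credentials

# iscsid keeps retrying the *stale* session's old credentials, matching
# "conn 0 login rejected: initiator failed authorization with target".
print(target.login(*stale_session))   # False: old secret no longer authorizes
print(target.login("u2", "s2"))       # True: the new credentials would work
```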

I think this affects not only LVM but also other iSCSI backend
storages.

** Also affects: nova
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1440201

Title:
  reboot of a nova instance appears to leave attached volumes in a bad
  state

Status in Cinder:
  New
Status in OpenStack Compute (Nova):
  New

Bug description:
  Steps to recreate:

  1) attach a volume to a VM
  2) reboot the VM
  3) detach the volume from the VM; right after the volume status changes from
  "In-use" to "available", attach the volume again. The error above occurs and
  the volume can't be attached to the VM any more. (If, in step 3, you instead
  detach the volume, wait 3 minutes or so, and then attach it again, the
  attach succeeds.)

  This is being seen in Kilo using LVM/iSCSI.

  Looking at the logs it appears that something happens during the
  reboot that leaves the iSCSI volume in a bad state.  Right after the
  reboot I see:

  2015-04-03 14:42:33.319 19415 INFO nova.compute.manager 
[req-4995e5ea-c845-4b5b-a6d4-8a214e0ea87f 157e5fddcca340a2a9ec6cd5a1216b4f 
df004f3072784994870e46ee997ea70f - - -] [instance: 
0235ba32-323d-4f59-91ea-2f2f7ef2bd04] Detach volume 
3cc769bb-d585-4812-a8fd-2888fedda58d from mountpoint /dev/vdb
  2015-04-03 14:42:56.268 19415 INFO nova.compute.manager 
[req-0037930a-9d20-4b8c-a150-74b0b4411530 157e5fddcca340a2a9ec6cd5a1216b4f 
df004f3072784994870e46ee997ea70f - - -] [instance: 
0235ba32-323d-4f59-91ea-2f2f7ef2bd04] Attaching volume 
3cc769bb-d585-4812-a8fd-2888fedda58d to /dev/vdb
  2015-04-03 14:43:01.647 19415 ERROR nova.virt.libvirt.driver 
[req-0037930a-9d20-4b8c-a150-74b0b4411530 157e5fddcca340a2a9ec6cd5a1216b4f 
df004f3072784994870e46ee997ea70f - - -] [instance: 
0235ba32-323d-4f59-91ea-2f2f7ef2bd04] Failed to attach volume at mountpoint: 
/dev/vdb

  After this I start seeing the following in the logs:

  2015-04-03 14:43:03.157 19415 INFO nova.scheduler.client.report 
[req-0037930a-9d20-4b8c-a150-74b0b4411530 157e5fddcca340a2a9ec6cd5a1216b4f 
df004f3072784994870e46ee997ea70f - - -] Compute_service record updated for 
('devo-n02-kvm.rch.kstart.ibm.com', 'devo-n02-kvm.rch.kstart.ibm.com')
  2015-04-03 14:43:03.638 19415 ERROR oslo_messaging.rpc.dispatcher 
[req-0037930a-9d20-4b8c-a150-74b0b4411530 157e5fddcca340a2a9ec6cd5a1216b4f 
df004f3072784994870e46ee997ea70f - - -] Exception during message handling: The 
supplied device (vdb) is busy.
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, 
in _dispatch_and_reply
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
executor_callback))
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, 
in _dispatch
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
executor_callback)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, 
in _do_dispatch
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     result 
= func(ctxt, **new_args)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 430, in 
decorated_function
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
function(self, context, *args, **kwargs)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     payload)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
f(self, context, *args, **kw)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 314, in 
decorated_function
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
LOG.warning(msg, e, instance_uuid=instance_uuid)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 290, in 
decorated_function
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
function(self, context, *args, **kwargs)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 342, in 
decorated_function
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
kwargs['instance'], e, sys.exc_info())
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 330, in 
decorated_function
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
function(self, context, *args, **kwargs)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4719, in 
attach_volume
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
do_attach_volume(context, instance, driver_bdm)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 431, in 
inner
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
f(*args, **kwargs)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4714, in 
do_attach_volume
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     return 
self._attach_volume(context, instance, driver_bdm)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4737, in 
_attach_volume
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
self.volume_api.unreserve_volume(context, bdm.volume_id)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4729, in 
_attach_volume
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
do_check_attach=False, do_driver_attach=True)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 48, in 
wrapped
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     ret_val 
= method(obj, context, *args, **kwargs)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 258, in 
attach
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
connector)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
six.reraise(self.type_, self.value, self.tb)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 249, in 
attach
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     
device_type=self['device_type'], encryption=encryption)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1062, in 
attach_volume
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher     raise 
exception.DeviceIsBusy(device=disk_dev)
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher 
DeviceIsBusy: The supplied device (vdb) is busy.
  2015-04-03 14:43:03.638 19415 TRACE oslo_messaging.rpc.dispatcher

  
  After this, the resource updates start failing:

  2015-04-03 14:45:02.507 19415 ERROR nova.openstack.common.periodic_task 
[req-fe14e38c-8f29-46b0-884b-3b24f7ff0397 - - - - -] Error during 
ComputeManager.update_available_resource: Unexpected error while running 
command.
  Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 
/dev/disk/by-path/ip-9.114.170.202:3260-iscsi-iqn.2010-10.org.openstack:volume-3cc769bb-d585-4812-a8fd-2888fedda58d-lun-0
  Exit code: 1
  Stdout: u''
  Stderr: u'blockdev: cannot open 
/dev/disk/by-path/ip-9.114.170.202:3260-iscsi-iqn.2010-10.org.openstack:volume-3cc769bb-d585-4812-a8fd-2888fedda58d-lun-0:
 No such device or address\n'
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task 
Traceback (most recent call last):
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/openstack/common/periodic_task.py", 
line 224, in run_periodic_tasks
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
task(self, context)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6166, in 
update_available_resource
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
rt.update_available_resource(context)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 
317, in update_available_resource
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
resources = self.driver.get_available_resource(self.nodename)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4872, 
in get_available_resource
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
disk_over_committed = self._get_disk_over_committed_size_total()
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6054, 
in _get_disk_over_committed_size_total
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
self._get_instance_disk_info(dom.name(), xml))
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6007, 
in _get_instance_disk_info
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
dk_size = lvm.get_volume_size(path)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/lvm.py", line 172, in 
get_volume_size
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
run_as_root=True)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 55, in 
execute
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
return utils.execute(*args, **kwargs)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/nova/utils.py", line 206, in execute
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
return processutils.execute(*cmd, **kwargs)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task   
File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 
224, in execute
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task     
cmd=sanitized_cmd)
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task 
ProcessExecutionError: Unexpected error while running command.
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task 
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 
/dev/disk/by-path/ip-9.114.170.202:3260-iscsi-iqn.2010-10.org.openstack:volume-3cc769bb-d585-4812-a8fd-2888fedda58d-lun-0
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task Exit 
code: 1
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task 
Stdout: u''
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task 
Stderr: u'blockdev: cannot open 
/dev/disk/by-path/ip-9.114.170.202:3260-iscsi-iqn.2010-10.org.openstack:volume-3cc769bb-d585-4812-a8fd-2888fedda58d-lun-0:
 No such device or address\n'
  2015-04-03 14:45:02.507 19415 TRACE nova.openstack.common.periodic_task

  The file under /dev/disk/by-path/ exists, but you get the 'No such
  device or address' error when attempting to access it.

  In /var/log/messages I see this:

  Apr  3 14:43:00 devo-n02-kvm kernel: [731664.038599]  connection8:0: detected 
conn error (1020)
  Apr  3 14:43:01 devo-n02-kvm iscsid: connection18:0 is operational after 
recovery (7 attempts)
  Apr  3 14:43:01 devo-n02-kvm iscsid: conn 0 login rejected: initiator failed 
authorization with target
  Apr  3 14:43:01 devo-n02-kvm libvirtd[61554]: operation failed: target vdb 
already exists
  Apr  3 14:43:02 devo-n02-kvm kernel: [731665.358832]  connection18:0: 
detected conn error (1020)
  Apr  3 14:43:03 devo-n02-kvm iscsid: Kernel reported iSCSI connection 18:0 
error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
  Apr  3 14:43:04 devo-n02-kvm iscsid: conn 0 login rejected: initiator failed 
authorization with target
  Apr  3 14:43:04 devo-n02-kvm kernel: [731667.948507]  connection18:0: 
detected conn error (1020)

  I am wondering if this might be related to this bug:
  https://bugs.dogfood.paddev.net/nova/+bug/1367189  Are the chap
  credentials lost with the reboot or something like that?

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1440201/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp