Reviewed:  https://review.opendev.org/705764
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6458c3dba53b9a9fb903bdb6e5e08af14ad015d6
Submitter: Zuul
Branch:    master

commit 6458c3dba53b9a9fb903bdb6e5e08af14ad015d6
Author: Sasha Andonov <[email protected]>
Date:   Tue Feb 4 16:59:14 2020 +0100

    rbd_utils: increase _destroy_volume timeout

    If RBD backend is used for Nova ephemeral storage, Nova tries to remove
    ephemeral storage volume from Ceph in a retry loop: 10 attempts at
    1 second intervals, totaling 10 seconds overall - which, due to a
    thirty second ceph watcher timeout, might result in intermittent volume
    removal failures on Ceph side.

    This patch adds params rbd_destroy_volume_retries, defaulting to 12,
    and rbd_destroy_volume_retry_interval, defaulting to 5, which
    multiplied, give Ceph reasonable amount of time to complete the
    operation successfully.

    Closes-Bug: #1856845
    Change-Id: Icfd55617f0126f79d9610f8a2fc6b4c817d1a2bd

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1856845

Title:
  Ephemeral storage removal fails with message rbd remove failed

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
  After destroying instances, ephemeral storage removal intermittently
  fails with message:

  2019-10-17 11:21:08.122 398018 INFO nova.virt.libvirt.driver [-] [instance: 87096add-348e-4c94-8f31-066346e32eef] Instance destroyed successfully.
  2019-10-17 11:21:14.619 398018 WARNING nova.virt.libvirt.storage.rbd_utils [-] rbd remove 87096add-348e-4c94-8f31-066346e32eef_disk in pool rbd_pool failed

  Ceph logs report lossy connection error:

  2019-10-17 11:21:06.181233 7fbbdf2f4700 0 -- 10.248.83.92:6808/20526 submit_message osd_op_reply(192922 rbd_data.77c63845d27cdd.0000000000004728 [stat,set-alloc-hint object_size 4194304 write_size 4194304,write 1273856~262144] v1504399'62984460 uv62984460 ack = 0) v7 remote, 10.248.54.216:0/2391175308, failed lossy con, dropping message 0x56545f021e40

  Steps to reproduce
  ==================
  - Deploy Nova with Ceph ephemeral storage RBD
  - Create an instance
  - Destroy an instance

  Expected result
  ===============
  Nova instance destroyed, ceph ephemeral storage always removed from pool

  Actual result
  =============
  Nova instance destroyed, ceph ephemeral storage sometimes remains in pool
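
For readers who want to see why the old retry window was too short, here is a minimal sketch of the retry arithmetic described in the commit message. It is not the actual nova code: destroy_with_retries, simulate and VolumeBusy are hypothetical names made up for illustration, and the fake clock simply assumes that removal keeps failing until the thirty second watcher timeout has passed. Only the numbers (10 x 1 second before the patch, 12 x 5 seconds after) come from the commit message.

```python
import time


class VolumeBusy(Exception):
    """Hypothetical stand-in for the error seen while a watcher is still attached."""


def destroy_with_retries(remove, retries=12, interval=5, sleep=time.sleep):
    """Call remove() up to `retries` times, sleeping `interval` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, retries + 1):
        try:
            remove()
            return attempt
        except VolumeBusy:
            # Do not sleep after the final attempt; there is nothing left to wait for.
            if attempt < retries:
                sleep(interval)
    return None


def simulate(retries, interval, watcher_timeout=30):
    """Run the loop against a fake clock (assumption: removal fails until the
    watcher timeout has elapsed, then succeeds)."""
    clock = {"now": 0}

    def fake_sleep(seconds):
        clock["now"] += seconds

    def fake_remove():
        if clock["now"] < watcher_timeout:
            raise VolumeBusy()

    return destroy_with_retries(fake_remove, retries, interval, sleep=fake_sleep)


if __name__ == "__main__":
    print("old values (10 x 1s):", simulate(retries=10, interval=1))
    print("new defaults (12 x 5s):", simulate(retries=12, interval=5))
```

Under that assumption the old values give up before the watcher expires, while the new defaults leave enough time for a later attempt to succeed. Real behaviour depends on the Ceph cluster, so treat the simulation as an illustration of the timing only.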

