Reviewed: https://review.openstack.org/559447 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6647f11dc1aba89f9b0e2703f236a43f31d88079 Submitter: Zuul Branch: master
commit 6647f11dc1aba89f9b0e2703f236a43f31d88079 Author: Matt Riedemann <[email protected]> Date: Fri Apr 6 20:28:53 2018 -0400 Don't persist RequestSpec.retry During a resize, the RequestSpec.flavor is updated to the new flavor. If the resize failed on one host and was rescheduled, the RequestSpec.retry is updated for that failed host and mistakenly persisted, which can affect later move operations, like if an admin targets one of those previously failed hosts for a live migration or evacuate operation. This change fixes the problem by not ever persisting the RequestSpec.retry field to the database, since retries are per-request/operation and not something that needs to be persisted. Alternative to this, we could reset the retry field in the RequestSpec.reset_forced_destinations method but that would be slightly overloading the meaning of that method, and the approach taken in this patch is arguably cleaner since retries shouldn't ever be persisted. It should be noted, however, that one advantage to resetting the 'retry' field in the RequestSpec.reset_forced_destinations method would be to avoid this issue for any existing DB entries that have this problem. The related functional regression test is updated to show the bug is now fixed. Change-Id: Iadbf8ec935565a6d4ccf6f36ef630ab6bf1bea5d Closes-Bug: #1718512 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1718512 Title: migration fails if instance build failed on destination host Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) ocata series: Confirmed Status in OpenStack Compute (nova) pike series: In Progress Status in OpenStack Compute (nova) queens series: In Progress Bug description: (OpenStack Nova, commit d8b30c3772, per OSA-14.2.7) if an instance build fails on a hypervisor the "retry" field of the instance's request spec is populated with which host and how many times it attempted to retry the build. this field remains populated during the life-time of the instance. if a live-migration for the same instance is requested, the conductor loads this request spec and passes it on to the scheduler. the scheduler will fail the migration request on RetryFilter since the target was already known to have failed (albeit, for the build). with the help of mriedem and melwitt of #openstack-nova, we determined that migration retries are handled separately from build retries. mriedem suggested a patch to ignore the retry field of the instance request spec during migrations. this patch allowed the failing migration to succeed. it is important to note that it may fail the migration again, however there is still sufficient reason to ignore the build's failures/retries during a migration. 12:55 < mriedem> it does stand to reason that if this instance failed to build originally on those 2 hosts, that live migrating it there might fail too...but we don't know why it originally failed, could have been a resource claim issue at the time 12:58 < melwitt> yeah, often it's a failed claim. and also what if that compute host is eventually replaced over the lifetime of the cluster, making it a fresh candidate for several instances that might still avoid it because they once failed to build there back when it was a different machine To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1718512/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : [email protected] Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp

