I used my lab to understand the original report and what is actually going on. In short, Nova is unable to live-migrate an instance because Neutron fails to create a port binding on the destination host when Nova Conductor requests it: the binding already exists there.
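One way to confirm this condition is to list the port's existing bindings and look for one on the destination host. The sketch below is illustrative only. It assumes the binding list has the shape I would expect from GET /v2.0/ports/{port_id}/bindings (a "bindings" list whose items carry "host" and "status"). The sample data is made up, including the host compute-1.redhat.local, and find_conflicting_bindings is a hypothetical helper, not part of Nova or Neutron.

```python
def find_conflicting_bindings(bindings, dest_host):
    """Return the existing bindings whose host equals dest_host.

    `bindings` is assumed to be the list found under the "bindings" key
    of a port bindings listing; this shape is an assumption.
    """
    return [b for b in bindings if b.get("host") == dest_host]


# Made-up sample: the port is active on the source host and already has
# a leftover binding on the destination host.
sample = {
    "bindings": [
        {"host": "compute-1.redhat.local", "status": "ACTIVE", "vnic_type": "normal"},
        {"host": "compute-0.redhat.local", "status": "INACTIVE", "vnic_type": "normal"},
    ]
}

conflicts = find_conflicting_bindings(sample["bindings"], "compute-0.redhat.local")
for binding in conflicts:
    print("existing binding on destination:", binding["host"], binding["status"])
```

If such a binding is present, a new create request for the same host is the request that Neutron answers with a conflict.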
There could be multiple causes of this behavior, and this bug is not about solving the root cause. Instead, the problem is that it is hard to isolate the issue using Neutron Server logs. When debug is enabled, Neutron generates the logs in [1] for this kind of request. It looks like, for some reason, the following exception raised by Neutron is not logged properly on the Neutron side: we can see it in Nova logs, but not in Neutron Server logs.

https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/plugin.py#L2463-L2466

As a result, it looks like this problem should be solved by improving logging for Neutron extensions in general, or for this particular function. With that being said, I am changing the affected product to Neutron and unassigning this bug.

[1]
2021-10-18 14:15:58.081 18 DEBUG neutron.api.v2.base [req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 b8715d57125f4787a6701319d38f61e3 - default default] Request body: {'binding': {'host': 'compute-0.redhat.local', 'vnic_type': 'normal', 'profile': {}}} prepare_request_body /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:719
2021-10-18 14:15:58.081 18 DEBUG neutron.api.v2.base [req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 b8715d57125f4787a6701319d38f61e3 - default default] Unknown quota resources ['binding']. _create /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:490
2021-10-18 14:15:58.135 18 INFO neutron.api.v2.resource [req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 b8715d57125f4787a6701319d38f61e3 - default default] create failed (client error): There was a conflict when trying to complete your request.
2021-10-18 14:15:58.137 18 INFO neutron.wsgi [req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 b8715d57125f4787a6701319d38f61e3 - default default] 172.17.1.17 "POST /v2.0/ports/7542a977-0586-423a-ae35-86e3ff791060/bindings HTTP/1.1" status: 409 len: 364 time: 0.3800023
2021-10-18 14:15:58.316 21 DEBUG neutron_lib.callbacks.manager [req-334ebddd-4e81-4c12-829c-64f3b0a278ff - - - - -] Notify callbacks ['neutron.services.segments.db._update_segment_host_mapping_for_agent-8793714910767', 'neutron.plugins.ml2.plugin.Ml2Plugin._retry_binding_revived_agents-16758855'] for agent, after_update _notify_loop /usr/lib/python3.6/site-packages/neutron_lib/callbacks/manager.py:193

** Changed in: nova
     Assignee: Alexey Stupnikov (astupnikov) => (unassigned)

** Project changed: nova => neutron

** Changed in: neutron
       Status: Triaged => New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1849802

Title:
  Live-migration fails without a logged root cause

Status in neutron:
  New

Bug description:
  Operating system distribution and version: Ubuntu 18.04
  Neutron package version: 2:14.0.2-0ubuntu1~cloud0
  Nova package version: 2:19.0.1-0ubuntu2.1~cloud0

  Cloud was deploying (using Juju charms) as rocky, then upgraded to stein.

  There are a number of instances that I need to migrate from one compute node to another, but:

  $ openstack server migrate --block-migration 8703d9db-81b0-4e86-a2ef-c4ba5250556c --live shinx --disk-overcommit
  Migration pre-check error: Binding failed for port 5a3c5d23-8727-47d2-af72-a53b495358d2, please check neutron logs for more information.
  (HTTP 400) (Request-ID: req-7c41ae70-6f5b-48a8-9d09-add2bbbe2b7e)
  $

  However, even with debug logging enabled, all that shows up in the neutron-api logs is:

  2019-10-25 09:34:12.147 1569534 INFO neutron.wsgi [req-ac358ed5-cfec-4618-b765-f2defd5a3aac 92e98c5c687a46d29ec28aca3025f3da 7555fff7e7eb4a0eb28149905b266a2b - 207337407e3647798c0f68a0a28a0f8b 207337407e3647798c0f68a0a28a0f8b] 10.x.y.z,127.0.0.1 "POST /v2.0/ports/5a3c5d23-8727-47d2-af72-a53b495358d2/bindings HTTP/1.1" status: 409 len: 371 time: 0.1632745

  Which suggests that for some reason, the API call to retrieve port bindings is failing, but there's no further information in the logs for me to debug exactly why.
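Until the exception itself is logged, the most that can be recovered from the Neutron Server log is which binding requests were rejected, plus whatever other lines share their request ID. The following is a rough sketch of that triage step, not a fix. It assumes the neutron.wsgi line layout seen in the excerpts above. The helper names failed_binding_requests and lines_for_request are hypothetical, and the sample lines are copied from the lab log in [1].

```python
import re

# Matches neutron.wsgi access lines for POST .../ports/<id>/bindings and
# captures the request ID, the port ID, and the HTTP status code.
WSGI_BINDING = re.compile(
    r'\[(?P<req_id>req-[0-9a-f-]+)[^\]]*\].*?'
    r'"POST /v2\.0/ports/(?P<port_id>[0-9a-f-]+)/bindings HTTP/1\.1" '
    r'status: (?P<status>\d{3})'
)


def failed_binding_requests(lines):
    """Yield (request_id, port_id) for binding creates answered with 409."""
    for line in lines:
        match = WSGI_BINDING.search(line)
        if match and match.group("status") == "409":
            yield match.group("req_id"), match.group("port_id")


def lines_for_request(lines, req_id):
    """Return every log line that mentions the given request ID."""
    return [line for line in lines if req_id in line]


sample_log = [
    "2021-10-18 14:15:58.135 18 INFO neutron.api.v2.resource "
    "[req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 "
    "b8715d57125f4787a6701319d38f61e3 - default default] create failed (client error): "
    "There was a conflict when trying to complete your request.",
    "2021-10-18 14:15:58.137 18 INFO neutron.wsgi "
    "[req-c237e714-0085-47c6-abab-e7e15cec7ea1 d79d0393fbb04564bc8fd3c62d290087 "
    "b8715d57125f4787a6701319d38f61e3 - default default] 172.17.1.17 "
    '"POST /v2.0/ports/7542a977-0586-423a-ae35-86e3ff791060/bindings HTTP/1.1" '
    "status: 409 len: 364 time: 0.3800023",
]

failures = list(failed_binding_requests(sample_log))
for req_id, port_id in failures:
    related = lines_for_request(sample_log, req_id)
    print(port_id, req_id, len(related), "related lines")
```

With the current logging this only narrows the search to one request and one port. It cannot show why the conflict was raised, which is the gap this bug is about.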

