Reviewed:  https://review.opendev.org/c/openstack/neutron/+/853281
Committed: https://opendev.org/openstack/neutron/commit/5b4ed5b117f2f418d598af20744f571db581e2ae
Submitter: "Zuul (22348)"
Branch:    master

commit 5b4ed5b117f2f418d598af20744f571db581e2ae
Author: Bodo Petermann <[email protected]>
Date:   Tue Aug 16 14:14:14 2022 +0200

    Fix concurrent port binding activate
    
    Fix an issue with concurrent requests to activate a port binding.
    If two activate requests run in parallel, one may set the binding
    on the new host to active, so the other no longer finds the
    previously INACTIVE row in _commit_port_binding, and initializing
    the driver_context.PortContext crashes.
    
    Closes-Bug: #1986003
    Change-Id: I047e33062bc38f36848e0149c6e670cb5828c8e3


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1986003

Title:
  Exception in concurrent port binding activation

Status in neutron:
  Fix Released

Bug description:
  Occasionally VM live-migrations fail in post-migration because the
  request to activate the port binding on the new host fails with a
  500 Internal Server Error.
  It appears that nova-compute might try two requests in parallel. One
  of them succeeds, the other one returns the error.

  Neutron version: yoga, 20.1.0

  How to reproduce:

  - create a port for a compute instance, with a binding to host host1
  - create an additional port binding for host2, i.e. POST
    /v2.0/ports/{port_id}/bindings
  - that will create the new binding with status=INACTIVE
  - activate the port binding with 2 requests in parallel (2 times PUT
    /v2.0/ports/{port_id}/bindings/host2/activate)
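
  The last step can be scripted. The following is a minimal sketch, not
  taken from the bug report: the base URL, the token value and the helper
  names (activate_url, put_activate, activate_in_parallel) are
  assumptions for illustration. It only relies on the Python standard
  library and on the activate path quoted above.

```python
import concurrent.futures
import urllib.error
import urllib.request


def activate_url(base_url, port_id, host):
    """Build the activate URL from the reproduction steps."""
    return f"{base_url}/v2.0/ports/{port_id}/bindings/{host}/activate"


def put_activate(url, token):
    """Send one PUT and return the HTTP status code (assumes token auth)."""
    req = urllib.request.Request(
        url, method="PUT", headers={"X-Auth-Token": token}
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; keep only the code.
        return exc.code


def activate_in_parallel(send, url, attempts=2):
    """Call send(url) from `attempts` threads and return the sorted statuses."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=attempts) as pool:
        futures = [pool.submit(send, url) for _ in range(attempts)]
        return sorted(f.result() for f in futures)
```

  Against a real deployment, send would be something like
  lambda u: put_activate(u, token), with the endpoint and token taken
  from the local environment. Whether the race actually triggers depends
  on timing, so several runs may be needed.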

  Actual result:

  - one PUT request returns 200
  - other PUT request returns 500

  In the neutron-server log, the failed request logs an exception:
  "sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.NoneType'
  is not mapped."
  See https://paste.opendev.org/show/bFICeriQTlkmVwYQ5nzo/

  Expected result:

  - one PUT request returns 200
  - other PUT request returns 409 (port binding already active)
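
  The expected behaviour can be modelled with a toy example. This is a
  sketch under stated assumptions, not neutron's code and not the
  committed fix: BindingStore, BindingAlreadyActive, activate_status and
  race are hypothetical names, and an in-memory dict with a lock stands
  in for the database row. The point it illustrates is that the request
  that loses the race should see "no longer INACTIVE" as a conflict
  rather than continue with a missing row.

```python
import concurrent.futures
import threading

ACTIVE = "ACTIVE"
INACTIVE = "INACTIVE"


class BindingAlreadyActive(Exception):
    """Hypothetical error that an API layer would map to HTTP 409."""


class BindingStore:
    """Toy in-memory stand-in for the port binding table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}  # (port_id, host) -> status

    def create_inactive(self, port_id, host):
        with self._lock:
            self._status[(port_id, host)] = INACTIVE

    def activate(self, port_id, host):
        # Check and update under one lock so only one caller sees INACTIVE.
        with self._lock:
            if self._status.get((port_id, host)) != INACTIVE:
                # The INACTIVE row is gone: report a conflict instead of
                # carrying on with nothing to work on.
                raise BindingAlreadyActive(f"{port_id} on {host}")
            self._status[(port_id, host)] = ACTIVE


def activate_status(store, port_id, host):
    """Translate the outcome into the status codes from the bug report."""
    try:
        store.activate(port_id, host)
        return 200
    except BindingAlreadyActive:
        return 409


def race(store, port_id, host, attempts=2):
    """Run `attempts` activations concurrently; return sorted status codes."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=attempts) as pool:
        futures = [
            pool.submit(activate_status, store, port_id, host)
            for _ in range(attempts)
        ]
        return sorted(f.result() for f in futures)
```

  In this model exactly one caller gets 200 and every later or
  concurrent caller gets 409, which matches the expected result listed
  above.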

  Background:

  Nova live-migrations may trigger such concurrent activate requests.
  In preparation for the live-migration, nova creates a new port
  binding for the destination host. When the migration completes, it
  activates that binding. At least in our setup, that activation may be
  triggered from two places: (a) when the lifecycle event about the
  completed migration is handled and (b) when the migration job monitor
  actively detects that the migration completed. If the second request
  fails, the post-live-migration breaks and the whole migration goes
  into error state and may not finish all its work.

  Related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2097160

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1986003/+subscriptions


