Reviewed:  https://review.opendev.org/c/openstack/neutron/+/827834
Committed: https://opendev.org/openstack/neutron/commit/be7331c8169c53e3900c9c1a08e12808cf5ed2ec
Submitter: "Zuul (22348)"
Branch:    master

commit be7331c8169c53e3900c9c1a08e12808cf5ed2ec
Author: Daniel Alvarez Sanchez <[email protected]>
Date:   Fri Feb 4 11:32:47 2022 +0100

    [ovn] Prevent stale ports in the OVN database

    Under a lot of load, there can be situations where not all Neutron
    workers have updated their in-memory copy of the NB database in time
    before certain operations. This scenario can lead to stale resources
    when a recently created port is deleted, but the worker handling the
    deletion doesn't know about the OVN port yet.

    This patch detects this condition and allows some time (at least one
    maintenance task cycle) before it deletes the OVN revision number. If
    the port shows up in the OVN database within that window, it will be
    deleted later by the maintenance task, avoiding the stale port. If
    not, the revision number row will be deleted, so no stale entry is
    left behind either.

    Closes-Bug: #1960006
    Signed-off-by: Daniel Alvarez Sanchez <[email protected]>
    Change-Id: Ie4093dc6cd63b89e3a62363a4f805ef8287d15b9

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1960006

Title:
  [ovn] Stale ports in the OVN database at churn

Status in neutron:
  Fix Released

Bug description:
  There are situations where, under a lot of control plane activity, OVN
  ports become stale and are never cleaned up (unless the
  neutron-ovn-db-sync tool is run manually).

  A possible scenario for this is:

  a) Port creation
  a.1) Port created in the Neutron DB
  a.2) Port created in the OVN Northbound (NB) database
  a.3) The NB ovsdb-server notifies all connected workers of the port
       creation
  a.4) Each worker eventually processes this event and updates its
       in-memory copy of the NB database

  Immediately afterwards, the port is deleted via the API, but step a.4)
  has not yet been completed by all workers.
  Then the port deletion API request lands on one of the workers that
  has not yet updated its in-memory copy of the OVN NB database with the
  newly created port.

  b) Port deletion
  b.1) Port deleted from the Neutron DB
  b.2) Deletion of the port from OVN NB is attempted, but the lookup
       fails and the port's revision number is deleted [0]

  At this point, the port remains stale forever in the OVN database.
  This causes other issues that we have mitigated (e.g. [1]), but
  ultimately the number of OVN resources may grow to a point where it
  severely degrades overall cluster stability and performance.

  A potential workaround might be to run the neutron-ovn-db-sync tool
  periodically to get rid of the stale ports, but doing so while the API
  is operational is not recommended.

  [0] https://github.com/openstack/neutron/blob/f5030b0bc25216d80b09f7ac3938c9a902b655e3/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L698
  [1] https://bugs.launchpad.net/neutron/+bug/1874733
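
The race in the bug description, and the grace-period idea from the commit message, can be illustrated with a small self-contained simulation. This is only a sketch under stated assumptions, not the code from the patch: Worker, delete_port, maintenance_pass, simulate and the GRACE_PERIOD value are hypothetical names chosen for illustration, and plain sets and dicts stand in for the Neutron DB, the OVN NB database and the revision number table.

```python
from dataclasses import dataclass, field

# Assumed value: "at least one maintenance task cycle", in seconds.
GRACE_PERIOD = 600.0


@dataclass
class Worker:
    """Hypothetical worker holding an in-memory copy of the NB ports."""
    nb_cache: set = field(default_factory=set)


def delete_port(worker, nb_db, revisions, port_id, now):
    """Delete a port as seen by one worker.

    revisions maps port_id -> creation timestamp of its revision row.
    Returns 'deleted', 'deferred' or 'revision_removed'.
    """
    if port_id in worker.nb_cache:
        # Normal path: the worker knows the OVN port and removes everything.
        nb_db.discard(port_id)
        worker.nb_cache.discard(port_id)
        revisions.pop(port_id, None)
        return "deleted"

    created_at = revisions.get(port_id)
    if created_at is not None and now - created_at < GRACE_PERIOD:
        # Lookup failed but the port is recent: the cache may simply be
        # behind. Keep the revision row so a later maintenance pass can
        # still find and remove the OVN port.
        return "deferred"

    # Old enough that the port is assumed not to exist in OVN at all.
    revisions.pop(port_id, None)
    return "revision_removed"


def maintenance_pass(neutron_db, nb_db, revisions):
    """Remove OVN ports whose revision row outlived the Neutron port."""
    cleaned = []
    for port_id in list(revisions):
        if port_id not in neutron_db:
            nb_db.discard(port_id)
            del revisions[port_id]
            cleaned.append(port_id)
    return cleaned


def simulate():
    # a) Port created in Neutron and in OVN NB; revision row stamped at t=0.
    neutron_db, nb_db = {"p1"}, {"p1"}
    revisions = {"p1": 0.0}
    # This worker has not processed the creation event yet (empty cache).
    stale_worker = Worker()

    # b) Port deleted via the API and handled by the lagging worker.
    neutron_db.discard("p1")
    outcome = delete_port(stale_worker, nb_db, revisions, "p1", now=5.0)

    # Later, the maintenance pass reconciles the leftover OVN port.
    cleaned = maintenance_pass(neutron_db, nb_db, revisions)
    return outcome, cleaned, nb_db, revisions
```

In this sketch the lagging worker returns 'deferred' instead of dropping the revision row, which is what lets the later maintenance pass remove the port from the simulated NB database. Without the grace-period branch, the revision row would be removed immediately and nothing would ever point at the leftover OVN port again.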

