Public bug reported:

When a large number of subnets and FIPs are configured on a network, the response time for neutron.api.rpc.handlers.l3_rpc.sync_routers increases significantly.

Based on profiling data, a large amount of time is spent in _get_sync_floating_ips (https://opendev.org/openstack/neutron/src/commit/0a214b0437874fd7f5379ec94fd07ef5d3ff4bbe/neutron/db/l3_db.py#L1879).

ncalls tottime percall cumtime percall filename:lineno(function)
...TRUNCATED...
16 2 0.000 0.000 19.827 9.913 /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/db/l3_db.py:1873(_get_sync_floating_ips)
...TRUNCATED...

In the above example, the total execution time logged for sync_routers was 26.645s, of which 19.827s (cumulative) was spent in _get_sync_floating_ips.

Further investigation reveals that the call to l3_obj.FloatingIP.get_scoped_floating_ips within this function spends a large amount of time mapping SQL output to ORM Python objects.

Reproduction steps:
- Set up OpenStack with DVR enabled
- Create a network
- Attach a large number of subnets (the above has 27)
- Create a large number of FIPs and attach them to VMs (the above has around 1000 attached FIPs)
- Restart neutron_l3_agent on a compute node and observe slow response times for the get_routers() RPC

Version:
- OpenStack: Zed
- Kernel/distro: N/A

** Affects: neutron
Importance: Undecided
Assignee: Adam (adamoswick)
Status: In Progress

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2009509

Title: Large number of FIPs and subnets causes slow sync_routers response

Status in neutron: In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2009509/+subscriptions
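For anyone who wants to gather a comparable profile row, the following is a minimal sketch using only the Python standard library (cProfile and pstats). It does not call Neutron: FakeFloatingIP, fake_get_sync_floating_ips and profile_call are hypothetical stand-ins invented for illustration, and the row-to-object loop only imitates the kind of mapping work described in the report. It is not the method used to produce the numbers in this bug.

```python
import cProfile
import io
import pstats


class FakeFloatingIP:
    """Hypothetical stand-in for an ORM-mapped object; not Neutron code."""

    def __init__(self, row):
        self.id, self.address, self.router_id = row


def fake_get_sync_floating_ips(rows):
    # Hypothetical stand-in for the row-to-object mapping step.
    return [FakeFloatingIP(row) for row in rows]


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time and keep only entries whose name matches func.
    stats.sort_stats("cumulative").print_stats(func.__name__)
    return result, buf.getvalue()


# Roughly 1000 rows, mirroring the FIP count mentioned in the report.
rows = [(i, f"10.0.{i // 256}.{i % 256}", "router-1") for i in range(1000)]
fips, report = profile_call(fake_get_sync_floating_ips, rows)
print(report)
```

The printed report uses the same column layout as the excerpt in the report (ncalls, tottime, percall, cumtime, percall, filename:lineno(function)). A cumtime that is large while tottime is near zero, as in the reported row, means the time is spent in callees rather than in the function body itself.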

