Reviewed:  https://review.opendev.org/c/openstack/neutron/+/782570
Committed: https://opendev.org/openstack/neutron/commit/d7f68a0ce76ffb9a93dfba167dfffba53189350d
Submitter: "Zuul (22348)"
Branch:    master

commit d7f68a0ce76ffb9a93dfba167dfffba53189350d
Author: Edward Hope-Morley <[email protected]>
Date:   Tue Mar 23 17:18:48 2021 +0000

    Set proxy_delay to one when using proxy ARP

    Neutron DVR uses proxy ARP in fip namespaces to respond to ARP
    requests for instance floating IPs. In doing so it is susceptible
    to a random delay of up to (by default) 800ms which is added to
    the time taken to respond to ARP requests. This causes an initial
    delay to ARP responses that is entirely avoidable by changing this
    parameter to one, instead of the default, to make it as short as
    possible.

    NOTE: Setting this to zero is actually undefined and will cause
    the kernel to choose a random delay from 0 to U32_MAX so is not
    advised. Gleaned from this comment in __get_random_u32_below(),
    which is eventually called from pneigh_enqueue():

      /*
       * This function is technically undefined for ceil == 0, and in fact
       * for the non-underscored constant version in the header, we build bug
       * on that. But for the non-constant case, it's convenient to have that
       * evaluate to being a straight call to get_random_u32(), so that
       * get_random_u32_inclusive() can work over its whole range without
       * undefined behavior.
       */

    Will propose a kernel change to fix this but cannot assume it will
    be in a distro kernel for a while.

    Change-Id: I0dc65b17ef436a97d0fcbd164d124ec59a1b2797
    Closes-Bug: #1920975

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1920975

Title:
  neutron dvr should lower proxy_delay when using proxy_arp

Status in OpenStack Neutron Open vSwitch Charm:
  New
Status in neutron:
  Fix Released

Bug description:
  Neutron DVR uses proxy_arp in fip namespaces to respond to arp
  requests for instance floating ips.
  In doing so it is susceptible to a random delay of up to 800ms by
  default, which is added to the time taken to respond to an arp
  request that has to be proxied, i.e.:

  # ip netns exec fip-a297543b-9ef9-4bd5-b1ca-e85a726c1726 sysctl net.ipv4.{conf.fg-51f3e07b-2d.proxy_arp,neigh.fg-51f3e07b-2d.proxy_delay}
  net.ipv4.conf.fg-51f3e07b-2d.proxy_arp = 1
  net.ipv4.neigh.fg-51f3e07b-2d.proxy_delay = 80

  The result of this is seen when e.g. you ping a vm fip and the first
  request takes significantly longer than subsequent requests:

  $ ping -c 5 10.5.150.90
  PING 10.5.150.90 (10.5.150.90) 56(84) bytes of data.
  64 bytes from 10.5.150.90: icmp_seq=1 ttl=60 time=491 ms
  64 bytes from 10.5.150.90: icmp_seq=2 ttl=60 time=1.08 ms
  64 bytes from 10.5.150.90: icmp_seq=3 ttl=60 time=1.39 ms
  64 bytes from 10.5.150.90: icmp_seq=4 ttl=60 time=1.16 ms
  64 bytes from 10.5.150.90: icmp_seq=5 ttl=60 time=1.03 ms

  --- 10.5.150.90 ping statistics ---
  5 packets transmitted, 5 received, 0% packet loss, time 4007ms
  rtt min/avg/max/mdev = 1.034/99.157/491.134/195.988 ms

  To reproduce again, simply delete the arp entry for the fip from the
  fip ns of the source compute host.

  By kernel standards this behaviour is by design when using the
  default settings, but some workloads may be impacted by this initial
  delay, especially in loaded environments where the arp caches are
  under strain and hitting gc_thresh limits.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1920975/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
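
The relationship between the proxy_delay value of 80 shown in the sysctl output and the 800ms figure quoted in the report can be checked with a little arithmetic. The following is a minimal sketch, not part of the Neutron change: it assumes the sysctl value is expressed in USER_HZ ticks and that USER_HZ is 100 (neither is stated in the report), and the helper name proxy_delay_upper_bound_ms is hypothetical.

```python
# Sketch only: assumes the proxy_delay sysctl is expressed in USER_HZ ticks
# and that USER_HZ is 100. Neither assumption comes from the bug report.
USER_HZ = 100


def proxy_delay_upper_bound_ms(ticks: int, user_hz: int = USER_HZ) -> float:
    """Return the upper bound, in milliseconds, of the random proxy ARP delay."""
    if ticks <= 0:
        # The commit message notes that zero is undefined in the kernel,
        # so this hypothetical helper refuses it rather than guessing.
        raise ValueError("proxy_delay must be a positive number of ticks")
    return ticks * 1000.0 / user_hz


if __name__ == "__main__":
    # 80 is the value shown in the report; 1 is the value the commit sets.
    for ticks in (80, 1):
        bound = proxy_delay_upper_bound_ms(ticks)
        print(f"proxy_delay={ticks}: up to {bound:.0f} ms")
```

Under these assumptions the 491 ms first ping in the report falls within the bound for the value 80, and setting the value to one would reduce the bound to around 10 ms.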

