Public bug reported:

The OVN metadata agent can take a very long time (observed ~40s) to add CIDRs to the tap interface in a metadata namespace when a network consists of many subnets (observed ~1700 subnets). Because of this long processing time, ovn-metadata-agent may not have haproxy ready by the time the first VM's cloud-init requests its metadata. As a result, the VM ends up without the metadata it needs to operate properly.
Steps to reproduce:
- Create a network with thousands of subnets.
- Create a VM connected to that network. Make sure it is the first VM on the deployed compute node (hypervisor).
- Observe that the VM's cloud-init requests time out because there is no response from 169.254.169.254/openstack.
- Observe that, according to its logs, ovn-metadata-agent is probably still executing, or was executing, this code [1].

Possible solutions:
1. (Low-hanging fruit?) See if there is a way to improve the execution time of the `ip.add` call. Perhaps passing a list of CIDRs instead of a single CIDR at a time would improve performance?
2. (More involved) Refactor the code so that ovn-metadata-agent adds only the single CIDR that belongs to the VM being created, instead of unconditionally adding all CIDRs for the network when the first VM is created (current implementation).

[1] https://github.com/openstack/neutron/blob/41bf8054017c72815226d5df50fd321b30fcba13/neutron/agent/ovn/metadata/agent.py#L488-L495

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1981113

Title:
  OVN metadata agent can be slow with large amount of subnets

Status in neutron:
  New
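To illustrate possible solution 1, here is a minimal sketch of batching the address additions. It is not the agent's code: `build_ip_batch` and its parameters are hypothetical names, and it assumes the CIDRs are available as plain strings. It only builds the command lines for iproute2's batch mode, skipping addresses that are already present, so that one `ip` invocation could replace one call per CIDR. Whether this is actually faster in the agent's environment would need to be measured.

```python
import ipaddress
from typing import Iterable, List


def build_ip_batch(device: str,
                   wanted: Iterable[str],
                   existing: Iterable[str]) -> List[str]:
    """Return `ip -batch` lines adding only the CIDRs missing from `device`.

    Hypothetical helper: `wanted` and `existing` are interface addresses
    in CIDR notation, e.g. "10.0.0.2/24".
    """
    already_present = {ipaddress.ip_interface(c) for c in existing}
    queued = set()
    lines = []
    for cidr in wanted:
        iface = ipaddress.ip_interface(cidr)
        # Skip addresses already on the device or already queued.
        if iface in already_present or iface in queued:
            continue
        queued.add(iface)
        lines.append(f"addr add {iface.with_prefixlen} dev {device}")
    return lines


if __name__ == "__main__":
    batch = build_ip_batch(
        "tap0",
        wanted=["10.0.0.2/24", "10.0.1.2/24", "10.0.0.2/24"],
        existing=["10.0.0.2/24"],
    )
    print("\n".join(batch))
```

The resulting lines could be written to a file and applied with something like `ip -n <namespace> -batch <file>`. The agent goes through neutron's own IP library rather than the `ip` CLI, so an equivalent batching entry point there would have to exist or be added.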
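Possible solution 2 could look roughly like the sketch below, which selects only the metadata port CIDRs whose subnet contains one of the new VM's addresses. The function name `cidrs_for_vm` and the way the inputs are obtained are assumptions for illustration. How the agent would learn the VM's IPs at this point in the code is not covered by the bug report.

```python
import ipaddress
from typing import Iterable, List


def cidrs_for_vm(metadata_cidrs: Iterable[str],
                 vm_ips: Iterable[str]) -> List[str]:
    """Return the metadata port CIDRs that share a subnet with the VM.

    Hypothetical helper: `metadata_cidrs` are the metadata port addresses
    in CIDR notation (assumed one per subnet), `vm_ips` are the VM port's
    fixed IPs.
    """
    vm_addrs = [ipaddress.ip_address(ip) for ip in vm_ips]
    selected = []
    for cidr in metadata_cidrs:
        network = ipaddress.ip_interface(cidr).network
        # Membership is False when IP versions differ, so mixed
        # IPv4/IPv6 inputs are safe here.
        if any(addr in network for addr in vm_addrs):
            selected.append(cidr)
    return selected


if __name__ == "__main__":
    print(cidrs_for_vm(
        ["10.0.0.2/24", "10.0.1.2/24", "fd00::2/64"],
        ["10.0.1.15"],
    ))
```

With this approach the work done for the first VM on a hypervisor no longer grows with the number of subnets in the network, at the cost of having to add further CIDRs later when VMs from other subnets are scheduled to the same node.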

