Public bug reported:

We are seeing some instability in our OpenStack Wallaby systems right now, and have found that even when the "retries" option is passed to the cloud-init OpenStack datasource, the code can still fail on the first attempt to detect a live metadata service, because it does not retry at that point (it retries only when fetching the data itself).
Some logs of the error:

```
2022-06-16 03:25:35,449 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/20.4.1'}} configuration
2022-06-16 03:25:45,469 - url_helper.py[DEBUG]: Calling 'http://169.254.169.254/openstack' failed [10/-1s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Read timed out. (read timeout=10.0)]
2022-06-16 03:25:45,471 - DataSourceOpenStack.py[DEBUG]: Giving up on OpenStack md from ['http://169.254.169.254/openstack'] after 10 seconds
2022-06-16 03:25:45,471 - util.py[WARNING]: No active metadata service found
2022-06-16 03:25:45,477 - util.py[DEBUG]: No active metadata service found
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOpenStack.py", line 145, in _get_data
    results = self._crawl_metadata()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOpenStack.py", line 181, in _crawl_metadata
    raise sources.InvalidMetaDataException(
cloudinit.sources.InvalidMetaDataException: No active metadata service found
```

We think it should also retry on this first pass (we are also investigating the sources of the instability, but this would help make it more resilient).

** Affects: cloud-init
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1979049

Title:
  DataSourceOpenstack: Retry when waiting for the metadata service too

Status in cloud-init:
  New
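To illustrate the behaviour requested in this report, here is a minimal sketch of applying a retry count to the initial "is the metadata service alive?" probe. It is not cloud-init code: `probe_with_retries`, its `probe` callable, and the `sec_between` and `sleep` parameters are hypothetical names chosen for the example, and only the idea of a `retries` setting comes from the report.

```python
import time
from typing import Callable, Optional


def probe_with_retries(
    probe: Callable[[], Optional[str]],
    retries: int,
    sec_between: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call probe() up to retries + 1 times.

    probe is a hypothetical callable that returns the URL of a responding
    metadata service, or None if nothing answered. The first non-None
    result is returned; None is returned once all attempts are used up.
    """
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            result = probe()
        except OSError:
            # Socket-level connection errors and timeouts are OSError
            # subclasses; treat them as a failed attempt, not a fatal error.
            result = None
        if result is not None:
            return result
        if attempt < attempts - 1:
            # Wait before the next attempt, but not after the last one.
            sleep(sec_between)
    return None


if __name__ == "__main__":
    # Simulate a service that only answers on the third attempt.
    outcomes = iter([None, None, "http://169.254.169.254/openstack"])
    print(probe_with_retries(lambda: next(outcomes), retries=5, sec_between=0))
```

With a wrapper like this around the detection step, a single 10-second read timeout such as the one in the log above would count as one failed attempt rather than ending metadata discovery.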

