Public bug reported:

# Summary
During the integration tests, if an SSH connection to an instance times
out, testing is currently held up for over an hour while the connection
is retried; note the timestamp jump in:
https://paste.ubuntu.com/p/NBQKwm9wdG/

The _ssh_connect function was originally written for the nocloud_kvm
platform, where it was used to determine whether an instance was up and
accessible. As a result, the function does double duty: it is not
focused on SSH'ing to an instance that is already up and running, and it
has a bug that makes it wait far too long.

# Action plan

1. For the nocloud_kvm platform, when starting and before
_wait_for_system, there should be a check that the instance is
accessible during the is_running check. This could again be done via SSH
with a number of retries, but it should be handled inside the
nocloud_kvm platform itself and not in the SSH connect function.

2. Update _ssh_connect to time out quickly, reduce the wait on the
banner, and retry at most 3 times.
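
The two steps above could be sketched roughly as follows. This is only
an illustration, not the cloud-init code: wait_until_reachable and
connect_with_retries are hypothetical names, the reachability check is
done with a plain TCP probe of port 22 rather than a full SSH login, and
all default values (attempts, delays, timeouts) are assumptions.

```python
import socket
import time


def wait_until_reachable(host, port=22, attempts=10, delay=1.0, timeout=2.0):
    """Step 1 (hypothetical helper): platform-side readiness probe.

    Returns True as soon as a TCP connection to host:port succeeds,
    or False once all attempts have failed.
    """
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused or timed out: sleep before the next attempt,
            # but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


def connect_with_retries(connect, retries=3, delay=1.0):
    """Step 2 (hypothetical helper): bounded retry around a connect call.

    `connect` is any zero-argument callable that returns a connected
    client or raises. It is called at most `retries` times; the last
    error is re-raised if every attempt fails.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return connect()
        except Exception as err:
            last_error = err
            if attempt < retries - 1:
                time.sleep(delay)
    raise last_error
```

If the SSH client is paramiko, the callable passed to
connect_with_retries could wrap SSHClient.connect() with short timeout
and banner_timeout values, so that each attempt fails fast instead of
blocking. The exact values would need tuning per platform.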

# Noted Files
tests/cloud_tests/platforms/platforms.py:_ssh_connect()
tests/cloud_tests/platforms/nocloudkvm/instance.py:start()

** Affects: cloud-init
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1758409

Title:
  integration tests: restructure ssh timeout

Status in cloud-init:
  New

