Public bug reported:

The scenario is a simple one: during the creation of an HA router, the
server handling the request crashes after creating the router in the
ALLOCATING state [1] but before its status is changed to ACTIVE [2]. In
this case, the router is "stuck" in the ALLOCATING state, and the only
sequence of admin actions that moves it to ACTIVE (and allows it to be
scheduled to agents) is:

1. set admin-state-up to False
2. set ha to False
3. set ha to True
4. set admin-state-up to True

That is, a full migration of the HA router to legacy and back to HA is
required. This will trigger the code in [3] and will fix this issue.
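
The four steps above can be sketched as a small helper. This is only an
illustration of the ordering, not code from neutron: the `update`
callback and the attribute names `admin_state_up` and `ha` are
assumptions standing in for whatever API client is actually used to
update the router.

```python
from typing import Callable, Dict, List

# Hypothetical callback type: applies a partial update to one router.
# In practice it would wrap a real API client call (an assumption here).
UpdateFn = Callable[[str, Dict[str, bool]], None]

# The workaround from the report, in the order the steps must be applied.
WORKAROUND_STEPS: List[Dict[str, bool]] = [
    {"admin_state_up": False},  # 1. set admin-state-up to False
    {"ha": False},              # 2. migrate HA -> legacy
    {"ha": True},               # 3. migrate legacy -> HA
    {"admin_state_up": True},   # 4. set admin-state-up to True
]


def unstick_router(router_id: str, update: UpdateFn) -> None:
    """Apply the four workaround updates, one request per step."""
    for attrs in WORKAROUND_STEPS:
        # Pass a copy so the callback cannot mutate the step table.
        update(router_id, dict(attrs))
```

Each step is issued as a separate update, mirroring the list above, so
the migration to legacy and back to HA happens while admin-state-up is
False.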

The proposed solution is to add a new state, such that if admin-state-up
is changed to False then the router's status will be changed to "DOWN"
(as opposed to the current "ACTIVE", which doesn't make much sense since
admin-state-up is False).
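
A minimal sketch of the proposed rule, assuming a hypothetical helper
`proposed_status` (not an existing neutron function). The report only
says what happens when admin-state-up becomes False; leaving the status
unchanged in the other case is an assumption made here to keep the
example complete.

```python
ALLOCATING = "ALLOCATING"
ACTIVE = "ACTIVE"
DOWN = "DOWN"  # the new state proposed in this report


def proposed_status(current_status: str, admin_state_up: bool) -> str:
    """Return the router status after an admin-state-up change.

    Hypothetical helper: only the DOWN branch comes from the proposal.
    """
    if not admin_state_up:
        # Proposed behaviour: an administratively disabled router
        # reports DOWN instead of staying ACTIVE.
        return DOWN
    # Assumption: the report does not specify this case, so the
    # status is left untouched here.
    return current_status
```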

[1]: https://github.com/openstack/neutron/blob/ff5b38071e7e134baa0dc7a52280f9bcbc06efaf/neutron/db/l3_hamode_db.py#L469
[2]: https://github.com/openstack/neutron/blob/ff5b38071e7e134baa0dc7a52280f9bcbc06efaf/neutron/db/l3_hamode_db.py#L485
[3]: https://github.com/openstack/neutron/blob/ff5b38071e7e134baa0dc7a52280f9bcbc06efaf/neutron/db/l3_hamode_db.py#L570

** Affects: neutron
     Importance: Undecided
     Assignee: John Schwarz (jschwarz)
         Status: New


** Tags: l3-ha

** Changed in: neutron
     Assignee: (unassigned) => John Schwarz (jschwarz)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1609738

Title:
  l3-ha: a router can be stuck in the ALLOCATING state

Status in neutron:
  New
