Public bug reported:
Before an HA router can be used by an agent:
1) The HA network must be created
2) A vr_id must be allocated
3) The HA router must be able to create a sufficient number of ports on the
HA network
If the scheduler (in an rpc worker) processes the HA router (since the router
is already available in the DB) before these resources are created, the
following races (between api and rpc workers) can happen:
1) Race for creating the HA network
2) vr_id is not available to the agent, so it can't spawn the HA proxy process
3) If creating the router ports in the api worker fails, the router is
deleted. The rpc worker then races with that deletion while it is binding the
router's HA ports to the agent.
To avoid this, the l3 scheduler should skip this router (while syncing for the
agent) if the above resources are not yet created.
To facilitate this, a new status ("ALLOCATING") is proposed for HA routers in
https://review.openstack.org/#/c/257059/
In this patch, the router is first created with its status set to ALLOCATING.
Once all the above resources are created, its status is changed to ACTIVE.
Checks are added (in the code) to skip using the router if its status is
ALLOCATING.
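
The status-based approach can be illustrated with a minimal, self-contained
sketch. It is not the code from the review above: the Router class and the
routers_to_sync and finish_allocation helpers are hypothetical names invented
here, and only the ALLOCATING and ACTIVE status values come from this report.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Status values named in the bug report.
ALLOCATING = "ALLOCATING"
ACTIVE = "ACTIVE"


@dataclass
class Router:
    """Hypothetical stand-in for an HA router DB record."""
    id: str
    status: str = ALLOCATING          # new routers start out unusable
    vr_id: Optional[int] = None
    ha_ports: List[str] = field(default_factory=list)


def routers_to_sync(routers):
    """Scheduler-side check: skip routers whose HA resources are not ready."""
    return [r for r in routers if r.status != ALLOCATING]


def finish_allocation(router, vr_id, ports):
    """API-side step: attach the resources, then flip the status last."""
    router.vr_id = vr_id
    router.ha_ports = list(ports)
    router.status = ACTIVE


pending = Router(id="r1")
ready = Router(id="r2")
finish_allocation(ready, vr_id=1, ports=["ha-port-a", "ha-port-b"])

# Only the fully allocated router is offered to the agent.
print([r.id for r in routers_to_sync([pending, ready])])
```

Setting the status as the final step means a reader that observes ACTIVE also
observes the vr_id and the HA ports, which is the property the checks rely on.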
So with this patch we have to:
1) create a new router status
2) carefully identify where the router can be accessed before its resources
are created
3) consider how the code behaves (during its access to the router) when the
status transitions from ALLOCATING to ACTIVE
Alternatively, if we are able to create the HA router's resources before
creating the HA router itself, we can avoid a new status and new checks while
getting the same functionality as https://review.openstack.org/#/c/257059/.
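
A rough sketch of this alternative ordering follows. Everything in it is an
assumption made for illustration: FakeDB, create_ha_router, the vr_id range
and the port limit are hypothetical and do not correspond to Neutron's actual
API. The point is only that the router record is written last, so a scheduler
reading the DB never sees a router without its resources.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for the Neutron DB."""

    def __init__(self, max_ports):
        self.networks = {}      # tenant_id -> HA network name
        self.vr_ids = set()     # allocated vr_ids
        self.ports = []         # HA port names
        self.routers = {}       # router_id -> router record
        self.max_ports = max_ports


def create_ha_router(db, tenant_id, router_id, num_ports):
    # 1) Make sure the HA network exists (created once per tenant).
    network = db.networks.setdefault(tenant_id, "ha-net-%s" % tenant_id)

    # 2) Allocate a free vr_id (the 1..254 range is an assumption).
    vr_id = next((i for i in range(1, 255) if i not in db.vr_ids), None)
    if vr_id is None:
        raise RuntimeError("no vr_id available")
    db.vr_ids.add(vr_id)

    # 3) Create the HA ports; undo everything if any of them fails.
    ports = []
    try:
        for n in range(num_ports):
            if len(db.ports) >= db.max_ports:
                raise RuntimeError("cannot create HA port")
            port = "%s-ha-port-%d" % (router_id, n)
            db.ports.append(port)
            ports.append(port)
    except RuntimeError:
        for port in ports:
            db.ports.remove(port)
        db.vr_ids.discard(vr_id)
        raise

    # 4) Only now does the router become visible to the scheduler.
    db.routers[router_id] = {"network": network, "vr_id": vr_id,
                             "ports": ports}
    return db.routers[router_id]
```

If port creation fails, the router is never written, so there is nothing for
the rpc worker to race with. The HA network is deliberately left in place on
rollback in this sketch, since other routers of the same tenant may share it.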
** Affects: neutron
Importance: Undecided
Assignee: venkata anil (anil-venkata)
Status: New
** Changed in: neutron
Assignee: (unassigned) => venkata anil (anil-venkata)
https://bugs.launchpad.net/bugs/1566194
Title:
Make sure resources for HA router exists before the router creation
Status in neutron:
New