Public bug reported:

When we restart the neutron-l3-agent, we observe that backup routers start
accepting router advertisements. This creates routes inside the router
namespace that carry an expiry time, e.g.:
$ ip netns exec qrouter-a5f7fb32-3e30-4e15-89f9-4ae888c2cac6 ip -6 r
x:x:1002:1::/64 dev qr-72f85121-ce proto kernel metric 256 expires 86355sec pref medium
x:x:1002:1::/64 dev qr-4e84792f-aa proto kernel metric 256 expires 86355sec pref medium
fe80::/64 dev ha-9d085c9d-15 proto kernel metric 256 pref medium
default via fe80::f816:3eff:fed3:3fa6 dev qr-4e84792f-aa proto ra metric 1024 expires 255sec hoplimit 64 pref medium
default via fe80::f816:3eff:fed3:3fa6 dev qr-72f85121-ce proto ra metric 1024 expires 255sec hoplimit 64 pref medium


When we now fail over to such a backup router, the kernel does not create the
necessary directly attached routes because they already exist. The problem is
that those existing routes carry an expiry time. Because the router is now
master, the routes are no longer refreshed from router advertisements, so they
expire after 24h, which breaks IPv6 for those routers.
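
To spot affected namespaces before a failover, the route listing can be scanned
for entries that carry a lifetime. The following is only a minimal sketch: the
function name expiring_routes is hypothetical and not part of neutron, and it
assumes each route is on a single line, as the ip command prints it.

```python
import re

# Matches the "expires <N>sec" attribute shown for routes that have a lifetime.
EXPIRES_RE = re.compile(r"\bexpires (\d+)sec\b")


def expiring_routes(ip_route_output):
    """Return (destination, device, seconds_left) for every route with a lifetime.

    Hypothetical helper for illustration; it only parses text in the
    format of the "ip -6 r" output quoted in this report.
    """
    found = []
    for line in ip_route_output.splitlines():
        match = EXPIRES_RE.search(line)
        if not match:
            continue  # permanent route, nothing to report
        fields = line.split()
        dest = fields[0]
        dev = fields[fields.index("dev") + 1] if "dev" in fields else None
        found.append((dest, dev, int(match.group(1))))
    return found


# Sample lines taken from the output in this report.
sample = """\
x:x:1002:1::/64 dev qr-72f85121-ce proto kernel metric 256 expires 86355sec pref medium
fe80::/64 dev ha-9d085c9d-15 proto kernel metric 256 pref medium
default via fe80::f816:3eff:fed3:3fa6 dev qr-4e84792f-aa proto ra metric 1024 expires 255sec hoplimit 64 pref medium
"""

for dest, dev, secs in expiring_routes(sample):
    print(dest, dev, secs)
```

Any /64 on a qr- device that shows up here on a backup router is a candidate
for the breakage described above once that router becomes master.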

After digging deeper into this issue, we found that the function [1] that
disables accept_ra on the backup routers always returns false. As a result,
backup routers never stop accepting router advertisements.

master router: 
$ ip netns exec qrouter-92ed5c1f-c705-4ab9-a0e1-56e905d43abd sysctl net.ipv6.conf.qr-c7eb60ab-f1.accept_ra
net.ipv6.conf.qr-c7eb60ab-f1.accept_ra = 1

backup router:
$ ip netns exec qrouter-92ed5c1f-c705-4ab9-a0e1-56e905d43abd sysctl net.ipv6.conf.qr-c7eb60ab-f1.accept_ra
net.ipv6.conf.qr-c7eb60ab-f1.accept_ra = 1
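
The same check can be scripted. This is a rough sketch, not neutron code: the
helper names are hypothetical, and it assumes that "disabled" means accept_ra
should read 0 on a backup router. It only builds the command shown above and
interprets one line of sysctl output; running the command is left to the caller.

```python
def accept_ra_command(namespace, device):
    """Build the argv for the sysctl query shown above (hypothetical helper)."""
    return ["ip", "netns", "exec", namespace,
            "sysctl", f"net.ipv6.conf.{device}.accept_ra"]


def parse_sysctl_value(output):
    """Split one 'key = value' line of sysctl output into (key, int value)."""
    key, _, value = output.strip().partition("=")
    return key.strip(), int(value.strip())


def accept_ra_is_wrong(ha_state, output):
    """True if a backup router still accepts router advertisements.

    Assumption: a backup router is expected to have accept_ra = 0.
    """
    _, value = parse_sysctl_value(output)
    return ha_state == "backup" and value != 0


line = "net.ipv6.conf.qr-c7eb60ab-f1.accept_ra = 1"
print(accept_ra_is_wrong("backup", line))
print(accept_ra_is_wrong("master", line))
```

The argv list from accept_ra_command can be passed to subprocess.run on the
node hosting the router; root privileges are needed to enter the namespace.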

[1] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/ha_router.py#L318

** Affects: neutron
     Importance: Undecided
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1958149

Title:
  ha backup router ipv6 accept_ra broken

Status in neutron:
  In Progress
