** Changed in: neutron
       Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1636466

Title:
  HA router interface points to wrong host after network disruption

Status in neutron:
  Won't Fix

Bug description:
  If the overlay network of a network node is down for a while, the slave
  node of an HA router cannot receive VRRP packets, so it promotes itself to
  master. The L3 agent on that node then updates the ha_state of its router
  binding to active and rebinds the ports of the router interfaces to its own
  host.
  After the network recovers, one of the two master nodes of the HA router is
  demoted to slave. If the demoted node is the previous slave node, the L3
  agent updates the ha_state of its router binding to standby but does not
  rebind the ports of the router interfaces to the host of the original
  master node. Packets sent to the router are then delivered to the slave
  node, because l2pop uses the stale port bindings.
  Because both keepalived instances are configured with the same priority
  (50), the probability of hitting this problem in a deployment with two
  network nodes is 50%.
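
  The sequence above can be replayed with a toy model. This is only a sketch
  of the behaviour described in this report, not Neutron code: the
  RouterModel class and its report_state method are hypothetical names made
  up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RouterModel:
    """Toy model of one HA router scheduled on two network nodes."""
    ha_state: dict = field(
        default_factory=lambda: {"host1": "standby", "host2": "active"})
    # binding:host_id of the router interface port (192.168.10.1 in the report)
    interface_binding: str = "host2"

    def report_state(self, host: str, state: str) -> None:
        """Apply a state report the way the bug description says it happens."""
        self.ha_state[host] = state
        if state == "active":
            # Promotion rebinds the router interface to the reporting host.
            self.interface_binding = host
        # Demotion to standby leaves the binding untouched (the reported bug).

    def active_hosts(self) -> list:
        return sorted(h for h, s in self.ha_state.items() if s == "active")


router = RouterModel()
router.report_state("host1", "active")   # host1 misses VRRP packets, promotes itself
router.report_state("host1", "standby")  # overlay restored, host1 is demoted again
print(router.active_hosts(), router.interface_binding)
```

  In this model the only active host ends up being host2 while the interface
  stays bound to host1, which matches the final state shown in the
  reproduction steps.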

  How to reproduce:
  - two network nodes: host1, host2.

  - create an HA router (router1), a network (network1) and a subnet
    (subnet1), then add an interface for subnet1 to router1.
  $ neutron l3-agent-list-hosting-router router1
  
  +--------------------------------------+------------+----------------+-------+----------+
  | id                                   | host       | admin_state_up | alive | ha_state |
  +--------------------------------------+------------+----------------+-------+----------+
  | 3a3b8d27-e5b4-42c0-9433-2ba8b6be98c2 | host1      | True           | :-)   | standby  |
  | 4eba4a33-1452-4f4e-8874-a8eff2f4f357 | host2      | True           | :-)   | active   |
  +--------------------------------------+------------+----------------+-------+----------+
  $ neutron router-port-list router1 -c id -c binding:host_id -c fixed_ips
  
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | id                                   | binding:host_id | fixed_ips                                                                            |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | 00a89bc5-a589-4c37-9db0-a7b439c4dee9 | host1           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.6"} |
  | b83590b2-0bf9-4fe7-b29f-0d37c92a9b3a | host2           | {"subnet_id": "75e30064-a625-4267-8cbf-20d1a7b6e952", "ip_address": "192.168.10.1"}  |
  | ca2a66e0-5525-4302-b00f-0e703dbb48e2 | host2           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.1"} |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+

  - disconnect host1 from the overlay network, then wait until the
    l3-agent-list-hosting-router API shows ha_state active for router1 on
    both hosts.
  $ neutron l3-agent-list-hosting-router router1
  
  +--------------------------------------+------------+----------------+-------+----------+
  | id                                   | host       | admin_state_up | alive | ha_state |
  +--------------------------------------+------------+----------------+-------+----------+
  | 3a3b8d27-e5b4-42c0-9433-2ba8b6be98c2 | host1      | True           | :-)   | active   |
  | 4eba4a33-1452-4f4e-8874-a8eff2f4f357 | host2      | True           | :-)   | active   |
  +--------------------------------------+------------+----------------+-------+----------+
  $ neutron router-port-list router1 -c id -c binding:host_id -c fixed_ips
  
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | id                                   | binding:host_id | fixed_ips                                                                            |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | 00a89bc5-a589-4c37-9db0-a7b439c4dee9 | host1           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.6"} |
  | b83590b2-0bf9-4fe7-b29f-0d37c92a9b3a | host1           | {"subnet_id": "75e30064-a625-4267-8cbf-20d1a7b6e952", "ip_address": "192.168.10.1"}  |
  | ca2a66e0-5525-4302-b00f-0e703dbb48e2 | host2           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.1"} |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+

  - restore the overlay network of host1, then wait until one ha_state of
    router1 turns to standby. There is a 50% probability that the port
    binding of the router1 interface no longer matches the host of the
    active node. Instances in subnet1 then cannot reach the router interface.
  $ neutron l3-agent-list-hosting-router router1
  
  +--------------------------------------+------------+----------------+-------+----------+
  | id                                   | host       | admin_state_up | alive | ha_state |
  +--------------------------------------+------------+----------------+-------+----------+
  | 3a3b8d27-e5b4-42c0-9433-2ba8b6be98c2 | host1      | True           | :-)   | standby  |
  | 4eba4a33-1452-4f4e-8874-a8eff2f4f357 | host2      | True           | :-)   | active   |
  +--------------------------------------+------------+----------------+-------+----------+
  $ neutron router-port-list router1 -c id -c binding:host_id -c fixed_ips
  
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | id                                   | binding:host_id | fixed_ips                                                                            |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
  | 00a89bc5-a589-4c37-9db0-a7b439c4dee9 | host1           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.6"} |
  | b83590b2-0bf9-4fe7-b29f-0d37c92a9b3a | host1           | {"subnet_id": "75e30064-a625-4267-8cbf-20d1a7b6e952", "ip_address": "192.168.10.1"}  |
  | ca2a66e0-5525-4302-b00f-0e703dbb48e2 | host2           | {"subnet_id": "6bb7aced-6b8f-448f-813d-d1bc91b9ee2d", "ip_address": "169.254.192.1"} |
  +--------------------------------------+-----------------+--------------------------------------------------------------------------------------+
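
  The inconsistency in the last two listings can also be checked
  mechanically. The following is a rough sketch over data copied by hand from
  the tables above; misbound_ports is a hypothetical helper, and it assumes
  that the 169.254.192.x ports are per-host HA ports on Neutron's default
  l3_ha_net_cidr (169.254.192.0/18), which are expected to stay bound to
  their own host.

```python
import ipaddress

# Assumption: the HA network uses Neutron's default l3_ha_net_cidr.
HA_NET = ipaddress.ip_network("169.254.192.0/18")

# host -> ha_state, copied from the last l3-agent-list-hosting-router output
agents = {"host1": "standby", "host2": "active"}

# (port id, binding:host_id, ip_address), copied from the last router-port-list output
ports = [
    ("00a89bc5-a589-4c37-9db0-a7b439c4dee9", "host1", "169.254.192.6"),
    ("b83590b2-0bf9-4fe7-b29f-0d37c92a9b3a", "host1", "192.168.10.1"),
    ("ca2a66e0-5525-4302-b00f-0e703dbb48e2", "host2", "169.254.192.1"),
]


def misbound_ports(agents, ports):
    """Return ids of non-HA router ports bound to a host that is not active."""
    active = {host for host, state in agents.items() if state == "active"}
    return [
        port_id
        for port_id, host, ip in ports
        if ipaddress.ip_address(ip) not in HA_NET and host not in active
    ]


print(misbound_ports(agents, ports))
# → ['b83590b2-0bf9-4fe7-b29f-0d37c92a9b3a']
```

  Only the 192.168.10.1 interface is flagged: it is bound to host1 although
  host2 is the active node.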

  Expected behavior:
  - updating the ha_state of an HA router to standby should trigger an update
    of the port bindings of the router interfaces to the host whose ha_state
    is active.
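
  One way to read this expectation is as a rule applied on every state
  report. The function below is a hypothetical sketch of that rule, not a
  proposed Neutron patch; the name apply_state_report and the choice to
  rebind only when exactly one active host remains are assumptions made for
  illustration.

```python
def apply_state_report(ha_state, binding, host, new_state):
    """Return the updated (ha_state, binding) after one state report.

    ha_state maps host -> "active" or "standby"; binding is the host the
    router interface port is currently bound to.
    """
    ha_state = {**ha_state, host: new_state}
    active = [h for h, s in ha_state.items() if s == "active"]
    if new_state == "active":
        # Same as the behaviour in the report: promotion rebinds to the reporter.
        binding = host
    elif binding == host and len(active) == 1:
        # Expected behaviour: on demotion, hand the binding to the active host.
        binding = active[0]
    return ha_state, binding


# Split-brain state from the reproduction: both active, interface bound to host1.
state, bound = {"host1": "active", "host2": "active"}, "host1"
state, bound = apply_state_report(state, bound, "host1", "standby")
print(state, bound)
```

  With this rule, demoting host1 moves the interface binding back to host2,
  while demoting host2 instead would leave the binding on host1.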

  Affected versions:
  Reproducible on the master branch; Mitaka and Newton are presumably also
  affected.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1636466/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
