Public bug reported:

Pike
DVR + L3_HA
L2population enabled

Some of our L3 HA routers are not working correctly: they are not
reachable from instances.
After a deep investigation, I found that the "HA port tenant <tenant
id>" ports are in state DOWN.
They are DOWN because they have no binding information, and they have
no binding information because the corresponding 'HA network tenant
<tenant_id>' network is corrupted: its provider:network_type and
provider:segmentation_id attributes are not set.
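
This is easy to confirm with admin credentials (a minimal check;
<ha-port-id> below is a placeholder for one of the DOWN ports):

  $ openstack port list --network 6390c381-871e-4945-bfa0-00828bb519bc
  $ openstack port show <ha-port-id>
  # an unbound port shows status DOWN and binding_vif_type "unbound"
  $ openstack network show 6390c381-871e-4945-bfa0-00828bb519bc
  # here provider:network_type and provider:segmentation_id come back empty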

The weird thing is that this network was fine and working, but at some
point in time it became corrupted. I have no logs from that period.

For comparison working HA tenant network:

+---------------------------+----------------------------------------------------+
| Field                     | Value                                              |
+---------------------------+----------------------------------------------------+
| admin_state_up            | True                                               |
| availability_zone_hints   |                                                    |
| availability_zones        | nova                                               |
| created_at                | 2018-02-16T16:52:31Z                               |
| description               |                                                    |
| id                        | fa2fea5c-ccaa-4116-bb0c-ff59bbd8229a               |
| ipv4_address_scope        |                                                    |
| ipv6_address_scope        |                                                    |
| mtu                       | 9000                                               |
| name                      | HA network tenant afeeb372d7934795b63868330eca0dfe |
| port_security_enabled     | True                                               |
| project_id                |                                                    |
| provider:network_type     | vxlan                                              |
| provider:physical_network |                                                    |
| provider:segmentation_id  | 35                                                 |
| revision_number           | 3                                                  |
| router:external           | False                                              |
| shared                    | False                                              |
| status                    | ACTIVE                                             |
| subnets                   | 5cbc612d-13cf-4889-88fb-02d1debe5f8d               |
| tags                      |                                                    |
| tenant_id                 |                                                    |
| updated_at                | 2018-02-16T16:52:31Z                               |
+---------------------------+----------------------------------------------------+

and not working HA tenant network:

+---------------------------+----------------------------------------------------+
| Field                     | Value                                              |
+---------------------------+----------------------------------------------------+
| admin_state_up            | True                                               |
| availability_zone_hints   |                                                    |
| availability_zones        |                                                    |
| created_at                | 2018-01-26T12:24:15Z                               |
| description               |                                                    |
| id                        | 6390c381-871e-4945-bfa0-00828bb519bc               |
| ipv4_address_scope        |                                                    |
| ipv6_address_scope        |                                                    |
| mtu                       | 9000                                               |
| name                      | HA network tenant 3e88cffb9dbb4e1fba96ee72a02e012e |
| port_security_enabled     | True                                               |
| project_id                |                                                    |
| provider:network_type     |                                                    |
| provider:physical_network |                                                    |
| provider:segmentation_id  |                                                    |
| revision_number           | 5                                                  |
| router:external           | False                                              |
| shared                    | False                                              |
| status                    | ACTIVE                                             |
| subnets                   | 4d579b00-c780-45ed-9bd8-4d3256fa8a42               |
| tags                      |                                                    |
| tenant_id                 |                                                    |
| updated_at                | 2018-01-29T14:08:11Z                               |
+---------------------------+----------------------------------------------------+


I've found that all working HA networks have revision_number = 3 and
all non-working ones have revision_number = 5.

When the 'HA network tenant' network is corrupted, ALL L3-HA routers in
that tenant stop working.
Is there any way to fix this without removing all existing L3-HA
routers in the tenant?
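
The closest I have come to an answer is inspecting the segment data
directly in the database. This is only a sketch, assuming the Pike ML2
schema where provider attributes live in the networksegments table and
VXLAN IDs in ml2_vxlan_allocations; back up the database before
changing anything:

  $ mysql neutron -e "SELECT network_id, network_type, physical_network, \
        segmentation_id FROM networksegments WHERE network_id IN \
        ('fa2fea5c-ccaa-4116-bb0c-ff59bbd8229a', \
         '6390c381-871e-4945-bfa0-00828bb519bc');"
  $ mysql neutron -e "SELECT vxlan_vni FROM ml2_vxlan_allocations \
        WHERE allocated = 0 LIMIT 5;"

If the broken network has simply lost its segment row, restoring a
vxlan row with an unused VNI (and marking that VNI allocated) might
make the HA ports bindable again, but I have not verified that this is
safe.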

Unfortunately I can't find any code responsible for updating or
modifying the 'HA network tenant' network, so I've hit a wall in my
debugging.
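
The only in-tree code I can find that even names these networks is the
creation path (assuming a Pike source checkout):

  $ grep -rn "HA network tenant" neutron/
  # leads to the HA_NETWORK_NAME template and to _create_ha_network() in
  # neutron/db/l3_hamode_db.py; I see nothing in that module that updates
  # the network after it has been created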

The network was probably corrupted during some automated provisioning
of network resources via a Heat stack, but I can't reproduce this.

** Affects: neutron
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1757188

Title:
  some L3 HA routers do not work

Status in neutron:
  New

