Public bug reported:

An OpenStack environment was built using OpenStack-Ansible (OSA) on Mitaka with the neutron_l3_agent in HA mode. This was functioning correctly using network namespaces for routers. Within the namespace, keepalived created an 'ha' virtual interface to track the status of the other instance of the virtual router. This worked correctly: the 'ha' virtual interface within the 'master' router namespace could ping the 'ha' virtual interface within the 'backup' router namespace, and when the master went offline, keepalived on the backup would successfully transition to master and bring up the virtual IP addresses within the network namespace virtual router.
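The behaviour described above implies a simple invariant: for each HA router, exactly one instance should be master at any time. The following is a minimal sketch of how that invariant could be checked once the keepalived state of each router instance has been collected from every network node. The function name, the (router_id, host, state) tuple layout and the sample host names are all hypothetical and are not part of Neutron or keepalived; how the states are gathered is left out.

```python
from collections import defaultdict


def find_split_brain(reports):
    """Return routers that have more than one instance in the master state.

    `reports` is an iterable of (router_id, host, state) tuples, where
    state is a string such as "master" or "backup". This layout is an
    assumption made for illustration only.
    """
    masters = defaultdict(list)
    for router_id, host, state in reports:
        # Normalise the state so "MASTER\n" and "master" compare equal.
        if state.strip().lower() == "master":
            masters[router_id].append(host)
    # Only routers with two or more masters violate the invariant.
    return {rid: sorted(hosts) for rid, hosts in masters.items() if len(hosts) > 1}


# Hypothetical sample data: router-a has two masters, router-b is healthy.
reports = [
    ("router-a", "node1", "master"),
    ("router-a", "node2", "master"),
    ("router-b", "node1", "master"),
    ("router-b", "node2", "backup"),
]
print(find_split_brain(reports))
```

A router that shows up in the result has more than one instance claiming the virtual IP addresses, which is the condition that produces duplicate addresses on the network.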
We upgraded the environment to Newton via the guide at http://docs.openstack.org/developer/openstack-ansible/newton/upgrade-guide/manual-upgrade.html. After this was done, the network namespace virtual routers (specifically the 'ha' track interfaces) were no longer able to communicate with each other, resulting in both of them transitioning to 'master' and bringing up duplicate IP addresses. This caused intermittent connectivity to public floating IPs and also from the routers to instances over the VXLAN network.

******** l3_agent.ini configuration ********

# General
[DEFAULT]
verbose = True
debug = False

# While this option is deprecated in Liberty, if we remove it then it takes
# a default value of 'br-ex', which we do not want. We therefore leave it
# in place for now and can remove it in Mitaka.
external_network_bridge =
gateway_external_network_id =

use_namespaces = True
router_delete_namespaces = True

# Drivers
interface_driver = neutron.agent.linux.interface.BridgeInterfaceDriver

# Agent mode (legacy only)
agent_mode = legacy

# Conventional failover
allow_automatic_l3agent_failover = True

# HA failover
ha_confs_path = /var/lib/neutron/ha_confs
ha_vrrp_advert_int = 2
ha_vrrp_auth_password = bee916a2589b14dd7f
ha_vrrp_auth_type = PASS
handle_internal_only_routers = False
send_arp_for_ha = 3

# Metadata
enable_metadata_proxy = True

******** keepalived.conf configuration ********

vrrp_instance VR_1 {
    state BACKUP
    interface ha-42c56d27-10
    virtual_router_id 1
    priority 50
    garp_master_delay 60
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass bee916a2589b14dd7f
    }
    track_interface {
        ha-42c56d27-10
    }
    virtual_ipaddress {
        169.254.0.1/24 dev ha-42c56d27-10
    }
    virtual_ipaddress_excluded {
        10.0.0.1/8 dev qr-8deaf807-bb
        xx.xx.xx.xx/22 dev qg-6e4ebe51-94
        xx.xx.xx.xx/32 dev qg-6e4ebe51-94
        xxxx::xxxx:xxxx:xxxx:xxxx/64 dev qg-6e4ebe51-94 scope link
        xxxx::xxxx:xxxx:xxxx:xxxx/64 dev qr-8deaf807-bb scope link
    }
    virtual_routes {
        0.0.0.0/0 via xx.xx.xx.xx dev qg-6e4ebe51-94
    }
}

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1648823

Title:
  l3 agent HA communication failure

Status in neutron:
  New

