Reviewed:  https://review.openstack.org/366493
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2b148c3f9299642e0bb068983de68ec6441a23be
Submitter: Jenkins
Branch:    master

commit 2b148c3f9299642e0bb068983de68ec6441a23be
Author: He Qing <[email protected]>
Date:   Wed Sep 7 05:07:25 2016 +0000

    Fix wrong HA router state

    When we add or remove a router interface on an HA router, the l3
    agent sends a SIGHUP signal to keepalived so that it reloads its
    configuration. For a DVR+HA router, however, the l3 agent sends
    SIGHUP TWICE, which terminates the VRRP sub-process and leaves its
    VIP addresses and routes behind. Keepalived then restarts the VRRP
    process and there is a re-election between the VRRP peers.

    After the election, if the former master is still master, the state
    shown by Neutron is correct. But if the former master transitioned
    to backup, the new VRRP process will NOT delete the VIPs and routes,
    because it is not the process that configured them. Neutron then
    shows two active agents.

    HaRouter.enable_keepalived() sends the SIGHUP signal to keepalived.
    DvrEdgeHaRouter.process() should not call enable_keepalived()
    itself, because it already inherits that call from class HaRouter.

    Closes-Bug: 1602320
    Change-Id: I647269665a22b4becb3e326e1f4b03ddd961d6b1

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1602320

Title:
  ha + distributed router: keepalived process kill vrrp child process

Status in neutron:
  Fix Released

Bug description:
  Code Repo: mitaka
  keepalived version: 1.2.13
  node mode: 4 nodes (containers), dvr_snat (l3 agent_mode)
  OS: Centos 7

  I set both router_distributed and l3_ha to True. Then I create a
  router; the neutron l3-agent-list-hosting-router command shows
  1 active and 3 standby. Then I add a router interface, and there is
  more than 1 active.

  I traced /var/log/messages on the original active l3 agent node:

  2016-07-12T16:33:32.083140+08:00 localhost Keepalived[1320437]: VRRP child process(1320438) died: Respawning
  2016-07-12T16:33:32.083613+08:00 localhost Keepalived[1320437]: Starting VRRP child process, pid=1340135

  Strace info: http://paste.openstack.org/show/530791/

  This does not always fail; sometimes there is only 1 active. Maybe
  it is related to the environment, because I can't reproduce it in
  VMs.
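
For readers who want to see the double-SIGHUP pattern from the commit message in isolation, here is a minimal sketch. It is not the Neutron code: the classes below are simplified stand-ins that borrow only the names HaRouter, enable_keepalived() and process() from the commit message. The single-inheritance layout, the sighup_count counter, the Buggy/Fixed class names and the count_reloads() helper are assumptions made for illustration, and no real signal is sent.

```python
class HaRouter:
    """Simplified stand-in for the HA router class (assumption, not Neutron code)."""

    def __init__(self):
        # Hypothetical counter standing in for SIGHUPs delivered to keepalived.
        self.sighup_count = 0

    def enable_keepalived(self):
        # In the real agent this would make keepalived reload its configuration.
        self.sighup_count += 1

    def process(self):
        # The HA base class already reloads keepalived while processing.
        self.enable_keepalived()


class BuggyDvrEdgeHaRouter(HaRouter):
    """Mimics the behaviour described in the bug: a redundant second reload."""

    def process(self):
        super().process()          # first reload, inherited from HaRouter
        self.enable_keepalived()   # second reload, the redundant call


class FixedDvrEdgeHaRouter(HaRouter):
    """Mimics the fix: rely on the inherited call only."""

    def process(self):
        super().process()          # single reload


def count_reloads(router_cls):
    """Process one router update and report how many reloads were triggered."""
    router = router_cls()
    router.process()
    return router.sighup_count


if __name__ == "__main__":
    print("buggy:", count_reloads(BuggyDvrEdgeHaRouter))
    print("fixed:", count_reloads(FixedDvrEdgeHaRouter))
```

In this sketch the buggy subclass triggers two reloads per update and the fixed one triggers a single reload, which matches the change the commit message describes: dropping the explicit enable_keepalived() call from the subclass.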

