Reviewed:  https://review.openstack.org/326729
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=acd04d668bd414cd21f2715adc6a35a0eaed59a3
Submitter: Jenkins
Branch:    master
commit acd04d668bd414cd21f2715adc6a35a0eaed59a3
Author: Swaminathan Vasudevan <[email protected]>
Date:   Tue Jun 7 13:31:56 2016 -0700

    DVR: Clean stale snat-ns by checking its existence when agent restarts

    At present there is no clear way to distinguish between the
    snat_namespace object being initialized and the actual namespace
    being created, and no way to check whether the namespace already
    exists. The code was only checking the snat_namespace object, not
    the existence of the namespace itself.

    This patch addresses the issue by adding an exists() method to the
    namespace object that reports whether the namespace is present on
    the given agent. This allows the agent to check for the namespace,
    identify a stale snat namespace, and delete it when the gateway has
    been cleared while the agent restarts. The same applies when a
    router is manually moved from one agent to another while the agent
    is dead: when the agent wakes up, it cleans up the stale snat
    namespace.

    Change-Id: Icb00297208813436c2a9e9a003275462293ad643
    Closes-Bug: #1557909

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1557909

Title:
  SNAT namespace is not getting cleared after the manual move of SNAT
  with dead agent

Status in neutron:
  Fix Released

Bug description:
  Latest patch (2016-06-10): https://review.openstack.org/#/c/326729/

  A stale snat namespace remains on the controller after recovery of a
  dead l3 agent.

  Note: Only on the stable/liberty branch.

  Setup: Multiple controllers in dvr_snat mode.

  Steps:
  1) Create a tenant network, subnet and router.
  2) Create an external network.
  3) Attach the internal and external networks to the router.
  4) Create a VM on the tenant network.
  5) Make sure the VM can reach the outside using CSNAT.
  6) Find the l3 agent hosting the router and stop that agent.
  7) Manually move the router to the other controller (dvr_snat mode).
     The SNAT namespace is created on the new controller node.
  8) Start the l3 agent on the first controller (the one stopped in
     step 6).
  9) Notice that the snat namespace now exists on both controllers and
     is not deleted from the agent that no longer hosts the router.

  Example:

  | cfa97c12-b975-4515-86c3-9710c9b88d76 | L3 agent | vm2-ctl2-936 | :-) | True | neutron-l3-agent |
  | df4ca7c5-9bae-4cfb-bc83-216612b2b378 | L3 agent | vm1-ctl1-936 | :-) | True | neutron-l3-agent |

  mysql> select * from csnat_l3_agent_bindings;
  +--------------------------------------+--------------------------------------+---------+------------------+
  | router_id                            | l3_agent_id                          | host_id | csnat_gw_port_id |
  +--------------------------------------+--------------------------------------+---------+------------------+
  | 0fb68420-9e69-41bb-8a88-8ab53b0faabb | cfa97c12-b975-4515-86c3-9710c9b88d76 | NULL    | NULL             |
  +--------------------------------------+--------------------------------------+---------+------------------+

  On vm1-ctl1-936 (initially hosting controller, stale SNAT namespace):

  ubuntu@vm1-ctl1-936:~/devstack$ sudo ip netns
  snat-0fb68420-9e69-41bb-8a88-8ab53b0faabb
  qrouter-0fb68420-9e69-41bb-8a88-8ab53b0faabb

  On vm2-ctl2-936 (2nd controller):

  ubuntu@vm2-ctl2-936:~$ ip netns
  snat-0fb68420-9e69-41bb-8a88-8ab53b0faabb
  qrouter-0fb68420-9e69-41bb-8a88-8ab53b0faabb

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1557909/+subscriptions
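The core of the fix described in the commit message is distinguishing "the snat_namespace object was initialized" from "the kernel namespace actually exists on this host". A minimal sketch of that idea follows; note this is illustrative, not the actual neutron code: the class and function names (`SnatNamespace`, `cleanup_stale_snat`) are hypothetical, and it models namespace existence as a handle file under `/var/run/netns` (where `ip netns add` creates them), whereas the real agent goes through its linux utils to run `ip netns`.

```python
import os

# Default location where `ip netns add` bind-mounts namespace handles.
NETNS_DIR = "/var/run/netns"


class SnatNamespace:
    """Per-router SNAT namespace (hypothetical, simplified model)."""

    PREFIX = "snat-"

    def __init__(self, router_id, netns_dir=NETNS_DIR):
        self.name = self.PREFIX + router_id
        self._dir = netns_dir

    def exists(self):
        # Having constructed this object says nothing about the kernel
        # namespace; check for the netns handle on disk instead. This
        # is the distinction the patch's exists() method introduces.
        return os.path.exists(os.path.join(self._dir, self.name))


def cleanup_stale_snat(router_id, gateway_configured, netns_dir=NETNS_DIR):
    """On agent restart: if the router no longer has a gateway on this
    host but the snat namespace survived (agent was dead when the
    router moved), delete the leftover namespace.

    Returns True if a stale namespace was removed.
    """
    ns = SnatNamespace(router_id, netns_dir)
    if not gateway_configured and ns.exists():
        # A real agent would tear down devices and run
        # `ip netns delete <name>`; here we just remove the handle.
        os.rmdir(os.path.join(netns_dir, ns.name))
        return True
    return False
```

With this shape, the restart path can call `exists()` unconditionally: on the controller that still hosts the router nothing happens, while on the controller from the bug report (vm1-ctl1-936) the leftover `snat-0fb68420-...` namespace would be detected and removed.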

