[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1398988

Title:
  dhcp-agent / network bindings out of sync, stopping dhcp-agent

Status in neutron:
  Expired

Bug description:
  dhcp-agent on a neutron host started dying with

  ---
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent Traceback (most 
recent call last):
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/amqp.py", line 
462, in _process_data
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     **args)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/common/rpc.py", line 45, in dispatch
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     
neutron_ctxt, version, method, namespace, **kwargs)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/dispatcher.py", 
line 172, in dispatch
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     result = 
getattr(proxyobj, method)(ctxt, **kwargs)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/db/dhcp_rpc_base.py", line 92, in 
get_active_networks_info
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     networks = 
self._get_active_networks(context, **kwargs)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/db/dhcp_rpc_base.py", line 42, in 
_get_active_networks
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     
plugin.auto_schedule_networks(context, host)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/db/agentschedulers_db.py", line 222, 
in auto_schedule_networks
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     
self.network_scheduler.auto_schedule_networks(self, context, host)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/scheduler/dhcp_agent_scheduler.py", 
line 122, in auto_schedule_networks
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     context, 
[net_id], active=True)
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/db/agentschedulers_db.py", line 126, 
in get_dhcp_agents_hosting_networks
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     
binding.dhcp_agent)]
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent   File 
"/usr/lib/python2.7/site-packages/neutron/db/agentschedulers_db.py", line 83, 
in is_eligible_agent
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent     
agent['heartbeat_timestamp'])
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent TypeError: 
'NoneType' object has no attribute '__getitem__'
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  2014-12-02 20:52:20.739 17097 TRACE neutron.agent.dhcp_agent 
  ---

  further investigation on the neutron server found that
  networkdhcpbindings seems to have got out of sync

  we can see just one dhcp-agent

  ---
  MariaDB [neutron]> select id from agents where agent_type="DHCP agent";
  +--------------------------------------+
  | id                                   |
  +--------------------------------------+
  | 6923675d-5616-4ffe-b2c4-4d130f67973f |
  +--------------------------------------+
  1 row in set (0.00 sec)
  ---

  but in the network bindings, at least 3 are listed

  ----
  MariaDB [neutron]> select DISTINCT(dhcp_agent_id) from 
networkdhcpagentbindings;
  +--------------------------------------+
  | dhcp_agent_id                        |
  +--------------------------------------+
  | 6923675d-5616-4ffe-b2c4-4d130f67973f |
  | b23f9f97-da04-4f61-bcfb-f8514e43cefd |
  | d3e3ac5b-9962-428a-a9f8-6b2a1aba48d8 |
  +--------------------------------------+
  3 rows in set (0.00 sec)
  ---

  neutron/db/agentschedulers_db.py:get_dhcp_agents_hosting_networks()
  [1] doesn't expect this case of a binding with no associated agent, so
  it goes off an tries to find the latest heartbeat for the agent, which
  leads to the original traceback.

  I unfortunately have no idea how this got out of sync; the host has
  been upgraded from juno->kilo and neutron and dhcp-agents have been
  restarted many, many times for various reasons.

  I found two other references [2],[3]

  [1] 
http://git.openstack.org/cgit/openstack/neutron/tree/neutron/db/agentschedulers_db.py#n107
  [2] http://www.gossamer-threads.com/lists/openstack/dev/42060
  [3] http://lists.openstack.org/pipermail/openstack/2013-October/002195.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1398988/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

Reply via email to