Public bug reported:

Release: Queens, ovsdb_interface=native, of_request_timeout = 30

As the number of OVS ports on a node grows, the following errors start
to occur (starting at ~1200 ports):

ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-db47426c-1719-43dd-8ecf-4fb4bdcbc316 - - - - -] ofctl request version=None,msg_type=None,msg_len=None,xid=None,OFPFlowMod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=725), OFPActionOutput(len=16,max_len=0,port=1793,type=0), OFPActionOutput(len=16,max_len=0,port=2,type=0)],type=4)],match=OFPMatch(oxm_fields={'vlan_vid': 4175}),out_group=0,out_port=0,priority=1,table_id=22) error Datapath Invalid 64183592930369: InvalidDatapath: Datapath Invalid

or

ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-632b8ede-1234-4682-afe0-3aefb615b121 - - - - -] ofctl request version=0x4,msg_type=0xe,msg_len=0x78,xid=0x73c67c07,OFPFlowMod(buffer_id=4294967295,command=0,cookie=5881109557449606263L,cookie_mask=0,flags=0,hard_timeout=0,idle_timeout=0,instructions=[OFPInstructionActions(actions=[OFPActionPopVlan(len=8,type=18), OFPActionSetField(tunnel_id=666), OFPActionOutput(len=16,max_len=0,port=2,type=0)],len=48,type=4)],match=OFPMatch(oxm_fields={'eth_dst': 'fa:16:3e:4a:79:ce', 'vlan_vid': 6107}),out_group=0,out_port=0,priority=2,table_id=20) timed out: Timeout: 30 seconds

with corresponding errors in the ovs-vswitchd logs:

|rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
|rconn|ERR|br-floating<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 5 seconds, disconnecting
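
For anyone trying to confirm the same symptom on their own node, here is a
minimal Python sketch that counts these rconn disconnects per bridge. It
assumes the log lines have exactly the shape quoted above; the function
name and the regular expression are illustrative only and are not part of
neutron or OVS.

```python
import re

# Pattern derived from the rconn messages quoted in this report.
# Other OVS versions may word the message differently (assumption).
PROBE_RE = re.compile(
    r"\|rconn\|ERR\|(?P<bridge>[^<|]+)<->(?P<target>\S+): "
    r"no response to inactivity probe after (?P<secs>\d+(?:\.\d+)?) seconds"
)


def probe_disconnects(lines):
    """Return {bridge_name: number_of_probe_disconnects} for the given lines."""
    counts = {}
    for line in lines:
        match = PROBE_RE.search(line)
        if match:
            bridge = match.group("bridge")
            counts[bridge] = counts.get(bridge, 0) + 1
    return counts
```

Feeding it the lines of the ovs-vswitchd log shows which bridges are
dropping the controller connection and how often.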


Setting inactivity_probe to a greater value helps:

#ovs-vsctl set controller br-int inactivity_probe=30000
#ovs-vsctl set controller br-tun inactivity_probe=30000
#ovs-vsctl set controller br-floating inactivity_probe=30000

Should neutron allow setting inactivity_probe for controllers?
Should it correspond to the of_request_timeout value?
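
To illustrate the second question, here is a rough Python sketch of how the
probe interval could be derived from of_request_timeout. This is only an
assumption about one possible mapping (seconds converted to the
milliseconds that inactivity_probe expects); the helper name is
hypothetical and nothing here reflects what neutron actually does.

```python
def probe_commands(bridges, of_request_timeout_s):
    """Build ovs-vsctl argument lists that set inactivity_probe per bridge.

    Assumption: the probe interval simply mirrors of_request_timeout,
    converted from seconds to milliseconds.
    """
    probe_ms = int(of_request_timeout_s * 1000)
    return [
        ["ovs-vsctl", "set", "controller", bridge,
         "inactivity_probe=%d" % probe_ms]
        for bridge in bridges
    ]


# Dry run: print the commands instead of executing them.
for cmd in probe_commands(["br-int", "br-tun", "br-floating"], 30):
    print(" ".join(cmd))
```

With of_request_timeout = 30 this prints the same three commands used in
the workaround above. Each list could be passed to subprocess.run() as
root to apply it.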

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1821753

Title:
  openvswitch agent ofctl request errors: 'timed out' and 'Datapath
  Invalid'

Status in neutron:
  New

