Public bug reported:
Since upgrading from Rocky to Stein we are experiencing problems live
migrating VMs with trunk ports and creating new trunk ports. The live
migration of the VM itself eventually completes, but the trunk ports
remain in status "BUILD" or "DOWN". The corresponding subports
and/or the parent port are mostly in status "DOWN" too. It looks like
not all of the required ports get moved from hypervisor host A to
host B. With the ports in these states, the VM is not reachable over
the network at all.
Most of the time, when the migration is about to finish, we see timeout
messages like these in the neutron-openvswitch-agent log:
2019-10-14 12:28:56.559 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method
trunk.update_subport_bindings. Waiting for 114 seconds before next attempt. If
the server is not down, consider increasing the rpc_response_timeout option as
Neutron server(s) may be overloaded and unable to respond quickly enough.:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:28:56.560 20071 WARNING neutron_lib.rpc [-] Increasing timeout
for trunk.update_subport_bindings calls to 240 seconds. Restart the agent to
restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout:
Timed out waiting for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:28:56.562 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method
trunk.update_subport_bindings. Waiting for 56 seconds before next attempt. If
the server is not down, consider increasing the rpc_response_timeout option as
Neutron server(s) may be overloaded and unable to respond quickly enough.:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID c1e5f0b50f044c8ea1f40f3e2e959fc0
2019-10-14 12:29:53.021 20071 ERROR
neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Got
messaging error while processing trunk bridge tbr-e4685a7d-2: Timed out waiting
for a reply to message ID c1e5f0b50f044c8ea1f40f3e2e959fc0:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID c1e5f0b50f044c8ea1f40f3e2e959fc0
2019-10-14 12:30:24.896 20071 ERROR neutron_lib.rpc
[req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Timeout in RPC method
trunk.update_trunk_status. Waiting for 75 seconds before next attempt. If the
server is not down, consider increasing the rpc_response_timeout option as
Neutron server(s) may be overloaded and unable to respond quickly enough.:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:30:24.896 20071 WARNING neutron_lib.rpc
[req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Increasing timeout for
trunk.update_trunk_status calls to 240 seconds. Restart the agent to restore it
to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out
waiting for a reply to message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:30:50.133 20071 ERROR
neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Got
messaging error while processing trunk bridge tbr-b56178af-8: Timed out waiting
for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:31:39.851 20071 ERROR
neutron.services.trunk.drivers.openvswitch.agent.driver
[req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Error on event deleted for subports
[SubPort(port_id=c048169f-a005-44a3-88e3-03a34d778bb5,segmentation_id=843,segmentation_type='vlan',trunk_id=b56178af-8d6f-4660-ac3b-cc469c3de4ce)]:
Timed out waiting for a reply to message ID 8093a1090d47426380434e65559875e5:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:35:26.906 20071 ERROR neutron_lib.rpc
[req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Timeout in RPC method
trunk.update_subport_bindings. Waiting for 53 seconds before next attempt. If
the server is not down, consider increasing the rpc_response_timeout option as
Neutron server(s) may be overloaded and unable to respond quickly enough.:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID b16fcf72d3284439a06f8383a6d04566
2019-10-14 12:35:26.907 20071 WARNING neutron_lib.rpc
[req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Increasing timeout for
trunk.update_subport_bindings calls to 480 seconds. Restart the agent to
restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout:
Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a6d04566
2019-10-14 12:36:20.366 20071 ERROR
neutron.services.trunk.drivers.openvswitch.agent.driver
[req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9
1e205eb2989a4beb9ef5947abff00b35 - - -] Error on event created for subports
[SubPort(port_id=c048169f-a005-44a3-88e3-03a34d778bb5,segmentation_id=843,segmentation_type='vlan',trunk_id=b56178af-8d6f-4660-ac3b-cc469c3de4ce)]:
Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a6d04566:
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to
message ID b16fcf72d3284439a06f8383a
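For triage, it may help to count how often each trunk RPC method times out and how far the agent has raised its timeout. The following is only a rough sketch, not part of Neutron: the function name summarize_rpc_timeouts and the SAMPLE text (shortened excerpts of the entries above) are made up for illustration, and the regular expressions assume the message wording shown in this log.

```python
import re
from collections import defaultdict

# Patterns assume the wording seen in the neutron_lib.rpc messages above.
TIMEOUT_RE = re.compile(
    r"Timeout in RPC method ([\w.]+?)\. Waiting for (\d+) seconds")
INCREASE_RE = re.compile(
    r"Increasing timeout for ([\w.]+) calls to (\d+) seconds")

# Shortened excerpts of the log entries quoted in this report.
SAMPLE = """
2019-10-14 12:28:56.559 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method
trunk.update_subport_bindings. Waiting for 114 seconds before next attempt.
2019-10-14 12:28:56.560 20071 WARNING neutron_lib.rpc [-] Increasing timeout
for trunk.update_subport_bindings calls to 240 seconds.
2019-10-14 12:30:24.896 20071 ERROR neutron_lib.rpc Timeout in RPC method
trunk.update_trunk_status. Waiting for 75 seconds before next attempt.
"""


def summarize_rpc_timeouts(log_text):
    """Return {rpc_method: {"timeouts": n, "max_timeout": seconds or None}}."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    summary = defaultdict(lambda: {"timeouts": 0, "max_timeout": None})
    for method, _wait in TIMEOUT_RE.findall(flat):
        summary[method]["timeouts"] += 1
    for method, seconds in INCREASE_RE.findall(flat):
        current = summary[method]["max_timeout"] or 0
        summary[method]["max_timeout"] = max(current, int(seconds))
    return dict(summary)


if __name__ == "__main__":
    for method, stats in sorted(summarize_rpc_timeouts(SAMPLE).items()):
        print(method, stats)
```

Running it against the full agent log instead of SAMPLE should show whether the timeouts are limited to the trunk.* methods or affect other RPC calls as well.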
The OpenStack/Neutron setup we have here is the following:
- 3 Controller Nodes behind HAProxy
- Ubuntu 18.04 Installation with Ubuntu Cloud Archive Repositories (Stein)
(Python 3)
- Neutron ML2 Plugin with OVS Setup
- Provider Networks
- Package Version neutron-common: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-plugin-ml2: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-server: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-openvswitch-agent: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-dhcp-agent: 2:14.0.2-0ubuntu1~cloud0
- Package Version openvswitch-common: 2.11.0-0ubuntu2~cloud0
- Package Version openvswitch-switch: 2.11.0-0ubuntu2~cloud0
The port/trunk setup is as follows:
- trunk port belonging to project p1
- parent port belonging to project p1, subnet s1
- subnet s1 belongs to project p1, network n1
- network n1 belongs to project admin and has provider:segmentation_id = 700
- subport belonging to project p1, subnet s2
- subnet s2 belongs to project p1, network n2
- network n2 belongs to project p1, and has provider:segmentation_id = 843
** Affects: neutron
Importance: Undecided
Status: New
** Summary changed:
- trunk + subports not working after live migration
+ trunk + subports not working
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1848311
Title:
trunk + subports not working
Status in neutron:
New