Yeah, I'll hand this over to the Neutron team and ask them to look at this bug.
If they say it's a Nova bug, please set the nova status back to "New".

Thanks.

** Also affects: neutron
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1951720

Title:
  Virtual interface creation failed

Status in neutron:
  New
Status in OpenStack Compute (nova):
  Opinion

Bug description:
  Hi,

  I have a testing stack of OpenStack Wallaby (deployed with kolla-ansible,
  using kolla source images) and found what is probably a weird bug in
  nova/neutron.

  I have a testing heat template which starts about 6 instances with a
  bunch of network interfaces and a security group - nothing special. The
  testing OpenStack env is clean, freshly installed, and tempest passes.
  So what is going on?
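
  For illustration, a minimal sketch of such a template (one server shown;
  the image, flavor, network and resource names are placeholders, not the
  actual template from my stack):

    heat_template_version: wallaby

    resources:
      secgroup:
        type: OS::Neutron::SecurityGroup
        properties:
          rules:
            # placeholder rule; the real template has a handful of these
            - protocol: tcp
              port_range_min: 22
              port_range_max: 22

      port_0:
        type: OS::Neutron::Port
        properties:
          network: private                  # placeholder network
          security_groups:
            - { get_resource: secgroup }

      server_0:
        type: OS::Nova::Server
        properties:
          image: cirros                     # placeholder image
          flavor: m1.tiny                   # placeholder flavor
          networks:
            - port: { get_resource: port_0 }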

  1. Sometimes the heat stack is created successfully, without errors.
  2. Sometimes the heat stack is created successfully - BUT with errors in
     nova-compute, so the retry mechanism works.

  Errors in nova-compute:

  7d241eaae5bb137aedc6fcc] [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] Took 0.10 seconds to destroy the instance on the hypervisor.
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [req-5b21ebad-ffb7-46b0-8c37-fd665d01013e 64fe2842ff8c6302c0450bee25600a10e54f2b9793e9c8776f956c993a7a7ee8 0960461696f64f82ba108f8397bf508c - e01e19b257d241eaae5bb137aedc6fcc e01e19b257d241eaae5bb137aedc6fcc] [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] Failed to allocate network(s): nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] Traceback (most recent call last):
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 6930, in _create_guest_with_network
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     guest = self._create_guest(
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3.9/contextlib.py", line 124, in __exit__
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     next(self.gen)
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line 479, in wait_for_instance_event
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     actual_event = event.wait()
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/eventlet/event.py", line 125, in wait
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     result = hub.switch()
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/eventlet/hubs/hub.py", line 313, in switch
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     return self.greenlet.switch()
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] eventlet.timeout.Timeout: 300 seconds
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] During handling of the above exception, another exception occurred:
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] Traceback (most recent call last):
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line 2366, in _build_and_run_instance
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     self.driver.spawn(context, instance, image_meta,
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 3885, in spawn
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     self._create_guest_with_network(
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 6956, in _create_guest_with_network
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]     raise exception.VirtualInterfaceCreateException()
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e] nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
  2021-11-03 14:56:42.453 7 ERROR nova.compute.manager [instance: 69b8dd7e-a9db-44fb-ab41-832942cb9e7e]


  So, it looks like nova didn't receive the message from neutron that the
  vif was plugged within 300 seconds.

  Then I patched the nova code with this review so that I could see the
  timing of this event:

  https://review.opendev.org/c/openstack/nova/+/817030
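
  For context, what nova is doing here (a minimal sketch in plain eventlet,
  not nova's actual code - the helper name and prints are mine): the compute
  manager parks the boot on an Event and gives up after
  CONF.vif_plugging_timeout (300 seconds by default) if neutron never sends
  the network-vif-plugged event.

    import time

    import eventlet
    from eventlet import event

    def wait_for_vif_plugged(ev, deadline=300):
        # Block until the event fires; eventlet.Timeout raises inside this
        # greenthread after `deadline` seconds, which is exactly the
        # "eventlet.timeout.Timeout: 300 seconds" in the traceback above.
        start = time.time()
        with eventlet.Timeout(deadline):
            result = ev.wait()  # parks the greenthread via hub.switch()
        # Roughly what the review above adds: log how long we waited.
        print('VIF plugging completed in %d seconds' % (time.time() - start))
        return result

    ev = event.Event()
    # Simulate neutron delivering the event 8 seconds into the boot.
    eventlet.spawn_after(8, ev.send, 'network-vif-plugged')
    wait_for_vif_plugged(ev)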


  
  Logs here:

  2021-11-09 14:07:07.251 9 DEBUG nova.compute.manager [req-ff47242a-ec1e-4f3e-8989-dc17b9b66702 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 280 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490
  2021-11-09 14:08:49.726 9 DEBUG nova.compute.manager [req-6f3db32a-0c79-4d70-b61f-018e5bb6047e 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 8 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490
  2021-11-09 14:08:53.898 9 DEBUG nova.compute.manager [req-10eb5d9e-e534-4f41-b7b8-019455cd7df2 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 10 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490
  2021-11-09 14:09:29.747 9 DEBUG nova.compute.manager [req-6f6b3769-46a9-4188-8db3-fdc7a1f33c5d 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 46 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490
  2021-11-09 14:09:31.846 9 DEBUG nova.compute.manager [req-dd2d6dd5-5197-4c21-8f7f-48c58b2aa397 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 43 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490
  2021-11-09 14:09:36.003 9 DEBUG nova.compute.manager [req-2e836d62-d368-45a9-bcbd-6162b126d556 612fb894e80a454881a7ffd6ff3565e7 3ea2f30b9a7445ccb6151d27d65e9c8b - default default] VIF plugging completed in 48 seconds wait_for_instance_event /usr/lib/python3/dist-packages/nova/compute/manager.py:490



  
  Once again, this OpenStack is clean and freshly installed, with no
  instances running, so why does VIF plugging take anywhere from 8 to 280
  seconds (and potentially more)?

  It looks like there is some race condition, but I don't know where to
  find it.

  When I bump the timeout for vif plugging in nova-compute as below,
  everything works well, but it's only a workaround.

  vif_plugging_timeout = 3600
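
  For reference, that option lives in the [DEFAULT] section of nova.conf on
  the compute nodes; vif_plugging_is_fatal is the companion option (shown
  commented, just noting it exists):

    [DEFAULT]
    # Wait up to an hour instead of the default 300 seconds for neutron's
    # network-vif-plugged event.
    vif_plugging_timeout = 3600
    # Set to false to continue the boot even if the event never arrives.
    #vif_plugging_is_fatal = true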

  The last thing I see is that it has something to do with the security
  group: if the noop driver is used for the security group, I can't
  reproduce the issue, so it's "fixed".

  
  The testing environment uses OVS (not OVN), with the openvswitch
  firewall driver.

  Thank you very much,
  Michal Arbet (kekvo)


