The code referred to is no longer present in the master branch. Closing
this bug.

** Changed in: neutron
       Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1687913

Title:
  db retry not triggered when fail happened in after_create notify

Status in neutron:
  Won't Fix

Bug description:
  Note:
  - The specific use case can no longer happen on master (due to a couple of
    commits), so the description below applies to releases before Ocata.
  - The bug was seen on a Newton setup.

  During high-concurrency testing (with router:external networks) the
  following deadlock may occur:
  http://paste.openstack.org/show/608690/

  Deadlocks are normally 'okay', because the db retry mechanism will
  retry the request. But in this specific case it did not.

  The issue happens here:
  https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/plugin.py#L769

  - It is inside a transaction.
  - The external_net_db code does a notify with AFTER_CREATE.
  - During the AFTER_CREATE event processing, the deadlock happens.

  The problem is that exceptions raised while processing an AFTER_CREATE
  event are not propagated; they are only logged. But the callback IS
  running inside the transaction, and the failure did make the session
  invalid.

  So the code continues and tries to commit the invalid session. The
  resulting exception is:

  sqlalchemy.exc.InvalidRequestError  - This Session's transaction has
  been rolled back due to a previous exception during flush. To begin a
  new transaction with this Session, first issue Session.rollback().
  Original exception was: ...

  Since this exception type is not part of the db_retry exceptions, no
  retry happens and the request fails.
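
  The sequence above can be sketched without neutron or SQLAlchemy. This
  is only a toy model: ToySession, retry_on_deadlock, notify_after_create
  and both exception classes are hypothetical stand-ins, not the real
  oslo.db or neutron APIs. It just shows why the retry wrapper only ever
  sees the second, non-retriable error.

```python
import functools


class DBDeadlock(Exception):
    """Stand-in for a retriable database error (hypothetical)."""


class InvalidRequestError(Exception):
    """Stand-in for the non-retriable 'session rolled back' error."""


class ToySession:
    """Tiny model of a session that is poisoned by a failed flush."""

    def __init__(self):
        self.is_active = True

    def flush(self, fail=False):
        if fail:
            # A failed flush leaves the transaction in a rolled-back state.
            self.is_active = False
            raise DBDeadlock("deadlock during flush")

    def commit(self):
        if not self.is_active:
            raise InvalidRequestError(
                "This Session's transaction has been rolled back")


def retry_on_deadlock(max_retries=3):
    """Retry the wrapped call, but only when DBDeadlock is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise
        return wrapper
    return decorator


def notify_after_create(callback, session):
    """Model of an AFTER_CREATE notify: callback errors are only logged."""
    try:
        callback(session)
    except Exception as exc:
        print("callback failed (logged only): %r" % exc)


attempts = []


@retry_on_deadlock()
def create_network():
    attempts.append(1)
    session = ToySession()
    # The deadlock happens inside the callback and is swallowed here.
    notify_after_create(lambda s: s.flush(fail=True), session)
    # The commit then fails with an error the retry wrapper ignores.
    session.commit()


try:
    create_network()
except InvalidRequestError as exc:
    print("failed after %d attempt(s): %s" % (len(attempts), exc))
```

  In this model the request is attempted exactly once, even though the
  original failure was a deadlock that the wrapper would have retried.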

  
  This use case is a very specific one, but some action may be needed to
  avoid the same thing happening in other places: any database error that
  occurs inside an event notification that is not BEFORE_x or PRECOMMIT
  will behave this way. It corrupts the session object, nothing is raised,
  and the error that follows is not retriable.
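
  One possible kind of action, purely as an illustration: after the
  callbacks have run, check whether the session is still usable and, if
  not, raise an error type that the retry mechanism does handle. The
  names below (RetriableDBError, ToySession, notify_and_check) are
  hypothetical and this is not how neutron addressed the problem; the
  is_active flag only mimics the idea of the attribute of the same name
  on SQLAlchemy sessions.

```python
class RetriableDBError(Exception):
    """Hypothetical error type that a retry wrapper would handle."""


class ToySession:
    """Minimal stand-in for a database session."""

    def __init__(self):
        self.is_active = True


def notify_and_check(callbacks, session):
    """Run callbacks, logging-style, then verify the session is usable.

    Callback errors are collected instead of raised, as in the AFTER_x
    case. If one of them left the session invalid, a retriable error is
    raised so the whole request can be retried from the start.
    """
    errors = []
    for callback in callbacks:
        try:
            callback(session)
        except Exception as exc:
            errors.append(exc)
    if errors and not session.is_active:
        raise RetriableDBError(
            "session invalidated by callback") from errors[0]
    return errors


def poisoning_callback(session):
    # Models a database error that rolls the transaction back.
    session.is_active = False
    raise RuntimeError("deadlock during flush")


def harmless_failure(session):
    # Models a callback error that does not touch the session.
    raise ValueError("unrelated callback problem")
```

  With this guard a harmless callback failure is still only collected,
  while a failure that invalidated the session surfaces as an error the
  retry wrapper can act on.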


  (To easily reproduce on a test setup, add

      if event == events.AFTER_CREATE:
          try:
              # The 256-character name is presumably too long for the
              # column, so the flush fails.
              context.session.add(models_v2.Network(name=256*'g'))
              context.session.flush()  # this makes the session invalid
          except Exception:
              # Simulate the deadlock that would normally be retried.
              raise db_exc.DBDeadlock()

  to _ensure_external_network_default_value_callback in
  neutron/services/auto_allocate/db.py and create a router:external
  network.

  At first sight this should trigger the retry mechanism, but it does not.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1687913/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
