There is an online data migration:

https://review.openstack.org/#/c/377138/62/nova/objects/resource_provider.py@917

But it only runs when listing or showing resource providers. The
allocation candidates code must be fetching the providers, and relying
on root_provider_id, through SQLAlchemy (sqla) model objects rather than
the versioned objects that perform the online data migration.
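
To illustrate how that could surface as the KeyError in the traceback
below, here is a minimal sketch using an in-memory SQLite database. It is
not the nova code: the table is cut down to three columns, and the helper
names (build_demo_db, candidate_ids, build_summaries, reproduce) are
hypothetical. It assumes the candidate ids come from a query that ignores
root_provider_id, while the summaries come from a query that joins on it.

```python
import sqlite3


def build_demo_db():
    """Create a cut-down, hypothetical resource_providers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_providers ("
        "id INTEGER PRIMARY KEY, name TEXT, root_provider_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO resource_providers VALUES (?, ?, ?)",
        [
            (1, "compute-up", 1),        # migrated: root points at itself
            (19, "compute-down", None),  # never migrated: root is NULL
        ],
    )
    return conn


def candidate_ids(conn):
    # Stand-in for a capacity query that never looks at root_provider_id.
    rows = conn.execute("SELECT id FROM resource_providers ORDER BY id")
    return [row[0] for row in rows]


def build_summaries(conn):
    # Stand-in for a summary query. The inner join on root_provider_id
    # silently drops every row where that column is NULL.
    rows = conn.execute(
        "SELECT rp.id, root.id FROM resource_providers AS rp "
        "JOIN resource_providers AS root ON rp.root_provider_id = root.id"
    )
    return {rp_id: {"root_id": root_id} for rp_id, root_id in rows}


def reproduce():
    """Return the candidate ids that have no matching summary."""
    conn = build_demo_db()
    summaries = build_summaries(conn)
    missing = []
    for rp_id in candidate_ids(conn):
        try:
            summaries[rp_id]
        except KeyError:
            missing.append(rp_id)
    return missing
```

In this sketch, reproduce() reports provider 19 as the one id that is a
candidate but has no summary, which matches the shape of the failure
reported in the bug.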

This is where something like "placement-manage db
online_data_migrations" would be useful.
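
As a rough sketch of what such a batch migration might do, the function
below backfills NULL root_provider_id values in an in-memory SQLite
table. The function names and the cut-down schema are hypothetical, and
it assumes the un-migrated rows have no parent provider, so each one is
its own root. It returns a (found, done) pair, mirroring the convention
used by the online data migrations in nova-manage.

```python
import sqlite3


def backfill_root_provider_id(conn, max_count):
    """Hypothetical batch migration; returns (found, done)."""
    rows = conn.execute(
        "SELECT id FROM resource_providers "
        "WHERE root_provider_id IS NULL ORDER BY id LIMIT ?",
        (max_count,),
    ).fetchall()
    ids = [row[0] for row in rows]
    for rp_id in ids:
        # Assumption: a provider without a parent is its own root.
        conn.execute(
            "UPDATE resource_providers SET root_provider_id = id "
            "WHERE id = ?",
            (rp_id,),
        )
    conn.commit()
    return len(ids), len(ids)


def demo_backfill():
    """Run the migration in batches of one until nothing is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_providers ("
        "id INTEGER PRIMARY KEY, name TEXT, root_provider_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO resource_providers VALUES (?, ?, ?)",
        [
            (1, "compute-up", 1),
            (19, "compute-down", None),
            (20, "compute-down-2", None),
        ],
    )
    results = []
    while True:
        found, done = backfill_root_provider_id(conn, max_count=1)
        if found == 0:
            break
        results.append((found, done))
    remaining = conn.execute(
        "SELECT COUNT(*) FROM resource_providers "
        "WHERE root_provider_id IS NULL"
    ).fetchone()[0]
    return results, remaining
```

Running the migration in small batches until it reports nothing found
keeps each transaction short, which is the usual reason for an explicit
max_count.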

** Changed in: nova
       Status: New => Triaged

** Changed in: nova
   Importance: Undecided => Medium

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** No longer affects: nova/queens

** Changed in: nova/rocky
       Status: New => Triaged

** Changed in: nova/rocky
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1799892

Title:
  Placement API crashes with 500s in Rocky upgrade with downed compute
  nodes

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) rocky series:
  Triaged

Bug description:
  I ran into this while upgrading another environment to Rocky and
  worked around it by deleting the problematic resource provider, but I
  just hit it again while upgrading a different environment, so something
  is wrong.  Here's the traceback:

  =============
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap [req-8ad1c999-7646-4b0a-91c0-cd26a3581766 b61d42657d364008bfdc6fa715e67daf a894e8109af3430aa7ae03e0c49a0aa0 - default default] Placement API unexpected error: 19: KeyError: 19
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap Traceback (most recent call last):
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/fault_wrap.py", line 40, in __call__
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.application(environ, start_response)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     resp = self.call_func(req, *args, **kw)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.func(req, *args, **kwargs)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/microversion_parse/middleware.py", line 80, in __call__
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     response = req.get_response(self.application)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/request.py", line 1313, in send
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     application, catch_exc_info=False)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/request.py", line 1277, in call_application
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     app_iter = application(self.environ, start_response)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handler.py", line 209, in __call__
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return dispatch(environ, start_response, self._map)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handler.py", line 146, in dispatch
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return handler(environ, start_response)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     resp = self.call_func(req, *args, **kw)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py", line 29, in call_func
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     super(PlacementWsgify, self).call_func(req, *args, **kwargs)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return self.func(req, *args, **kwargs)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/microversion.py", line 164, in decorated_func
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return _find_method(f, version, status_code)(req, *args, **kwargs)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/util.py", line 81, in decorated_function
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return f(req)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/handlers/allocation_candidate.py", line 316, in list_allocation_candidates
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, requests, limit=limit, group_policy=group_policy)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 3965, in get_by_requests
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, requests, limit=limit, group_policy=group_policy)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 993, in wrapper
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return fn(*args, **kwargs)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 4071, in _get_by_requests
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     context, request, sharing, has_trees)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 4045, in _get_by_one_request
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     return _alloc_candidates_single_provider(context, resources, rp_ids)
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/objects/resource_provider.py", line 3490, in _alloc_candidates_single_provider
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap     rp_summary = summaries[rp_id]
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap KeyError: 19
  2018-10-25 09:18:29.853 7431 ERROR nova.api.openstack.placement.fault_wrap
  =============

  The resource provider (nova-compute) with ID 19 was down during the
  upgrade (it had been taken down a long time ago).  The only oddity I
  found was in the database: `root_provider_id` was set to NULL for that
  record.  After I deleted the resource provider, the placement API
  stopped returning 500s when it tried to schedule new VMs.

  In the other environment that hit this problem, the offending provider
  was also a downed compute node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1799892/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
