Reviewed:  https://review.openstack.org/592654
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2d7ed309ec4e656ce9d6f21f03ea158278f2526d
Submitter: Zuul
Branch:    master
commit 2d7ed309ec4e656ce9d6f21f03ea158278f2526d
Author: Jay Pipes <[email protected]>
Date:   Thu Aug 16 14:56:47 2018 -0400

    placement: use single-shot INSERT/DELETE agg

    When replacing a provider's set of aggregate associations, we were
    issuing a call to:

        DELETE FROM resource_provider_aggregates
        WHERE resource_provider_id = $rp

    and then a single call to:

        INSERT INTO resource_provider_aggregates
        SELECT $rp, aggs.id
        FROM provider_aggregates AS aggs
        WHERE aggs.uuid IN ($agg_uuids)

    This patch changes the _set_aggregates() function in a few ways.

    First, we grab the aggregate's internal ID value when creating new
    aggregate records (or grabbing a provider's existing aggregate
    associations). This eliminates the need for any join to
    provider_aggregates in an INSERT/DELETE statement.

    Second, instead of a multi-row INSERT .. SELECT statement, we do
    single-shot INSERT ... VALUES statements, one for each added
    aggregate.

    Third, we no longer DELETE all aggregate associations for the
    provider in question. Instead, we issue single-shot DELETE
    statements for only the aggregates that are being disassociated.

    Finally, I've added a number of log debug statements so that we can
    have a little more information if this particular patch does not
    fix the deadlock issue described in the associated bug.

    Change-Id: I87e765305017eae1424005f7d6f419f42a2f8370
    Closes-bug: #1786703

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1786703

Title:
  Placement duplicate aggregate uuid handling during concurrent
  aggregate create insufficiently robust

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  NOTE: This may be just a postgresql problem, not sure.
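The single-shot approach the commit describes can be sketched roughly as below. This is a hypothetical illustration using the stdlib sqlite3 module and a simplified schema, not nova's actual SQLAlchemy code; the function name `set_aggregates` and column names are stand-ins for the real `_set_aggregates()` in resource_provider.py:

```python
# Hypothetical sketch: compute the set difference between existing and
# desired aggregate associations, then issue one INSERT or DELETE per
# changed row instead of a blanket DELETE plus INSERT ... SELECT.
import sqlite3

def set_aggregates(conn, rp_id, new_agg_ids):
    """Replace a provider's aggregate associations with single-shot statements."""
    cur = conn.execute(
        "SELECT aggregate_id FROM resource_provider_aggregates "
        "WHERE resource_provider_id = ?", (rp_id,))
    existing = {row[0] for row in cur}
    to_add = set(new_agg_ids) - existing
    to_remove = existing - set(new_agg_ids)
    for agg_id in to_add:
        # One INSERT ... VALUES per newly associated aggregate.
        conn.execute(
            "INSERT INTO resource_provider_aggregates "
            "(resource_provider_id, aggregate_id) VALUES (?, ?)",
            (rp_id, agg_id))
    for agg_id in to_remove:
        # One DELETE per disassociated aggregate; untouched rows are left alone.
        conn.execute(
            "DELETE FROM resource_provider_aggregates "
            "WHERE resource_provider_id = ? AND aggregate_id = ?",
            (rp_id, agg_id))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource_provider_aggregates "
             "(resource_provider_id INTEGER, aggregate_id INTEGER)")
conn.executemany("INSERT INTO resource_provider_aggregates VALUES (?, ?)",
                 [(1, 10), (1, 11)])
set_aggregates(conn, 1, [11, 12])
rows = sorted(conn.execute(
    "SELECT aggregate_id FROM resource_provider_aggregates").fetchall())
print(rows)  # [(11,), (12,)]
```

Because rows that stay associated are never deleted and re-inserted, each statement touches only the rows actually changing, which narrows the lock footprint that contributed to the deadlock in the bug.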
  While doing some further load-testing experiments against placement, my
  resource provider create script, which uses asyncio, was able to cause
  several 500 errors from the placement service of the following form:

  ```
  cdent-a01:~/src/placeload(master) $ docker logs zen_murdock | grep 'req-d4dcbfed-b050-4a3b-ab0f-d2489a31c3f2'
  2018-08-12 16:03:30.698 9 DEBUG nova.api.openstack.placement.requestlog [req-d4dcbfed-b050-4a3b-ab0f-d2489a31c3f2 admin admin - - -] Starting request: 172.17.0.1 "PUT /resource_providers/13b09bc9-164f-4d03-8a61-5e78c05a73ad/aggregates" __call__ /usr/lib/python3.6/site-packages/nova/api/openstack/placement/requestlog.py:38
  2018-08-12 16:03:30.903 9 ERROR nova.api.openstack.placement.fault_wrap [req-d4dcbfed-b050-4a3b-ab0f-d2489a31c3f2 admin admin - - -] Placement API unexpected error: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.IntegrityError) duplicate key value violates unique constraint "uniq_placement_aggregates0uuid"
  2018-08-12 16:03:30.914 9 INFO nova.api.openstack.placement.requestlog [req-d4dcbfed-b050-4a3b-ab0f-d2489a31c3f2 admin admin - - -] 172.17.0.1 "PUT /resource_providers/13b09bc9-164f-4d03-8a61-5e78c05a73ad/aggregates" status: 500 len: 997 microversion: 1.29
  ```

  "DETAIL: Key (uuid)=(14a5c8a3-5a99-4e8f-88be-00d85fcb1c17) already exists."

  This happens because the code at
  https://github.com/openstack/nova/blob/a29ace1d48b5473b9e7b5decdf3d5d19f3d262f3/nova/api/openstack/placement/objects/resource_provider.py#L519-L529
  is not trapping the right error when the server thinks it needs to
  create a new aggregate at the same time that another request is already
  creating it. It's not clear to me whether this is because oslo_db is
  not transforming the postgresql error properly, or whether the generic
  error trapped there is the wrong one and we've never noticed before
  because we don't hit the concurrency situation hard enough.
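  The race the bug describes is the classic read-or-create pattern: two
  concurrent requests both see the aggregate missing and both try to
  insert it. A minimal sketch of the defensive shape is below. It is a
  hypothetical illustration, not nova's code: it uses the stdlib sqlite3
  module, with `sqlite3.IntegrityError` standing in for the exception
  oslo_db would raise (e.g. DBDuplicateEntry) when postgresql rejects
  the duplicate uuid, and `ensure_aggregate` is an invented helper name:

  ```python
  # Hypothetical read-or-create sketch: on a duplicate-key error from a
  # concurrent insert, re-read the row instead of surfacing a 500.
  import sqlite3

  def ensure_aggregate(conn, agg_uuid):
      """Return the internal id for agg_uuid, creating the row if needed."""
      row = conn.execute(
          "SELECT id FROM placement_aggregates WHERE uuid = ?",
          (agg_uuid,)).fetchone()
      if row:
          return row[0]
      try:
          cur = conn.execute(
              "INSERT INTO placement_aggregates (uuid) VALUES (?)",
              (agg_uuid,))
          return cur.lastrowid
      except sqlite3.IntegrityError:
          # A concurrent request inserted the same uuid between our
          # SELECT and INSERT; the row now exists, so re-read it.
          return conn.execute(
              "SELECT id FROM placement_aggregates WHERE uuid = ?",
              (agg_uuid,)).fetchone()[0]

  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE placement_aggregates "
               "(id INTEGER PRIMARY KEY, uuid TEXT UNIQUE)")
  first = ensure_aggregate(conn, "14a5c8a3-5a99-4e8f-88be-00d85fcb1c17")
  second = ensure_aggregate(conn, "14a5c8a3-5a99-4e8f-88be-00d85fcb1c17")
  print(first == second)  # True
  ```

  The key point from the bug report is that the except clause must match
  the specific duplicate-entry exception the database layer actually
  raises; catching a too-generic (or wrong) exception type lets the
  IntegrityError escape and roll back the whole session, which is what
  produced the 500s above.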
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1786703/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

