Reviewed: https://review.opendev.org/651947 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b Submitter: Zuul Branch: master
commit 5c544c7e2a7e266d69a9d0f0bf3ee8a0c636202b Author: melanie witt <[email protected]> Date: Fri Apr 12 00:32:01 2019 +0000 Warn for duplicate host mappings during discover_hosts When the 'nova-manage cellv2 discover_hosts' command is run in parallel during a deployment, it results in simultaneous attempts to map the same compute or service hosts at the same time, resulting in tracebacks: "DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\") [SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id, host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s, %(host)s)'] [parameters: {'host': u'compute-0.localdomain', %'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20, %50, 527925), 'updated_at': None}] This adds more information to the command help and adds a warning message when duplicate host mappings are detected with guidance about how to run the command. The command will return 2 if a duplicate host mapping is encountered and the documentation is updated to explain this. This also adds a warning to the scheduler periodic task to recommend enabling the periodic on only one scheduler to prevent collisions. We choose to warn and stop instead of ignoring DBDuplicateEntry because there could potentially be a large number of parallel tasks competing to insert duplicate records where only one can succeed. If we ignore and continue to the next record, the large number of tasks will repeatedly collide in a tight loop until all get through the entire list of compute hosts that are being mapped. So we instead stop the colliding task and emit a message. Closes-Bug: #1824445 Change-Id: Ia7718ce099294e94309103feb9cc2397ff8f5188 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1824445 Title: nova-manage cellv2 discover_hosts traces when run in parallel Status in OpenStack Compute (nova): Fix Released Bug description: Saw this issue downstream [1] and found a couple of similar issues [2][3] where deployments were running the 'nova-manage cellv2 discover_hosts' command in parallel and experiencing tracebacks like this: "DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\") [SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id, host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s, %(host)s)'] [parameters: {'host': u'compute-0.localdomain', 'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20, 50, 527925), 'updated_at': None}] (Background on this error at: http://sqlalche.me/e/gkpj)", After some discussion on IRC today [4], we concluded it would be best to address the situation with improved command help and warnings when duplicate host mappings are encountered. While we could try-except to ignore DBDuplicateEntry, this is not a situation we want to hide from users as it means they are likely hammering their database with parallel updates that are mostly not going to succeed. So we should stop and warn instead. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1698630 [2] https://bugs.launchpad.net/openstack-ansible/+bug/1752540 [3] https://github.com/bloomberg/chef-bcpc/issues/1378 [4] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-04-11.log.html#t2019-04-11T23:04:47 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1824445/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : [email protected] Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp

