Public bug reported:

This is based on some performance and scale testing done by Huawei,
reported in this dev ML thread:

http://lists.openstack.org/pipermail/openstack-dev/2018-August/133363.html

In that scenario, they have 10 cells with 10000 instances in each cell.
They then run through a few GET /servers/detail scenarios with multiple
cells and varying limits.

The thread discussion pointed out that they were wasting time pulling
1000 records (the default [api]/max_limit) from each of the 10 cells and
then throwing away 9000 of the resulting 10000, so the DB query time per
cell was small, but the sqla/ORM/python layer was chewing up the time.

Dan Smith has a series of changes here:

https://review.openstack.org/#/q/topic:batched-inst-list+(status:open+OR+status:merged)

which allow us to batch the DB queries per cell. When the limit is
distributed across the 10 cells, e.g. 1000 / 10 = 100 batch size per
cell, this ends up cutting the time spent roughly in half (from around
11 sec to around 6 sec).

This is clearly a performance issue for which we have a fix, and we
arguably should backport the fix.

Note this is less of an issue for deployments that leverage the
[api]/instance_list_per_project_cells option (like CERN):

https://docs.openstack.org/nova/latest/configuration/config.html#api.instance_list_per_project_cells

** Affects: nova
     Importance: Medium
     Assignee: Dan Smith (danms)
         Status: Triaged

** Affects: nova/queens
     Importance: Undecided
         Status: New

** Affects: nova/rocky
     Importance: Undecided
         Status: New

** Tags: api cells performance

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1787977

Title:
  Inefficient multi-cell instance list

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1787977/+subscriptions
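
For illustration only, here is a minimal Python sketch of the batching idea described in the bug. It is not the code from the batched-inst-list series and does not use any Nova API: the names batch_size_for, query_cell and list_instances are hypothetical, each cell database is stood in for by an already-sorted in-memory list, and the even interleaving of records across cells is an assumption chosen to match the 10-cell scenario.

```python
import heapq
import itertools


def batch_size_for(limit, num_cells):
    """Split the overall limit evenly across cells (at least 1 per cell)."""
    return max(1, limit // num_cells)


def query_cell(records, batch_size, fetched):
    """Yield records from one cell, loading them one batch at a time.

    `records` is a sorted list standing in for a cell database (assumption);
    `fetched` is a one-element list used as a shared "rows loaded" counter.
    """
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        fetched[0] += len(batch)  # cost we pay whether or not rows are used
        yield from batch


def list_instances(cells, limit, batch_size):
    """Merge the sorted per-cell streams and stop after `limit` results."""
    fetched = [0]
    streams = [query_cell(cell, batch_size, fetched) for cell in cells]
    merged = heapq.merge(*streams)  # lazy k-way merge of sorted inputs
    results = list(itertools.islice(merged, limit))
    return results, fetched[0]


# 10 cells with 10000 records each; record values are interleaved so the
# global sort order draws evenly from every cell (an assumption).
NUM_CELLS, PER_CELL, LIMIT = 10, 10000, 1000
cells = [list(range(i, NUM_CELLS * PER_CELL, NUM_CELLS))
         for i in range(NUM_CELLS)]

# Unbatched: every cell loads a full `limit` worth of rows up front.
full_results, full_fetched = list_instances(cells, LIMIT, LIMIT)

# Batched: every cell loads limit / num_cells rows at a time.
batched_results, batched_fetched = list_instances(
    cells, LIMIT, batch_size_for(LIMIT, NUM_CELLS))

print(full_fetched, batched_fetched)
```

Both calls return the same 1000 results, but the batched call loads far fewer rows in total. If the results are skewed toward one cell, that cell simply keeps loading further batches, so the saving depends on how evenly the instances are spread.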

