[
https://issues.apache.org/jira/browse/YARN-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Payne updated YARN-5555:
-----------------------------
Attachment: PctOfQueueIsInaccurate.jpg
The queue structure for the attached screenshot (PctOfQueueIsInaccurate.jpg)
has the following attributes:
||Cluster Capacity||root.swords.capacity||root.swords.brisingr.capacity||
|12288 MB|20%|25%|
There are 3 apps running in the {{root.swords.brisingr}} queue. The attributes
for each of these apps are as follows:
||App Name||Allocated Memory MB||% of Queue||
|application_1471969002932_0001|4608 MB|150.0|
|application_1471969002932_0002|4608 MB|150.0|
|application_1471969002932_0003|3072 MB|100.0|
The value to the right of the {{Queue: swords.brisingr}} bar graph says that
the queue is 2001.3% used. This value is (almost) accurate because the capacity
actually allotted to {{root.swords.brisingr}} is {{12288 MB * 20% * 25% =
614.4 MB}}. Since {{root.swords.brisingr}} is consuming all 12288 MB, its usage
is {{12288 MB / 614.4 MB = 20}}, or {{2000%}}.
However, the sum of the {{% of Queue}} column for all apps running in
{{root.swords.brisingr}} is {{100.0% + 150.0% + 150.0% = 400%}}. This is
inaccurate: the per-app values should add up to roughly the queue's 2000%.
It appears as if the calculations are not taking into account the capacity of
the parent queue, {{root.swords: 20%}}. For example,
{{application_1471969002932_0001}}'s usage is 4608 MB, and {{12288 MB * 25% =
3072 MB}}, so the reported value is {{4608 / 3072 = 1.5}}, or {{150%}}. This
calculation should have been {{4608 / 614.4 = 7.5}}, or {{750%}}.
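To make the arithmetic above easy to check, here is a small standalone sketch that recomputes both the reported and the expected {{% of Queue}} values for the three apps. The class and method names ({{PctOfQueueExample}}, {{absoluteCapacity}}, {{percentOfQueue}}) are made up for illustration and are not part of Hadoop; only the numbers come from the tables above.

```java
public class PctOfQueueExample {

    // Fraction of the cluster guaranteed to a queue: the product of the
    // configured capacities (as fractions) along its path below root.
    // Hypothetical helper, not a Hadoop API.
    static double absoluteCapacity(double... capacitiesAlongPath) {
        double product = 1.0;
        for (double c : capacitiesAlongPath) {
            product *= c;
        }
        return product;
    }

    // Usage as a percentage of (cluster size * capacity fraction).
    static double percentOfQueue(double usedMb, double clusterMb,
                                 double capacityFraction) {
        return usedMb / (clusterMb * capacityFraction) * 100.0;
    }

    public static void main(String[] args) {
        double clusterMb = 12288;
        double leafOnly = 0.25;                         // brisingr's own capacity
        double absolute = absoluteCapacity(0.20, 0.25); // swords * brisingr
        double[] usedMb = {4608, 4608, 3072};

        for (double used : usedMb) {
            System.out.printf("%.0f MB: reported %.1f%%, expected %.1f%%%n",
                used,
                percentOfQueue(used, clusterMb, leafOnly),
                percentOfQueue(used, clusterMb, absolute));
        }
    }
}
```

With the parent's 20% included, the three apps come out at 750%, 750% and 500%, which add up to the 2000% computed for the queue as a whole.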
{{RMAppsBlock#renderData}} is calling {{ApplicationResourceUsageReport}}, which
eventually calls {{SchedulerApplicationAttempt#getResourceUsageReport}}.
The following code in {{getResourceUsageReport}}, I think, needs to walk back
up the parent tree to get all of the capacity values, not just the one for the
leaf queue:
{code}
queueUsagePerc =
    calc.divide(cluster, usedResourceClone, Resources.multiply(cluster,
        queue.getQueueInfo(false, false).getCapacity())) * 100;
{code}
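As a rough illustration of what "walking back up the parent tree" could look like, the sketch below multiplies the capacities of the leaf queue and each of its ancestors before dividing. It is not a patch: {{QueueNode}} is a hypothetical stand-in for the scheduler's queue objects, and memory is modeled as plain MB values rather than {{Resource}} objects.

```java
public class QueueCapacityWalk {

    // Hypothetical stand-in for a scheduler queue; not a Hadoop class.
    static final class QueueNode {
        final String name;
        final double capacity;   // fraction of the parent's capacity, 0.0 to 1.0
        final QueueNode parent;  // null for the root queue

        QueueNode(String name, double capacity, QueueNode parent) {
            this.name = name;
            this.capacity = capacity;
            this.parent = parent;
        }
    }

    // Multiply the capacities from the leaf up to the root to get the
    // fraction of the whole cluster guaranteed to the leaf.
    static double absoluteCapacity(QueueNode leaf) {
        double product = 1.0;
        for (QueueNode q = leaf; q != null; q = q.parent) {
            product *= q.capacity;
        }
        return product;
    }

    // Percentage of the leaf queue's guaranteed share that is in use.
    static double queueUsagePerc(double usedMb, double clusterMb, QueueNode leaf) {
        return usedMb / (clusterMb * absoluteCapacity(leaf)) * 100.0;
    }

    public static void main(String[] args) {
        QueueNode root = new QueueNode("root", 1.0, null);
        QueueNode swords = new QueueNode("swords", 0.20, root);
        QueueNode brisingr = new QueueNode("brisingr", 0.25, swords);

        System.out.printf("%s: %.1f%%%n", brisingr.name,
            queueUsagePerc(4608, 12288, brisingr));
    }
}
```

A leaf that is a direct child of root gives the same result as the current code, so only nested queues would see different values under this approach.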
> Scheduler UI: "% of Queue" is inaccurate if leaf queue is hierarchically
> nested.
> --------------------------------------------------------------------------------
>
> Key: YARN-5555
> URL: https://issues.apache.org/jira/browse/YARN-5555
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Eric Payne
> Assignee: Eric Payne
> Priority: Minor
> Attachments: PctOfQueueIsInaccurate.jpg
>
>
> If a leaf queue is hierarchically nested (e.g., {{root.a.a1}},
> {{root.a.a2}}), the values in the "*% of Queue*" column in the apps section
> of the Scheduler UI are calculated as if the leaf queue ({{a1}}) were a direct
> child of {{root}}.