[
https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210359#comment-15210359
]
Sunil G commented on YARN-4678:
-------------------------------
Thank you [~leftnoteasy] for sharing the thoughts.
bq.1) reserved resource + allocated resource could exceed queue's max capacity, maybe we can add a test to make sure it won't happen
With the minimum allocation configured to a low value, and when the node-label
resource is not perfectly divisible by the application demand (one container
request of 512 MB, etc.), it's possible for used+reserved to go above 100%.
The margin is not very big, maybe around 108%. [~brahmareddy], it would be
great if you could share the test scenario you ran.
I can prepare a test case on the YARN side to prove this and will attach it soon.
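To make the round-off concrete, here is a rough standalone illustration (the numbers are hypothetical, loosely based on the scenario in this JIRA; this is not the actual CapacityScheduler code):
{code:java}
// Standalone sketch (hypothetical numbers): counting a reserved container on
// top of allocated memory pushes the reported capacity past 100% because no
// node has enough contiguous free memory to satisfy the 4GB request directly.
public class ReservedCapacitySketch {
  public static void main(String[] args) {
    long totalMb    = 3 * 8192;            // 3 NMs x 8GB = 24576MB
    long usedMb     = 2 * 1536 + 5 * 4096; // 2 AMs + 5 map containers = 23552MB
    long reservedMb = 4096;                // one more 4GB map gets reserved,
                                           // although no single node has 4GB free

    // Used alone stays under 100%, but used + reserved overshoots.
    System.out.printf("used          = %.1f%%%n", 100.0 * usedMb / totalMb);
    System.out.printf("used+reserved = %.1f%%%n",
        100.0 * (usedMb + reservedMb) / totalMb);
  }
}
{code}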
bq. If we simply deduct reserved resources from used and show that on the UI, users
could find that cluster utilization is < 100% most of the time, and it is going to be
hard to explain why it cannot reach 100%.
Currently the cluster metrics are shown differently when a reservation happens.
Attaching a screenshot to clarify. {{Refer:reservedCapInClusterMetrics.png}}.
Here the total capacity is shown as 14GB and reserved as 2GB, while the actual
total capacity is 16GB. So the cluster metrics already show reserved separately.
But I like the idea of two colors for used and reserved; it is better to do it
that way, and I will look into this option. However, as mentioned in point 1),
the total can still show more than 100% unless we fix the corner case. We can
fix this with a stricter check (when dividing by the minimum allocation there
can be a few round-off cases).
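The stricter check I have in mind is roughly the following (a minimal standalone sketch; the class and method names are hypothetical and not the real scheduler API):
{code:java}
// Standalone sketch of the "stricter check" idea: refuse to reserve a
// container if used + already-reserved + the new request would push the
// queue/partition past its maximum capacity. Names are hypothetical.
public final class ReservationGuard {

  private ReservationGuard() { }

  /** @return true if reserving {@code requestMb} keeps us within {@code maxMb}. */
  public static boolean canReserve(long usedMb, long reservedMb,
                                   long requestMb, long maxMb) {
    return usedMb + reservedMb + requestMb <= maxMb;
  }

  public static void main(String[] args) {
    long maxMb = 3 * 8192;  // partition total: 24GB
    // With 23552MB already allocated, a new 4GB reservation would overshoot.
    System.out.println(canReserve(23552, 0, 4096, maxMb)); // false
    // A 512MB request (one minimum allocation) still fits.
    System.out.println(canReserve(23552, 0, 512, maxMb));  // true
  }
}
{code}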
bq.3) Record reserved resources in ResourceUsage and QueueCapacities separately.
Yes, I will raise a separate ticket to handle this.
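For reference, the direction would roughly be to keep used and reserved as separate counters so the UI can render them as two segments (again only a simplified sketch, not the actual ResourceUsage/QueueCapacities code):
{code:java}
// Simplified sketch: track used and reserved separately so the web UI can
// show two bar segments instead of one inflated "used" value.
public class QueueUsageSketch {
  private long usedMb;
  private long reservedMb;

  public synchronized void incUsed(long mb)     { usedMb += mb; }
  public synchronized void incReserved(long mb) { reservedMb += mb; }
  public synchronized void unreserve(long mb)   { reservedMb -= mb; }

  /** Allocated containers only, for the "used" segment. */
  public synchronized float usedPct(long totalMb) {
    return 100f * usedMb / totalMb;
  }

  /** Reserved-but-not-yet-allocated, for the second segment. */
  public synchronized float reservedPct(long totalMb) {
    return 100f * reservedMb / totalMb;
  }
}
{code}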
> Cluster used capacity is > 100 when container reserved
> -------------------------------------------------------
>
> Key: YARN-4678
> URL: https://issues.apache.org/jira/browse/YARN-4678
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Brahma Reddy Battula
> Assignee: Sunil G
> Attachments: 0001-YARN-4678.patch, 0002-YARN-4678.patch,
> 0003-YARN-4678.patch
>
>
> *Scenario:*
> * Start the cluster with three NMs, each having 8GB (cluster memory: 24GB).
> * Configure queues with elasticity and userlimitfactor=10.
> * Disable preemption.
> * Run two jobs with different priorities in different queues at the same time:
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=LOW
> -Dmapreduce.job.queuename=QueueA -Dmapreduce.map.memory.mb=4096
> -Dyarn.app.mapreduce.am.resource.mb=1536
> -Dmapreduce.job.reduce.slowstart.completedmaps=1.0 10 1000000000000
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=HIGH
> -Dmapreduce.job.queuename=QueueB -Dmapreduce.map.memory.mb=4096
> -Dyarn.app.mapreduce.am.resource.mb=1536 3 1000000000000
> * Observe the cluster used capacity in the RM web UI.