[ 
https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210359#comment-15210359
 ] 

Sunil G commented on YARN-4678:
-------------------------------

Thank you [~leftnoteasy] for sharing the thoughts.

bq. 1) reserved resource + allocated resource could exceed queue's max capacity, 
maybe we can add a test to make sure it won't happen

With the minimum allocation configured to a low number, and when the node label 
resource is not perfectly divisible by the application demand (e.g., one 
container request of 512 MB), it is possible for used + reserved to exceed 
100%. The margin is not very big, maybe around 108%. [~brahmareddy], it would 
be great if you could share the test scenario you performed.

I can prepare a test case from the YARN side to prove this, and I will attach it soon.
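To make the arithmetic concrete, here is a minimal standalone sketch (illustrative numbers only, not actual YARN scheduler code) of how a reservation on a node whose free space cannot fit the next request pushes used + reserved past the node's capacity:

```java
// Illustrative sketch: the scheduler reserves the full request size on a node
// whose remaining free space is smaller than the request, so the accounted
// used + reserved total overshoots the node's capacity.
public class ReservedOvershoot {

    // Returns used + reserved MB after a reservation decision on one node.
    static long usedPlusReserved(long capacityMb, long usedMb, long requestMb) {
        long freeMb = capacityMb - usedMb;
        // Free space cannot fit the request, so the whole request is reserved.
        long reservedMb = (freeMb < requestMb) ? requestMb : 0;
        return usedMb + reservedMb;
    }

    public static void main(String[] args) {
        long capacityMb = 8192;           // one NM with 8 GB
        long usedMb = 4096 + 1536;        // a 4 GB map task + a 1.5 GB AM
        long total = usedPlusReserved(capacityMb, usedMb, 4096);
        // 9728 MB accounted against an 8192 MB node, i.e. above 100%.
        System.out.printf("used+reserved = %d MB of %d MB (%.1f%%)%n",
                total, capacityMb, 100.0 * total / capacityMb);
    }
}
```

With these numbers, 2560 MB is free but a 4096 MB container is reserved, so the accounted total is 9728 MB against an 8192 MB node.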

bq. If we simply deduct reserved resources from used and show on the UI, user 
could find cluster utilization is < 100 in most of the time, and it gonna be 
hard to explain the reason of why it cannot reach 100%.

Currently the cluster metrics are shown differently when a reservation 
happens. I am attaching a screenshot to clarify 
({{Refer: reservedCapInClusterMetrics.png}}). Here the capacity is shown as 
14 GB and the reserved as 2 GB, while the total capacity is 16 GB. So the 
cluster metrics already show reserved separately. But I liked the idea of two 
colors for used and reserved; it is better to do it that way, and I will check 
this option. However, as mentioned in point 1), the total can still show more 
than 100% unless we fix the corner case. We can fix this with a stricter check 
(dividing by the minimum allocation can leave a few round-off cases).
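As a rough sketch of the two-color idea (hypothetical numbers matching the screenshot; this is not the actual RM web UI code), used and reserved could be reported as separate bar segments so neither has to be deducted from the other:

```java
// Sketch: compute the used and reserved percentages as two separate segments
// of a stacked capacity bar, instead of one combined utilization number.
public class CapacityBar {

    // Returns {usedPercent, reservedPercent} relative to total capacity.
    static double[] segments(long usedMb, long reservedMb, long totalMb) {
        return new double[] {
            100.0 * usedMb / totalMb,       // used segment (first color)
            100.0 * reservedMb / totalMb    // reserved segment (second color)
        };
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 14 GB used, 2 GB reserved, 16 GB total.
        double[] s = segments(14 * 1024, 2 * 1024, 16 * 1024);
        System.out.printf("used=%.1f%% reserved=%.1f%%%n", s[0], s[1]);
    }
}
```

Rendering the two segments separately makes it obvious when used + reserved overshoots 100% instead of hiding it in one number.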

bq.3) Record reserved resources in ResourceUsage and QueueCapacities separately.
Yes, I will raise a separate ticket to handle this.

> Cluster used capacity is > 100 when container reserved 
> -------------------------------------------------------
>
>                 Key: YARN-4678
>                 URL: https://issues.apache.org/jira/browse/YARN-4678
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Sunil G
>         Attachments: 0001-YARN-4678.patch, 0002-YARN-4678.patch, 
> 0003-YARN-4678.patch
>
>
>  *Scenario:* 
> * Start a cluster with three NMs, each having 8 GB (cluster memory: 24 GB).
> * Configure queues with elasticity and userlimitfactor=10.
> * Disable preemption.
> * Run two jobs with different priorities in different queues at the same time:
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=LOW 
> -Dmapreduce.job.queuename=QueueA -Dmapreduce.map.memory.mb=4096 
> -Dyarn.app.mapreduce.am.resource.mb=1536 
> -Dmapreduce.job.reduce.slowstart.completedmaps=1.0 10 1000000000000
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=HIGH 
> -Dmapreduce.job.queuename=QueueB -Dmapreduce.map.memory.mb=4096 
> -Dyarn.app.mapreduce.am.resource.mb=1536 3 1000000000000
> * Observe the used cluster capacity in the RM web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)