[ 
https://issues.apache.org/jira/browse/YARN-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308238#comment-15308238
 ] 

Karthik Kambatla commented on YARN-5077:
----------------------------------------

My bad again. I should probably take some time off. {{maxShare}} will be 
Integer.MAX_VALUE, but that is also an issue. What happens if the cluster 
resources are smaller than {{maxShare}}? Wouldn't we run into the same livelock 
issue that maxAMShare was meant to solve? 

Given the number of issues surrounding this code, I wonder if there is a 
fundamental issue here that needs a more comprehensive look. 

> Fix FSLeafQueue#getFairShare() for queues with weight 0.0
> ---------------------------------------------------------
>
>                 Key: YARN-5077
>                 URL: https://issues.apache.org/jira/browse/YARN-5077
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-5077.001.patch, YARN-5077.002.patch, 
> YARN-5077.003.patch, YARN-5077.004.patch, YARN-5077.005.patch, 
> YARN-5077.006.patch
>
>
> 1) When a queue's weight is set to 0.0, FSLeafQueue#getFairShare() returns 
> <memory:0, vCores:0> 
> 2) When a queue's weight is nonzero, FSLeafQueue#getFairShare() returns 
> <memory:16384, vCores:8>
> In case 1), that means no container ever gets allocated for an AM because 
> from the viewpoint of the RM, there is never any headroom to allocate a 
> container on that queue.
> For example, we have a pool with the following weights: 
> - root.dev 0.0 
> - root.product 1.0
> root.dev is a best-effort pool and should only get resources if 
> root.product is not running. In our tests, with no jobs running under 
> root.product, jobs started in the root.dev queue stay stuck in the ACCEPTED 
> state and never start.
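
The behaviour in the description above can be illustrated with a minimal 
sketch. It assumes the AM admission check caps AM memory at a fraction 
(maxAMShare) of a base share of the queue. The class and method names below 
are hypothetical and this is not the actual FSLeafQueue code; 0.5 is only an 
illustrative value for maxAMShare, and the 8192 MB cluster in the last case is 
made up.

```java
// Simplified model of an AM admission check; NOT the Hadoop implementation.
// All names here are hypothetical and exist only for illustration.
public class AmShareSketch {

    /** Memory (MB) that AMs may occupy: a fraction of the given base share. */
    public static long maxAmMemory(long baseShareMb, double maxAmShare) {
        return (long) (baseShareMb * maxAmShare);
    }

    /** True if one more AM of amRequestMb fits under the AM limit. */
    public static boolean canRunAm(long baseShareMb, double maxAmShare,
                                   long amUsedMb, long amRequestMb) {
        return amUsedMb + amRequestMb <= maxAmMemory(baseShareMb, maxAmShare);
    }

    public static void main(String[] args) {
        double maxAmShare = 0.5;  // illustrative value
        long amRequestMb = 1024;  // illustrative AM container size

        // Case 2 in the description: nonzero weight, fair share of 16384 MB.
        System.out.println(canRunAm(16384, maxAmShare, 0, amRequestMb)); // true

        // Case 1: weight 0.0 gives a fair share of 0, so the limit is 0 and
        // no AM is ever admitted, even on an otherwise idle cluster.
        System.out.println(canRunAm(0, maxAmShare, 0, amRequestMb)); // false

        // Concern raised in the comment: if the base is an unbounded maxShare
        // (Integer.MAX_VALUE), the check passes even when AMs would fill a
        // hypothetical 8192 MB cluster, leaving no room for task containers.
        System.out.println(
            canRunAm(Integer.MAX_VALUE, maxAmShare, 7168, amRequestMb)); // true
    }
}
```

Under these assumptions, the zero fair share makes the limit itself zero, while 
an unbounded base makes the limit meaningless on a small cluster.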



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

