[ https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147942#comment-14147942 ]

Jason Lowe commented on YARN-2604:
----------------------------------

Ah, I see, yes, they're a little bit different.  They'd be the same if we 
consider a large node that is unhealthy/lost to be equivalent to an overloaded 
large node.  In both cases we had the resources to satisfy the request at one 
point but no longer do.

I guess it comes down to whether we really want to immediately fail an app if 
no node in the cluster has sufficient resources at the time of submission.  
If that's OK then we can do a simple change like the one you originally 
proposed.  If the nodes are there but unusable for some reason (e.g. 
unhealthy) and we want to wait around for a bit, then it gets closer to what 
YARN-56 is trying to do.
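
For what it's worth, here is a rough sketch of the fail-fast option.  The 
helper name and the way it is handed the node capabilities are hypothetical, 
not actual scheduler code; it just illustrates rejecting at submission time 
when no registered node could ever satisfy the ask.

{code}
import java.util.Collection;

import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException;

// Hypothetical helper, not the actual scheduler change: fail a request at
// submission time if no currently registered node could ever satisfy it.
public final class LargestNodeCheck {
  private LargestNodeCheck() {}

  static void validateAgainstLargestNode(Resource ask,
      Collection<Resource> registeredNodeCapabilities)
      throws InvalidResourceRequestException {
    for (Resource node : registeredNodeCapabilities) {
      if (ask.getMemory() <= node.getMemory()
          && ask.getVirtualCores() <= node.getVirtualCores()) {
        return; // at least one node is big enough to host this container
      }
    }
    // No node in the cluster at submission time can satisfy the ask, so
    // reject it up front rather than letting the app wait forever.
    throw new InvalidResourceRequestException("Resource request " + ask
        + " exceeds the capability of the largest node in the cluster");
  }
}
{code}

The caveat is exactly the one above: if the big node is only temporarily 
unhealthy, this rejects an app that would have run once the node came back, 
which is where the YARN-56 style of waiting comes in.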

> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> -------------------------------------------------------------------------------
>
>                 Key: YARN-2604
>                 URL: https://issues.apache.org/jira/browse/YARN-2604
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 
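
To make the quoted scenario concrete, here is a minimal illustration with 
made-up numbers (a 16 GB max-allocation, an 8 GB largest node, a 12 GB ask); 
none of these values come from the report.

{code}
import org.apache.hadoop.yarn.api.records.Resource;

public class MaxAllocationGapExample {
  public static void main(String[] args) {
    // Illustrative values only, not defaults.
    Resource maxAllocation = Resource.newInstance(16384, 8); // scheduler max-allocation-*
    Resource largestNode   = Resource.newInstance(8192, 4);  // biggest node in the cluster
    Resource ask           = Resource.newInstance(12288, 4); // what the app requests

    boolean withinMaxAllocation = ask.getMemory() <= maxAllocation.getMemory()
        && ask.getVirtualCores() <= maxAllocation.getVirtualCores();
    boolean fitsOnLargestNode = ask.getMemory() <= largestNode.getMemory()
        && ask.getVirtualCores() <= largestNode.getVirtualCores();

    // The ask passes the max-allocation check, so the scheduler accepts it,
    // but no node can ever host the container and the app hangs.
    System.out.println("within max-allocation: " + withinMaxAllocation); // true
    System.out.println("fits on largest node:  " + fitsOnLargestNode);   // false
  }
}
{code}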



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
