[ https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147934#comment-14147934 ]

Karthik Kambatla commented on YARN-2604:
----------------------------------------

Actually, I think the JIRA here is slightly different from the one reported on 
YARN-56. 

IIUC, YARN-56 wants to tackle the case where resources on a node are large 
enough to accommodate the request, but these resources are (partially) taken by 
other applications and are "currently unavailable". Using a timeout, as 
suggested, seems like a reasonable approach. 

This JIRA was meant to handle the case where no node in the cluster (even if it 
were completely free) can accommodate the request. This case can be partially 
fixed through better configuration - set max-allocation-mb to a value less than 
or equal to the memory available on the largest node. However, if that largest 
node fails, the config becomes outdated. We could either handle this separately 
or just fall back on YARN-56.
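
For reference, the configuration workaround would look something like this in 
yarn-site.xml (the 16384 value is purely illustrative, assuming the largest 
NodeManager in the cluster registers 16 GB via 
yarn.nodemanager.resource.memory-mb on that node):

```xml
<!-- Cap scheduler allocations at the largest node's capacity so that
     requests the cluster can never satisfy are rejected up front
     rather than accepted and left pending forever. -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>16384</value>
</property>
```

As noted above, this only works until the largest node goes away, at which 
point the operator has to re-tune the value by hand.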

Thoughts? 

> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> -------------------------------------------------------------------------------
>
>                 Key: YARN-2604
>                 URL: https://issues.apache.org/jira/browse/YARN-2604
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
