[ 
https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207251#comment-14207251
 ] 

Karthik Kambatla commented on YARN-2604:
----------------------------------------

Looks mostly good. Just to confirm: when no nodes are connected to the RM, the 
patch sets the max-allocation to the configured value rather than zero. I think 
this is right; otherwise, all apps would be rejected immediately after the RM 
(re)starts. Actually, I wonder if we should add a config to specify either 
(a) the number of NMs that must be connected before this behavior kicks in or 
(b) a minimum/floor value for the maximum allocation (min-max-allocation :P). 
[~jlowe] - do you think such a config would be useful? 
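
To make the two options concrete, here is a rough sketch of how the effective maximum could be computed under each. This is only an illustration: the class, the method names, and the parameters (minNodes, floor) are hypothetical and not taken from the patch, and it considers memory (in MB) only.

```java
// Hypothetical sketch; none of these names come from the YARN-2604 patch.
final class MaxAllocationSketch {

    /**
     * Option (a): cap the configured maximum by the largest node only once
     * at least minNodes NMs are connected. With no nodes at all, the
     * configured value is used as-is.
     */
    static int withNodeThreshold(int configuredMax, int[] nodeMemories, int minNodes) {
        if (nodeMemories.length == 0 || nodeMemories.length < minNodes) {
            return configuredMax; // too few nodes: keep the configured value
        }
        return Math.min(configuredMax, largest(nodeMemories));
    }

    /**
     * Option (b): cap by the largest node, but never drop below a floor.
     * The floor itself is clamped so it cannot exceed the configured maximum.
     */
    static int withFloor(int configuredMax, int[] nodeMemories, int floor) {
        if (nodeMemories.length == 0) {
            return configuredMax; // no nodes yet: keep the configured value
        }
        int capped = Math.min(configuredMax, largest(nodeMemories));
        return Math.max(capped, Math.min(floor, configuredMax));
    }

    /** Returns the largest value in a non-empty array. */
    private static int largest(int[] values) {
        int max = values[0];
        for (int v : values) {
            max = Math.max(max, v);
        }
        return max;
    }
}
```

With a configured maximum of 8192 and a single 4096 MB node, option (a) with minNodes = 3 would still report 8192, while minNodes = 1 would report 4096.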

A few comments on the patch itself: 
# We should have tests similar to TestFifoScheduler#testMaximumAllocation for 
CapacityScheduler and FairScheduler.
# Nit: Rename AbstractYarnScheduler#realMaximumAllocation to 
configuredMaximumAllocation? And, in all the schedulers, we should set 
configuredMaximumAllocation first and then set maximumAllocation to that. Also, 
given both these fields are in AbstractYarnScheduler, I wouldn't refer to them 
using {{this.}} in the sub-classes.
# Nit: For locking and unlocking, we use the following convention in YARN. Mind 
updating accordingly? 
{code}
lock.lock();
try {
  // do your thing
} finally {
  lock.unlock();
}
{code}
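
Putting the last two nits together, something along these lines is what I have in mind. This is a simplified stand-in rather than the actual AbstractYarnScheduler code: the class name, the method names, the use of a plain ReentrantLock, and the int memory values are all assumptions made for the sake of the example.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the scheduler fields discussed above.
final class MaxAllocationHolder {
    private final Lock lock = new ReentrantLock();
    private int configuredMaximumAllocation;
    private int maximumAllocation;

    /** Set the configured value first, then derive maximumAllocation from it. */
    void setConfiguredMaximum(int configured) {
        lock.lock();
        try {
            configuredMaximumAllocation = configured;
            maximumAllocation = configuredMaximumAllocation;
        } finally {
            lock.unlock();
        }
    }

    /** Re-cap the maximum when the largest node changes; 0 means no nodes. */
    void updateForLargestNode(int largestNodeMemory) {
        lock.lock();
        try {
            maximumAllocation = largestNodeMemory <= 0
                ? configuredMaximumAllocation
                : Math.min(configuredMaximumAllocation, largestNodeMemory);
        } finally {
            lock.unlock();
        }
    }

    int getMaximumAllocation() {
        lock.lock();
        try {
            return maximumAllocation;
        } finally {
            lock.unlock();
        }
    }
}
```

Calling lock() outside the try block means unlock() in finally only runs if the lock was actually acquired.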

> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> -------------------------------------------------------------------------------
>
>                 Key: YARN-2604
>                 URL: https://issues.apache.org/jira/browse/YARN-2604
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Karthik Kambatla
>            Assignee: Robert Kanter
>         Attachments: YARN-2604.patch, YARN-2604.patch, YARN-2604.patch
>
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
