[ 
https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963454#comment-14963454
 ] 

Karthik Kambatla commented on YARN-4270:
----------------------------------------

Thanks for reporting and working on this, Arun. The approach looks good to me. 
A few minor comments on the patch:
# Can we call the config and variables {{reservableNodes}}? Calling it 
{{appReservationThreshold}} increases the ambiguity with SLA-reservations. 
# In {{FairScheduler#removeNode}}, can we log an error in the else block, i.e., 
when a rack is not found in the app or has a node-count <= 0? 
# FSAppAttempt: can we add a comment to describe what the map holds, i.e., what 
each String corresponds to? 
{code}
  private Map<String, Set<String>> reservations = new HashMap<>();
{code}
# Should we use a smaller value for the default? Consider a cluster with 2k 
nodes: with this default, we end up reserving on 1000 nodes. How about using 
0.05?
# I believe we should be reserving nodes for an app only after the locality of 
the corresponding RR is fully relaxed. That is likely for another JIRA. 
# Nit: Spurious changes in FairSchedulerConfiguration
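
To make the default-value and map-documentation points above concrete, here is a rough sketch. It is not the code from the patch: the class and method names are made up for illustration, and the reading of the map as rack name to reserved node names is my assumption about what the comment would say.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; class and method names are hypothetical,
// not taken from the YARN-4270 patch.
class ReservationLimitSketch {

  // Assumed meaning: rack name -> names of the nodes in that rack on which
  // this app currently holds a reservation.
  private final Map<String, Set<String>> reservations = new HashMap<>();

  // Converts a fraction of the cluster (0.0 to 1.0) into a node count.
  static int reservableNodeCount(int clusterNodes, double reservableNodes) {
    if (clusterNodes < 0 || reservableNodes < 0.0 || reservableNodes > 1.0) {
      throw new IllegalArgumentException("invalid cluster size or fraction");
    }
    return (int) Math.round(clusterNodes * reservableNodes);
  }

  // Total number of nodes reserved by this app across all racks.
  int reservedNodeCount() {
    int total = 0;
    for (Set<String> nodes : reservations.values()) {
      total += nodes.size();
    }
    return total;
  }

  // Records a reservation unless the app is already at its limit.
  boolean tryReserve(String rack, String node, int limit) {
    Set<String> nodes = reservations.get(rack);
    if (nodes != null && nodes.contains(node)) {
      return true; // already reserved on this node
    }
    if (reservedNodeCount() >= limit) {
      return false;
    }
    reservations.computeIfAbsent(rack, r -> new HashSet<>()).add(node);
    return true;
  }

  // Drops a reservation. Returns false when the rack or node was never
  // recorded, which is the case a caller would want to log as an error.
  boolean unreserve(String rack, String node) {
    Set<String> nodes = reservations.get(rack);
    if (nodes == null || !nodes.remove(node)) {
      return false;
    }
    if (nodes.isEmpty()) {
      reservations.remove(rack);
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(reservableNodeCount(2000, 0.5));  // 1000
    System.out.println(reservableNodeCount(2000, 0.05)); // 100
  }
}
```

On a 2k-node cluster, a fraction of 0.5 allows reservations on 1000 nodes, while 0.05 caps it at 100, which is the motivation for the smaller default.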

> Limit application resource reservation on nodes for non-node/rack specific 
> requests
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-4270
>                 URL: https://issues.apache.org/jira/browse/YARN-4270
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch, 
> YARN-4270.4.patch
>
>
> It has been noticed that for off-switch requests, the FairScheduler reserves 
> resources on all nodes. This could lead to the entire cluster being 
> unavailable for all other applications.
> Ideally, the reservations should be on a configurable number of nodes, 
> default 1.



