[
https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963454#comment-14963454
]
Karthik Kambatla commented on YARN-4270:
----------------------------------------
Thanks for reporting and working on this, Arun. The approach looks good to me.
A few minor comments on the patch:
# Can we call the config and variables {{reservableNodes}}? Calling it
{{appReservationThreshold}} increases the ambiguity with SLA-reservations.
# In {{FairScheduler#removeNode}}, can we log an error in the else block, i.e.,
when a rack is not found in the app or has a node-count <= 0?
# FSAppAttempt: can we add a comment to describe what the map holds, i.e., what
each String corresponds to?
{code}
private Map<String, Set<String>> reservations = new HashMap<>();
{code}
# Should we use a smaller value for the default? Consider a cluster with 2k
nodes: with this default, we end up reserving on 1000 nodes. How about using
0.05?
# I believe we should be reserving nodes for an app only after the locality of
the corresponding RR is fully relaxed. That is likely for another JIRA.
# Nit: Spurious changes in FairSchedulerConfiguration
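
To make the arithmetic behind the default-value comment concrete, here is a minimal sketch of how a ratio-based cap translates into a node count. The class name, method name, rounding, and the floor of one node are assumptions for illustration and are not taken from the patch. The 0.5 value is inferred from the "1000 of 2k nodes" figure above.

```java
public class ReservableNodesSketch {

    // Hypothetical helper: number of nodes an app may hold reservations on,
    // computed as a fraction of the cluster size. The floor of 1 is an
    // assumption so that a small cluster still allows one reservation.
    static int maxReservableNodes(int clusterNodes, double ratio) {
        if (clusterNodes <= 0) {
            return 0;
        }
        return Math.max(1, (int) Math.round(clusterNodes * ratio));
    }

    public static void main(String[] args) {
        // Ratio implied by the comment: 2000 nodes -> 1000 reservable nodes.
        System.out.println(maxReservableNodes(2000, 0.5));
        // Suggested smaller default: 2000 nodes -> 100 reservable nodes.
        System.out.println(maxReservableNodes(2000, 0.05));
    }
}
```

With a 2k-node cluster, the smaller ratio cuts the cap from 1000 nodes to 100, which is the motivation for the suggestion.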
> Limit application resource reservation on nodes for non-node/rack specific
> requests
> -----------------------------------------------------------------------------------
>
> Key: YARN-4270
> URL: https://issues.apache.org/jira/browse/YARN-4270
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch,
> YARN-4270.4.patch
>
>
> It has been noticed that for off-switch requests, the FairScheduler reserves
> resources on all nodes. This could lead to the entire cluster being
> unavailable for all other applications.
> Ideally, the reservations should be on a configurable number of nodes,
> default 1.