[ https://issues.apache.org/jira/browse/YARN-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280176#comment-14280176 ]
Karthik Kambatla commented on YARN-2990:
----------------------------------------
Discussed this with Sandy offline. We agreed on the following approach when
assigning work from an app to a node:
# Consider the app's node-local requests for that node.
# If there are no node-local requests for the app (for any node in the
cluster), or if the allowed locality level is rack-local or off-switch,
consider the app's rack-local requests for that rack.
# If there are no node-local or rack-local requests (for any node or rack), or
if the allowed locality level is off-switch, consider the app's off-switch
requests.
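The three steps above can be sketched roughly as follows. This is only an illustration, not the attached patch: the class LocalitySketch, the AppRequests holder, and the choose() method are hypothetical names, and the sketch assumes node, rack, and off-switch requests are tracked in separate structures, which is a simplification of how ResourceRequests are actually stored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class LocalitySketch {
    public enum Level { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    /** Hypothetical per-app view of outstanding requests (not a YARN class). */
    public static final class AppRequests {
        public final Map<String, Integer> nodeRequests = new HashMap<>();
        public final Map<String, Integer> rackRequests = new HashMap<>();
        public int offSwitchRequests;

        boolean hasAnyNodeLocal() {
            return nodeRequests.values().stream().anyMatch(n -> n > 0);
        }

        boolean hasAnyRackLocal() {
            return rackRequests.values().stream().anyMatch(n -> n > 0);
        }
    }

    /**
     * Picks the locality level at which the app may be assigned work on the
     * given node, or returns empty if the app should keep waiting.
     */
    public static Optional<Level> choose(AppRequests app, String node,
                                         String rack, Level allowed) {
        // Step 1: node-local requests for this node.
        if (app.nodeRequests.getOrDefault(node, 0) > 0) {
            return Optional.of(Level.NODE_LOCAL);
        }
        // Step 2: rack-local requests, if the app has no node-local requests
        // anywhere, or the allowed level is rack-local or off-switch.
        boolean rackAllowed = !app.hasAnyNodeLocal() || allowed != Level.NODE_LOCAL;
        if (rackAllowed && app.rackRequests.getOrDefault(rack, 0) > 0) {
            return Optional.of(Level.RACK_LOCAL);
        }
        // Step 3: off-switch requests, if the app has no node-local or
        // rack-local requests anywhere, or the allowed level is off-switch.
        boolean offSwitchAllowed =
            (!app.hasAnyNodeLocal() && !app.hasAnyRackLocal())
                || allowed == Level.OFF_SWITCH;
        if (offSwitchAllowed && app.offSwitchRequests > 0) {
            return Optional.of(Level.OFF_SWITCH);
        }
        return Optional.empty();
    }
}
```

In this sketch, an app whose only outstanding request is off-switch (such as an AM container request) is assigned at OFF_SWITCH immediately, even while the allowed level is still NODE_LOCAL.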
Attached is a patch (v0) that works for off-switch requests. I tested the patch
with the unit test and on a cluster, and the AM container now launches without
the delay.
To handle rack-local requests, we need to be able to distinguish node names
from rack names in the data structure that holds all the ResourceRequests.
Alternatively, we could maintain counters, updating them when adding
ResourceRequests (RRs) and when assigning containers.
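As a rough illustration of the counter-based alternative, the sketch below keeps one pending-container count per request kind. The class RequestCounters, the Kind enum, and the method names are all hypothetical; nothing here is taken from the patch or from the FairScheduler code.

```java
import java.util.EnumMap;
import java.util.Map;

public class RequestCounters {
    /** Hypothetical classification of a ResourceRequest by its resource name. */
    public enum Kind { NODE, RACK, ANY }

    private final Map<Kind, Integer> pending = new EnumMap<>(Kind.class);

    /** Called when a request of the given kind is added for some containers. */
    public void requestAdded(Kind kind, int containers) {
        pending.merge(kind, containers, Integer::sum);
    }

    /** Called when a container is assigned against a request of this kind. */
    public void containerAssigned(Kind kind) {
        // Decrement the count, dropping the entry once it would reach zero.
        pending.computeIfPresent(kind, (k, n) -> n > 1 ? n - 1 : null);
    }

    /** True if any containers of this kind are still outstanding. */
    public boolean hasPending(Kind kind) {
        return pending.getOrDefault(kind, 0) > 0;
    }
}
```

With counters like these, the "no node-local or rack-local requests for any node or rack" checks become constant-time lookups instead of scans over resource names.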
> FairScheduler's delay-scheduling always waits for node-local and rack-local
> delays, even for off-rack-only requests
> -------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-2990
> URL: https://issues.apache.org/jira/browse/YARN-2990
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.6.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Attachments: yarn-2990-0.patch, yarn-2990-test.patch
>
>
> Looking at the FairScheduler, it appears the node/rack locality delays are
> used for all requests, even those that are only off-rack.
> More details in comments.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)