[
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803866#comment-15803866
]
Robert Kanter commented on YARN-6050:
-------------------------------------
{quote}I think it might be better to hardcode executionType (guaranteed),
numContainers (1), priority (0). Correct?{quote}
My original thinking was that hardcoding these limits what clients can do.
That said, things probably won't work correctly today if we don't force these
values here, and numContainers and priority are already hardcoded to these
values elsewhere, so forcing them here changes nothing in practice. I'll
make this change in the next patch.
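To make the idea concrete, here is a minimal sketch of forcing those three values on whatever the client submits. It does not use the real Yarn classes: {{Req}}, {{ExecType}} and {{normalize}} are hypothetical stand-ins invented for illustration, and the actual patch may do this differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not the Yarn API: shows AM requests being rewritten so
// that executionType, numContainers and priority always carry fixed values.
class AmRequestNormalizerSketch {
    enum ExecType { GUARANTEED, OPPORTUNISTIC }

    static final class Req {
        final String resourceName;
        final int priority;
        final int numContainers;
        final ExecType execType;

        Req(String resourceName, int priority, int numContainers, ExecType execType) {
            this.resourceName = resourceName;
            this.priority = priority;
            this.numContainers = numContainers;
            this.execType = execType;
        }

        @Override
        public String toString() {
            return resourceName + " priority=" + priority
                + " numContainers=" + numContainers + " execType=" + execType;
        }
    }

    // Returns copies of the client's requests with priority forced to 0,
    // numContainers forced to 1 and executionType forced to GUARANTEED.
    // The resource name (ANY, a rack, or a node) is left as the client set it.
    static List<Req> normalize(List<Req> fromClient) {
        List<Req> out = new ArrayList<>();
        for (Req r : fromClient) {
            out.add(new Req(r.resourceName, 0, 1, ExecType.GUARANTEED));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Req> fromClient = Arrays.asList(
            new Req("*", 7, 3, ExecType.OPPORTUNISTIC),
            new Req("rackA", 7, 3, ExecType.OPPORTUNISTIC));
        for (Req r : normalize(fromClient)) {
            System.out.println(r);
        }
    }
}
```

The trade-off is the one discussed above: clients lose the ability to vary these fields, but the requests that reach the scheduler are always in a shape it handles correctly.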
[~leftnoteasy], on your other points, I view this more as fixing a missing
piece of the Yarn Client API, rather than a feature. This can definitely help
with troubleshooting if you have a specific host you want to test something on.
However, even though this JIRA puts this missing functionality into the API,
it doesn't actually expose it to the end user; it's more of a framework-level
thing. A user submitting an MR, Spark, etc. job doesn't currently have access
to this. If MR, Spark, or another framework wants to take advantage of it, it
will need to update how it uses the yarn client and provide some way for the
user to interface with it (e.g. a {{mapreduce.am.rack=foo}} property). Given that
the "customer" here is really the frameworks (which should be more familiar
with Yarn than the user, and can decide how much of this they want to expose to
the user), I don't think we need to add an extra config to enable/disable this
ability.
Plus, hard vs soft locality is already a boolean - it seems funny to add
another boolean to enable/disable that boolean :)
> AMs can't be scheduled on racks or nodes
> ----------------------------------------
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0, 3.0.0-alpha2
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch,
> YARN-6050.003.patch, YARN-6050.004.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than
> one {{ResourceRequest}}. For example, if you want to schedule an AM only on
> "rackA", you have to create two {{ResourceRequest}}s, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS,
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}. The
> current behavior is to either build one from {{getResource}} or use the one
> from {{getAMContainerResourceRequest}} directly, depending on whether
> {{getAMContainerResourceRequest}} is null. We'll need to add a third
> method, say {{getAMContainerResourceRequests}}, that works with a list of
> {{ResourceRequest}}s so that clients can specify multiple resource
> requests.
> # There are some places where things are hardcoded to overwrite what the
> client specifies. These are pretty straightforward to fix.
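
As a rough illustration of item 1, the sketch below builds the list of requests a client would need for hard or soft placement on a rack or a node. It is not the Yarn API: {{Req}}, {{forRack}} and {{forNode}} are hypothetical names, and only the resource name and the relaxLocality flag are modeled. The node case assumes the usual convention that a strict node request also needs relaxLocality set to false on the rack-level and ANY-level requests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the Yarn API: one entry per ResourceRequest,
// keeping only the two fields that decide locality.
class AmLocalitySketch {
    // Same wildcard value that ResourceRequest.ANY uses.
    static final String ANY = "*";

    static final class Req {
        final String resourceName;
        final boolean relaxLocality;

        Req(String resourceName, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.relaxLocality = relaxLocality;
        }

        @Override
        public String toString() {
            return resourceName + " relaxLocality=" + relaxLocality;
        }
    }

    // Requests for placing an AM on one rack.
    // hard: the ANY request does not relax, so only that rack is acceptable.
    // soft: the ANY request relaxes, so the rack is only a preference.
    static List<Req> forRack(String rack, boolean hard) {
        List<Req> reqs = new ArrayList<>();
        reqs.add(new Req(ANY, !hard));
        reqs.add(new Req(rack, true));
        return reqs;
    }

    // Requests for placing an AM on one node. For a hard request both the
    // ANY-level and rack-level entries must not relax.
    static List<Req> forNode(String rack, String node, boolean hard) {
        List<Req> reqs = new ArrayList<>();
        reqs.add(new Req(ANY, !hard));
        reqs.add(new Req(rack, !hard));
        reqs.add(new Req(node, true));
        return reqs;
    }

    public static void main(String[] args) {
        for (Req r : forRack("rackA", true)) {
            System.out.println(r);
        }
    }
}
```

With hard set to true, {{forRack("rackA", true)}} yields the same two-request shape as the {code} snippet in the description; every case needs more than one request, which is why a list-valued field is needed on the submission context.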