[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

Robert Kanter (JIRA) Thu, 05 Jan 2017 23:38:42 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803866#comment-15803866
 ]


Robert Kanter commented on YARN-6050:
-------------------------------------

{quote}I think it might be better to hardcode executionType (guaranteed), 
numContainers (1), priority (0). Correct?{quote}
My original thinking was that hardcoding these would limit what clients can 
do.  However, things probably won't work correctly today unless we force these 
values here, and numContainers and priority are already hardcoded to these 
values elsewhere, so clients lose nothing.  I'll make this change in the next 
patch.

[~leftnoteasy], on your other points, I view this more as filling in a missing 
piece of the Yarn client API than as a new feature.  It can definitely help 
with troubleshooting if you have a specific host you want to test something on.  
However, even though this JIRA adds the missing functionality to the API, 
it doesn't actually expose it to the end user; it's more of a framework-level 
thing.  A user submitting an MR, Spark, etc. job doesn't currently have access 
to this.  If MR, Spark, or any other framework wants to take advantage of it, 
it will need to update how it uses the Yarn client and provide some way for the 
user to interface with it (e.g. a {{mapreduce.am.rack=foo}} property).  Given 
that the "customer" here is really the frameworks (which should be more 
familiar with Yarn than the user, and can decide how much of this they want to 
expose to the user), I don't think we need to add an extra config to 
enable/disable this ability.

Plus, hard vs soft locality is already a boolean - it seems funny to add 
another boolean to enable/disable that boolean :)
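
To illustrate the framework-level usage described above, here is a minimal 
sketch of how a framework might turn a rack setting (such as the 
{{mapreduce.am.rack}} example) into the pair of requests needed for hard or 
soft locality.  Everything in it is an assumption made for illustration: 
{{AmLocalitySketch}}, {{Request}} and {{buildRequests}} are hypothetical names, 
and {{Request}} is a simplified stand-in for {{ResourceRequest}} so that the 
sketch compiles without Hadoop on the classpath.  It is not the code in the 
patch.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: Request is a simplified, hypothetical stand-in for YARN's
 * ResourceRequest, so this compiles without Hadoop on the classpath.
 */
class AmLocalitySketch {
    // Mirrors ResourceRequest.ANY, the wildcard resource name.
    static final String ANY = "*";

    /** Hypothetical stand-in holding only the fields relevant here. */
    static final class Request {
        final int priority;
        final String resourceName;
        final int numContainers;
        final boolean relaxLocality;

        Request(String resourceName, boolean relaxLocality) {
            this.priority = 0;       // forced to 0, per the discussion above
            this.numContainers = 1;  // forced to 1, per the discussion above
            this.resourceName = resourceName;
            this.relaxLocality = relaxLocality;
        }

        @Override
        public String toString() {
            return resourceName + " relaxLocality=" + relaxLocality;
        }
    }

    /**
     * Builds the AM requests for an optional rack.
     * With no rack, a single relaxed ANY request is returned.
     * With a rack, the ANY request is relaxed only for soft locality,
     * and the rack request itself is always relaxed.
     */
    static List<Request> buildRequests(String rack, boolean hard) {
        List<Request> requests = new ArrayList<>();
        if (rack == null || rack.isEmpty()) {
            requests.add(new Request(ANY, true));
            return requests;
        }
        requests.add(new Request(ANY, !hard));
        requests.add(new Request(rack, true));
        return requests;
    }

    public static void main(String[] args) {
        // Hard locality on "rackA": same shape as the example in the
        // issue description quoted below.
        for (Request r : buildRequests("rackA", true)) {
            System.out.println(r);
        }
    }
}
```

The single {{hard}} flag here is the hard vs soft boolean mentioned above; 
a framework would decide for itself whether and how to surface it to users.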

> AMs can't be scheduled on racks or nodes
> ----------------------------------------
>
>                 Key: YARN-6050
>                 URL: https://issues.apache.org/jira/browse/YARN-6050
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}s, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or use the one 
> from {{getAMContainerResourceRequest}} directly, depending on whether 
> {{getAMContainerResourceRequest}} returns null.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which returns a list of 
> {{ResourceRequest}}s so that clients can specify multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
