[ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-6050:
--------------------------------
    Attachment: YARN-6050.007.patch

The 007 patch:
- Integrates with the AM blacklisting feature: the node count used for the 
blacklist threshold is now the number of "eligible" nodes rather than all of 
the nodes in the cluster (i.e. with strict locality, only the nodes that 
qualify).  For example, if you put strict locality on the "/foo" rack, only 
the nodes on that rack are counted.  And if you put strict locality on the 
"bar" node, only that one node is counted.
- New tests for the above
- I had to add a new method, {{getNumClusterNodesByResourceName}}, to the 
{{YarnScheduler}} interface, similar to the existing {{getNumClusterNodes}}.
-- I wasn't sure which annotations to put on it, so please check this.
-- While I was here, I also moved the implementation of {{getNumClusterNodes}} 
into {{AbstractYarnScheduler}} instead of keeping a copy in each of the other 
schedulers, because it was duplicated code.
- I also did some more manual testing on a real cluster with various options 
(different locality requirements, AM blacklisting, etc.).
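
To illustrate the eligible-node counting described in the list above, here is 
a minimal, self-contained Java sketch.  It is not the code from the patch: the 
class {{EligibleNodeSketch}}, the {{Node}} stand-in, and the methods 
{{countByResourceName}} and {{blacklistStillApplies}} are hypothetical names 
made up for this example.  It assumes that a resource name is either "*" 
(any node), a rack name, or a host name, and that the blacklist threshold is a 
fraction of the node count; the 0.8 value used in {{main}} is only an example.

```java
import java.util.Arrays;
import java.util.List;

final class EligibleNodeSketch {
    /** Assumed wildcard meaning "any node in the cluster". */
    static final String ANY = "*";

    /** Hypothetical stand-in for a cluster node: a host name plus its rack. */
    static final class Node {
        final String host;
        final String rack;

        Node(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    /** Counts the nodes that match a resource name (ANY, a rack, or a host). */
    static int countByResourceName(List<Node> cluster, String resourceName) {
        if (ANY.equals(resourceName)) {
            return cluster.size();
        }
        int count = 0;
        for (Node n : cluster) {
            if (n.rack.equals(resourceName) || n.host.equals(resourceName)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Assumed rule: the blacklist is honored only while the number of
     * blacklisted nodes stays below threshold * eligible nodes.
     */
    static boolean blacklistStillApplies(int blacklisted, int eligible,
                                         double threshold) {
        return blacklisted < threshold * eligible;
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(
            new Node("bar", "/foo"),
            new Node("baz", "/foo"),
            new Node("qux", "/other"));
        for (String name : new String[] {ANY, "/foo", "bar"}) {
            int eligible = countByResourceName(cluster, name);
            System.out.println(name + ": eligible=" + eligible
                + ", blacklist applies with 1 bad node="
                + blacklistStillApplies(1, eligible, 0.8));
        }
    }
}
```

The point of counting only eligible nodes is visible in the strict "bar" case: 
with a single eligible node, blacklisting that node would leave the AM nowhere 
to run, so under the assumed rule the blacklist is no longer honored.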

[~leftnoteasy], please take a look at the new patch.  Thanks.

> AMs can't be scheduled on racks or nodes
> ----------------------------------------
>
>                 Key: YARN-6050
>                 URL: https://issues.apache.org/jira/browse/YARN-6050
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, 
> YARN-6050.006.patch, YARN-6050.007.patch
>
>
> YARN itself supports rack/node-aware scheduling for AMs; however, there 
> are currently two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}s, like this:
> {code}
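> // ANY-level request: relaxLocality=false forbids falling back to other racks.
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> // Rack-level request: relaxLocality=true allows any node within "rackA".
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, true);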
> {code}
> The problem is that the YARN API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or use the one 
> from {{getAMContainerResourceRequest}} directly, depending on whether 
> {{getAMContainerResourceRequest}} is null.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which returns a list of 
> {{ResourceRequest}}s so that clients can specify multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.
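
As a rough illustration of the list-of-requests idea in the description above, 
the following self-contained Java sketch builds the two requests needed to pin 
an AM to a single rack.  It deliberately does not use the Hadoop classes: 
{{AmRequestListSketch}}, its {{Request}} stand-in, and {{strictRack}} are 
hypothetical names for this example only, and "*" is assumed to be the 
wildcard resource name.

```java
import java.util.ArrayList;
import java.util.List;

final class AmRequestListSketch {
    /** Assumed wildcard resource name meaning "anywhere in the cluster". */
    static final String ANY = "*";

    /** Hypothetical stand-in for ResourceRequest; only the fields used here. */
    static final class Request {
        final String resourceName;
        final boolean relaxLocality;

        Request(String resourceName, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.relaxLocality = relaxLocality;
        }
    }

    /** Builds the pair of requests that restricts placement to one rack. */
    static List<Request> strictRack(String rack) {
        List<Request> requests = new ArrayList<>();
        // ANY level with relaxLocality=false: do not fall back to other racks.
        requests.add(new Request(ANY, false));
        // Rack level with relaxLocality=true: any node on this rack is fine.
        requests.add(new Request(rack, true));
        return requests;
    }

    public static void main(String[] args) {
        for (Request r : strictRack("rackA")) {
            System.out.println(r.resourceName
                + " relaxLocality=" + r.relaxLocality);
        }
    }
}
```

A client would then hand the whole list to the submission context in one call, 
which is what a single-request API cannot express.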


