[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703780#comment-14703780 ]
Wangda Tan commented on YARN-2005:
----------------------------------
[~adhoot],
I think one possible solution is to add the necessary fields to
AppAttemptAddedSchedulerEvent, such as "lastAttemptState" and "AMNode". The
scheduler application/attempt can then use them to make decisions.
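A rough sketch of what carrying those fields on the event might look like. This
is an illustration only, not the real Hadoop class: AttemptAddedEventSketch,
AttemptState, and previousAmFailedOnNode() are hypothetical names, and the node
is represented as a plain string for simplicity.

```java
// Hypothetical stand-in for the event discussed above; NOT the actual
// org.apache.hadoop AppAttemptAddedSchedulerEvent.
enum AttemptState { NONE, FINISHED, FAILED, KILLED }

final class AttemptAddedEventSketch {
    private final String attemptId;
    // State in which the previous attempt ended (NONE for a first attempt).
    private final AttemptState lastAttemptState;
    // Node that hosted the previous attempt's AM, or null if there was none.
    private final String amNode;

    AttemptAddedEventSketch(String attemptId,
                            AttemptState lastAttemptState,
                            String amNode) {
        this.attemptId = attemptId;
        this.lastAttemptState = lastAttemptState;
        this.amNode = amNode;
    }

    String getAttemptId() { return attemptId; }
    AttemptState getLastAttemptState() { return lastAttemptState; }
    String getAmNode() { return amNode; }

    // True when the previous attempt failed and we know which node ran its AM,
    // i.e. the scheduler attempt has enough information to count a failure
    // against that node.
    boolean previousAmFailedOnNode() {
        return lastAttemptState == AttemptState.FAILED && amNode != null;
    }
}
```

With something like this, the scheduler side would not need to look up the
previous attempt itself; it could decide from the event alone.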
Another suggestion: we may not need a separate getNumClusterHosts(); the
existing NM count (#NMs) should be enough. Multiple NMs running on the same
host is a rare case, and even when it happens, an AM failure could still be
related to a specific NM's config.
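To illustrate counting per NM rather than per host, here is a minimal sketch.
It assumes the node count is used to cap how much of the cluster may be
blacklisted; that cap (disableThreshold), the class AmBlacklistSketch, and its
methods are all hypothetical and are not taken from the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: tracks AM failures per NodeManager and uses the
// number of NMs (not the number of distinct hosts) as the cluster size.
final class AmBlacklistSketch {
    private final Map<String, Integer> failuresPerNm = new HashMap<>();
    private final int maxFailuresPerNode;   // configurable failure limit
    private final double disableThreshold;  // assumed cap, as a fraction of NMs

    AmBlacklistSketch(int maxFailuresPerNode, double disableThreshold) {
        this.maxFailuresPerNode = maxFailuresPerNode;
        this.disableThreshold = disableThreshold;
    }

    // Record one failed AM attempt against the NM that hosted it.
    void recordAmFailure(String nmId) {
        failuresPerNm.merge(nmId, 1, Integer::sum);
    }

    // NMs to avoid for the next AM launch. Returns an empty set when
    // blacklisting would exclude more than disableThreshold of all NMs.
    Set<String> blacklist(int numNodeManagers) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> e : failuresPerNm.entrySet()) {
            if (e.getValue() >= maxFailuresPerNode) {
                result.add(e.getKey());
            }
        }
        if (numNodeManagers <= 0
                || result.size() > disableThreshold * numNodeManagers) {
            return Collections.emptySet();
        }
        return result;
    }
}
```

Two NMs on one host are simply two keys here, so a misconfigured NM can be
blacklisted without affecting its neighbour on the same host.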
> Blacklisting support for scheduling AMs
> ---------------------------------------
>
> Key: YARN-2005
> URL: https://issues.apache.org/jira/browse/YARN-2005
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 0.23.10, 2.4.0
> Reporter: Jason Lowe
> Assignee: Anubhav Dhoot
> Attachments: YARN-2005.001.patch, YARN-2005.002.patch,
> YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch,
> YARN-2005.006.patch, YARN-2005.006.patch
>
>
> It would be nice if the RM supported blacklisting a node for an AM launch
> after the same node fails a configurable number of AM attempts. This would
> be similar to the blacklisting support for scheduling task attempts in the
> MapReduce AM but for scheduling AM attempts on the RM side.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)