[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621057#comment-14621057
 ] 

Sunil G commented on YARN-2005:
-------------------------------

Hi [~adhoot],
Thank you for sharing the patch. I have a couple of questions.

- DEFAULT_FAILURE_THRESHOLD
The default is now 0.8. I feel we should make this a configurable limit. Based 
on the number of nodes, I feel the user can decide up to which threshold AM 
blacklisting should be supported.
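
A minimal sketch of what a configurable limit could look like, assuming the 
threshold is the fraction of cluster nodes that may be blacklisted before AM 
blacklisting is ignored. The class, field and method names here are 
hypothetical and are not taken from the patch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the SimpleBlacklistManager from the patch.
class ThresholdBlacklistSketch {
  private final Set<String> failedNodes = new HashSet<>();
  private final int numNodes;
  // Assumed to be read from configuration instead of a hard-coded 0.8.
  private final double disableThreshold;

  ThresholdBlacklistSketch(int numNodes, double disableThreshold) {
    if (disableThreshold < 0.0 || disableThreshold > 1.0) {
      throw new IllegalArgumentException("threshold must be in [0, 1]");
    }
    this.numNodes = numNodes;
    this.disableThreshold = disableThreshold;
  }

  void addNode(String node) {
    failedNodes.add(node);
  }

  // Returns the blacklist while it stays below the allowed fraction of the
  // cluster; otherwise returns an empty set so the AM can still be placed.
  Set<String> getBlacklist() {
    double limit = numNodes * disableThreshold;
    if (failedNodes.size() < limit) {
      return Collections.unmodifiableSet(new HashSet<>(failedNodes));
    }
    return Collections.emptySet();
  }
}
```

With 10 nodes and a threshold of 0.5, this sketch honours the blacklist for up 
to 4 failed nodes and ignores it from the 5th onwards.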

- The code below is from CS#allocate:
{code}
   application.updateBlacklist(blacklistAdditions, blacklistRemovals);
{code}
Assume a case where the app1 AM is running on {{node1}}. Due to a failure there, 
the app is relaunched on {{node2}} and {{node1}} is marked for blacklisting by 
SimpleBlacklistManager.
Since {{node1}} is added to the blacklist, all containers of this app will be 
blacklisted on {{node1}}. Is this intended? Please correct me if I am wrong.
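
The scenario can be sketched as follows. This is only an illustration with 
hypothetical names, not the actual CapacityScheduler code: it assumes the 
AM-launch blacklist additions end up in the same set that is consulted for 
every container of the app.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the per-application blacklist state.
class AppBlacklistSketch {
  private final Set<String> blacklist = new HashSet<>();

  // Mirrors the shape of the updateBlacklist call quoted above:
  // additions are added, removals are dropped.
  void updateBlacklist(Set<String> additions, Set<String> removals) {
    blacklist.addAll(additions);
    blacklist.removeAll(removals);
  }

  // Assumption: one shared set is checked for AM and non-AM containers alike.
  boolean canPlaceContainerOn(String node) {
    return !blacklist.contains(node);
  }

  public static void main(String[] args) {
    AppBlacklistSketch app1 = new AppBlacklistSketch();
    // node1 is blacklisted because the AM attempt failed there.
    app1.updateBlacklist(Collections.singleton("node1"),
        Collections.<String>emptySet());
    // Under the assumption above, ordinary containers are now kept off node1 too.
    System.out.println("node1 usable: " + app1.canPlaceContainerOn("node1"));
    System.out.println("node2 usable: " + app1.canPlaceContainerOn("node2"));
  }
}
```

If the AM blacklist were tracked separately from the application's own 
blacklist, only AM placement would avoid {{node1}}.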

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-2005.001.patch, YARN-2005.002.patch, 
> YARN-2005.003.patch
>
>
> It would be nice if the RM supported blacklisting a node for an AM launch 
> after the same node fails a configurable number of AM attempts.  This would 
> be similar to the blacklisting support for scheduling task attempts in the 
> MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
