[ 
https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200490#comment-15200490
 ] 

Vinod Kumar Vavilapalli commented on YARN-4837:
-----------------------------------------------

Here are my concerns:
 - First up, the feature isn't 'AM blacklisting': we are not blacklisting AMs. 
The goal is for the system to not schedule AMs on faulty nodes. The right 
solution is to identify why we keep launching on bad nodes instead of marking 
them unhealthy, but I can see why a blacklist threshold is useful when we 
*simply don't know*.
 - The configurations are all named yarn.am.blacklisting even though they 
should be under a yarn.resourcemanager hierarchy.
 - We blindly add a node to the app's blacklist even if we hit just *one* 
AM failure, and the error / exit-code doesn't matter at all.
 - Irrespective of all that, I don't see why we should already expose 
this to end-users, i.e. the whole premise of YARN-4389. Why should an app 
specifically care about "the number of nodes YARN blacklists for my AM 
container launch"?
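
As a rough illustration of the third concern, the sketch below counts AM 
failures per node, skips exit codes that are not attributable to the node, and 
only treats a node as blacklisted once a threshold is reached. It is not YARN 
code: the class AmNodeFailureTracker, its methods, and the exit-code values in 
the usage note are hypothetical names and numbers chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch, not part of YARN: tracks AM container failures per
 * node and blacklists a node only after a configurable number of failures
 * whose exit codes are plausibly the node's fault.
 */
class AmNodeFailureTracker {
    // Number of counted failures on one node before it is blacklisted.
    private final int failureThreshold;
    // Exit codes assumed NOT to indicate a faulty node (e.g. app bugs).
    private final Set<Integer> ignoredExitCodes;
    // Counted failures so far, keyed by node name.
    private final Map<String, Integer> failuresByNode = new HashMap<>();

    AmNodeFailureTracker(int failureThreshold, Set<Integer> ignoredExitCodes) {
        if (failureThreshold < 1) {
            throw new IllegalArgumentException("failureThreshold must be >= 1");
        }
        this.failureThreshold = failureThreshold;
        this.ignoredExitCodes = new HashSet<>(ignoredExitCodes);
    }

    /** Records one AM failure; failures with ignored exit codes are skipped. */
    void recordAmFailure(String node, int exitCode) {
        if (ignoredExitCodes.contains(exitCode)) {
            return;
        }
        failuresByNode.merge(node, 1, Integer::sum);
    }

    /** True once the node has reached the failure threshold. */
    boolean isBlacklisted(String node) {
        return failuresByNode.getOrDefault(node, 0) >= failureThreshold;
    }

    /** Returns the names of all nodes currently at or above the threshold. */
    Set<String> blacklistedNodes() {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Integer> e : failuresByNode.entrySet()) {
            if (e.getValue() >= failureThreshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With a threshold of 2 and exit code 1 assumed to mean an application error, a 
single failure on a node leaves it schedulable, a failure with exit code 1 is 
not counted, and the node is blacklisted only after a second counted failure.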

I'm digging into the feature more for a careful look.

/cc
 - [~adhoot], [~jlowe], [~kasha] who were involved with YARN-2005 for the 
naming changes
 - [~sunilg] / [~djp] who worked on YARN-4389.

While we discuss this, I think we should make the feature private before 2.8.0 
goes out.

> User facing aspects of 'AM blacklisting' feature need fixing
> ------------------------------------------------------------
>
>                 Key: YARN-4837
>                 URL: https://issues.apache.org/jira/browse/YARN-4837
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>
> Was reviewing the user-facing aspects that we are releasing as part of 2.8.0.
> Looking at the 'AM blacklisting feature', I see several things to be fixed 
> before we release it in 2.8.0.



