[ 
https://issues.apache.org/jira/browse/YARN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203367#comment-15203367
 ] 

Sunil G commented on YARN-4636:
-------------------------------

As YARN improves its node blacklist/whitelist functionality, one of the major 
use cases from our end is to avoid launching the second and later AM container 
attempts on the same failed node (when the earlier attempt failed on that node 
due to external environment or memory issues). This would really help us. With 
YARN-2005, we have a mechanism in hand, but there were concerns about its 
strict behavior. The proposal made in YARN-4837 helps straighten things out for 
the immediate 2.8 release.

I think YARN-4576 was trying to improve on the current YARN-2005 behavior and 
generalize it. Going forward, if we are planning for global blacklisting based 
on various types of container exit codes, then a policy can be helpful, 
assuming that we may have different types of apps. We do not have use cases 
for this scenario from our end; I also checked with [~rohithsharma] and 
[~Naganarasimha Garla] on this. It would be good to discuss and review 
*global blacklisting* further, along with its advantages and limitations, 
based on the information currently available from container exit codes.

> Make blacklist tracking policy pluggable for more extensions.
> -------------------------------------------------------------
>
>                 Key: YARN-4636
>                 URL: https://issues.apache.org/jira/browse/YARN-4636
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Sunil G
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
