[ 
https://issues.apache.org/jira/browse/YARN-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1996:
--------------------------------

    Attachment: YARN-1996.v01.patch

> Provide alternative policies for UNHEALTHY nodes.
> -------------------------------------------------
>
>                 Key: YARN-1996
>                 URL: https://issues.apache.org/jira/browse/YARN-1996
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, scheduler
>    Affects Versions: 2.4.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: YARN-1996.v01.patch
>
>
> Currently, UNHEALTHY nodes can significantly prolong the execution of large,
> expensive jobs, as demonstrated by MAPREDUCE-5817, and degrade cluster
> health even further due to [positive
> feedback|http://en.wikipedia.org/wiki/Positive_feedback]. A set of containers
> that might have caused the node to be deemed unhealthy in the first place
> starts spreading across the cluster, because the current node is declared
> unusable and all its containers are killed and rescheduled on different nodes.
> To mitigate this, we are experimenting with a patch that allows containers
> already running on a node that turns UNHEALTHY to complete (drain), while no
> new containers can be assigned to it until it turns healthy again.
> This mechanism can also be used for graceful decommissioning of an NM. To this
> end, we have to write a health script such that it can deterministically
> report UNHEALTHY. For example, with:
> {code}
> # $1: path whose existence makes the script report ERROR
> if [ -e "$1" ] ; then
>   echo "ERROR Node decommissioning via health script hack"
> fi
> {code}
> In the current version of the patch, the behavior is controlled by a boolean
> property {{yarn.nodemanager.unheathy.drain.containers}}. More versatile
> policies are possible in future work. Currently, the health state of a
> node is binary, determined by the disk checker and the health script's
> ERROR outputs. However, we could also interpret health script output in terms
> of levels similar to Java logging levels (one of which is ERROR), such as
> WARN and FATAL. Each level can then be treated differently. E.g.,
> - FATAL: unusable, like today
> - ERROR: drain
> - WARN: halve the node capacity
> complemented with some equivalence rules such as 3 WARN messages == ERROR,
> 2*ERROR == FATAL, etc.
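
The level-based policy described above could be sketched roughly as follows. This is only an illustration of the proposed rules, not code from the attached patch: the function name `classify_health` and the action labels it prints (`unusable`, `drain`, `halve-capacity`, `healthy`) are hypothetical, and it assumes each line of health script output starts with its level keyword.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map health script output (read from stdin) to an action.
# The levels and the rules "3 WARN == ERROR" and "2 ERROR == FATAL" come from
# the proposal; the function name and the action labels are assumptions.

classify_health() {
  local warn=0 error=0 fatal=0 line

  # Count lines by the level keyword they start with.
  while IFS= read -r line; do
    case "$line" in
      FATAL*) fatal=$((fatal + 1)) ;;
      ERROR*) error=$((error + 1)) ;;
      WARN*)  warn=$((warn + 1)) ;;
    esac
  done

  # Equivalence rules: every 3 WARNs count as one ERROR,
  # and every 2 ERRORs count as one FATAL.
  error=$((error + warn / 3))
  fatal=$((fatal + error / 2))

  if [ "$fatal" -gt 0 ]; then
    echo unusable          # treat the node as unusable, like today
  elif [ "$error" -gt 0 ]; then
    echo drain             # let running containers finish, assign no new ones
  elif [ "$warn" -gt 0 ]; then
    echo halve-capacity    # keep scheduling, at half the node capacity
  else
    echo healthy
  fi
}
```

For instance, `printf 'WARN disk\nWARN mem\nWARN net\n' | classify_health` prints `drain`, because three WARN lines are promoted to one ERROR. In a real implementation this decision would presumably live in the NodeManager or ResourceManager rather than in a shell function.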



--
This message was sent by Atlassian JIRA
(v6.2#6252)
