Ming Ma commented on YARN-90:

Thanks, Varun.

The main question about the UNHEALTHY state is whether this patch makes a node 
more likely to become unhealthy, now that "full disk" has been added as one of 
the conditions. Given that [~jira.shegalov]'s YARN-1996 and [~sjlee0]'s 
MAPREDUCE-5817 suggest ways to mitigate the impact of UNHEALTHY nodes on 
existing containers and MR task scheduling, this might not be an issue.

Nit: "Set<String> postCheckFullDirs = new HashSet<String>(fullDirs);" doesn't 
need to create postCheckFullDirs. The code can refer to fullDirs directly.
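
To illustrate the nit, here is a minimal sketch, not the actual patch code. The 
class FullDirsSketch, both method names, and the "/data/..." paths are 
hypothetical. It assumes fullDirs is only read after the check and is not 
modified concurrently, in which case the copy adds nothing:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class FullDirsSketch {

  // Variant with the extra copy, mirroring the line quoted in the nit.
  static boolean changedWithCopy(Set<String> preCheckFullDirs,
                                 Set<String> fullDirs) {
    Set<String> postCheckFullDirs = new HashSet<String>(fullDirs);
    return !preCheckFullDirs.equals(postCheckFullDirs);
  }

  // Variant without the copy: fullDirs is only read here, so it can be
  // compared directly.
  static boolean changedDirect(Set<String> preCheckFullDirs,
                               Set<String> fullDirs) {
    return !preCheckFullDirs.equals(fullDirs);
  }

  public static void main(String[] args) {
    Set<String> before = new HashSet<String>(Arrays.asList("/data/1"));
    Set<String> after =
        new HashSet<String>(Arrays.asList("/data/1", "/data/2"));
    // Both variants report the same result for the same inputs.
    System.out.println(changedWithCopy(before, after));
    System.out.println(changedDirect(before, after));
  }
}
```

If fullDirs could be modified while it is still being read, the copy would 
serve as a snapshot and should stay.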

> NodeManager should identify failed disks becoming good back again
> -----------------------------------------------------------------
>                 Key: YARN-90
>                 URL: https://issues.apache.org/jira/browse/YARN-90
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Ravi Gummadi
>            Assignee: Varun Vasudev
>         Attachments: YARN-90.1.patch, YARN-90.patch, YARN-90.patch, 
> YARN-90.patch, YARN-90.patch, apache-yarn-90.0.patch, apache-yarn-90.1.patch, 
> apache-yarn-90.2.patch, apache-yarn-90.3.patch, apache-yarn-90.4.patch, 
> apache-yarn-90.5.patch, apache-yarn-90.6.patch, apache-yarn-90.7.patch, 
> apache-yarn-90.8.patch
> MAPREDUCE-3121 makes NodeManager identify disk failures. But once a disk goes 
> down, it is marked as failed forever. To reuse that disk (after it becomes 
> good), NodeManager needs a restart. This JIRA is to improve NodeManager to 
> reuse good disks (which could have been bad some time back).
