Hou Song created YARN-1380:
------------------------------

             Summary: Enable NM to automatically reuse failed local dirs after they are available again
                 Key: YARN-1380
                 URL: https://issues.apache.org/jira/browse/YARN-1380
             Project: Hadoop YARN
          Issue Type: New Feature
          Components: nodemanager
            Reporter: Hou Song


Currently the NM takes local directories out of use when they fail, but it 
cannot reuse them after they have been fixed. This is inconvenient in large 
production clusters. 
In this JIRA I propose a patch that I am using in my organization. 
It also adds a new metric for the number of failed directories, so that people 
have a clearer view from outside. 
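
To illustrate the behaviour described above, here is a minimal sketch. It is 
not the proposed patch and does not use the NodeManager's real classes: the 
names LocalDirTracker, checkDirs, getNumFailedDirs and diskBackedCheck are 
hypothetical, and the health test is assumed to be a plain 
"directory exists and is readable, writable and executable" check. 

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: tracks good and failed local dirs across health checks. */
class LocalDirTracker {
    private final List<String> goodDirs = new ArrayList<>();
    private final List<String> failedDirs = new ArrayList<>();
    private final Predicate<String> isHealthy;

    LocalDirTracker(List<String> dirs, Predicate<String> isHealthy) {
        this.goodDirs.addAll(dirs);
        this.isHealthy = isHealthy;
    }

    /** Assumed health test: the path is a directory with rwx access. */
    static Predicate<String> diskBackedCheck() {
        return path -> {
            File dir = new File(path);
            return dir.isDirectory() && dir.canRead()
                    && dir.canWrite() && dir.canExecute();
        };
    }

    /** One health-check pass: demote dirs that fail, promote dirs that recovered. */
    synchronized void checkDirs() {
        List<String> recovered = new ArrayList<>();
        // Re-test previously failed dirs so that fixed ones can be reused.
        for (Iterator<String> it = failedDirs.iterator(); it.hasNext();) {
            String dir = it.next();
            if (isHealthy.test(dir)) {
                it.remove();
                recovered.add(dir);
            }
        }
        // Test the dirs currently in use and kick out the ones that fail.
        for (Iterator<String> it = goodDirs.iterator(); it.hasNext();) {
            String dir = it.next();
            if (!isHealthy.test(dir)) {
                it.remove();
                failedDirs.add(dir);
            }
        }
        goodDirs.addAll(recovered);
    }

    /** Snapshot of the dirs that are currently usable. */
    synchronized List<String> getGoodDirs() {
        return new ArrayList<>(goodDirs);
    }

    /** Value a "number of failed directories" metric could report. */
    synchronized int getNumFailedDirs() {
        return failedDirs.size();
    }
}
```

In this sketch a periodic task would call checkDirs() and publish 
getNumFailedDirs() as the metric; a directory that passes the check again 
returns to the good list on the next pass without restarting the NM. 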



--
This message was sent by Atlassian JIRA
(v6.1#6144)
