[ https://issues.apache.org/jira/browse/YARN-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708537#comment-13708537 ]
Jason Lowe commented on YARN-917:
---------------------------------

I think one way to solve this is to move the removal of the staging directory to *after* we unregister from the RM. Now that there's a FINISHING state that gives the app a grace period to finish cleanly, we can leverage it to remove the staging directory after unregistering. This should also resolve other races between staging directory removal and unregistering (e.g., the AM crashes after removing the staging directory but before unregistering).

> Job can fail when RM restarts after staging dir is cleaned but before MR successfully unregister with RM
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-917
>                 URL: https://issues.apache.org/jira/browse/YARN-917
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Jian He
>            Assignee: Jian He
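
The reordering proposed in the comment can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hadoop code: FakeRM, FakeStagingDir, shutdown, and job_can_fail are hypothetical names, and the failure condition (the AM is still registered, so a new attempt would run, but the staging directory it needs is already gone) is an assumption drawn from the issue title.

```python
from dataclasses import dataclass


@dataclass
class FakeRM:
    """Hypothetical stand-in for the ResourceManager (not a Hadoop API)."""
    registered: bool = True

    def unregister(self, log):
        self.registered = False
        log.append("unregister")


@dataclass
class FakeStagingDir:
    """Hypothetical stand-in for the job's staging directory."""
    exists: bool = True

    def remove(self, log):
        self.exists = False
        log.append("remove_staging")


def shutdown(rm, staging, log, cleanup_first, crash_after_first_step=False):
    """Run the two shutdown steps in the chosen order.

    cleanup_first=True models the old order (remove, then unregister);
    cleanup_first=False models the proposed order (unregister, then remove).
    crash_after_first_step simulates an AM crash or RM restart in between.
    """
    steps = [lambda: staging.remove(log), lambda: rm.unregister(log)]
    if not cleanup_first:
        steps.reverse()
    steps[0]()
    if crash_after_first_step:
        return
    steps[1]()


def job_can_fail(rm, staging):
    """Assumed failure condition: a retry is still expected (the AM is
    registered) but the staging directory it needs has been removed."""
    return rm.registered and not staging.exists


if __name__ == "__main__":
    for cleanup_first in (True, False):
        rm, staging, log = FakeRM(), FakeStagingDir(), []
        shutdown(rm, staging, log, cleanup_first, crash_after_first_step=True)
        print(cleanup_first, log, job_can_fail(rm, staging))
```

In this model, an interruption between the two steps leaves the job exposed only under the old order. With the proposed order the worst case is a leftover staging directory, which would still need to be cleaned up some other way.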