[ https://issues.apache.org/jira/browse/YARN-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17907334#comment-17907334 ]
ASF GitHub Bot commented on YARN-11746:
---------------------------------------

zhengchenyu opened a new pull request, #7239:
URL: https://github.com/apache/hadoop/pull/7239

### Description of PR

I found that the PublicLocalizer thread had exited: the /tmp directory was deleted by mistake, which caused an NPE, which in turn left a Spark job stuck. [YARN-9968](https://issues.apache.org/jira/browse/YARN-9968) has resolved the NPE problem. Even so, I think that when the `PublicLocalizer` thread exits, the NM should be shut down, because there is no point in keeping an abnormal NM running.

### How was this patch tested?

Manual test.

### For code changes:

- [x] When PublicLocalizer is exiting, shut down the NodeManager.

> NodeManager should shutdown when PublicLocalizer is exiting.
> ------------------------------------------------------------
>
>                 Key: YARN-11746
>                 URL: https://issues.apache.org/jira/browse/YARN-11746
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Chenyu Zheng
>            Assignee: Chenyu Zheng
>            Priority: Major
>
> I found that the PublicLocalizer thread had exited: the /tmp directory was
> deleted by mistake, which caused an NPE, which in turn left a Spark job stuck.
> YARN-9968 has resolved the NPE problem. Even so, I think that when the
> `PublicLocalizer` thread exits, the NM should be shut down, because there is
> no point in keeping an abnormal NM running.
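
To illustrate the behavior proposed in this issue, here is a minimal, self-contained Java sketch of a worker thread that notifies its owner whenever it exits, whether normally or because of an exception. It is not the code from the pull request: `CriticalWorker`, `LocalizerShutdownSketch`, and the `onExit` callback are hypothetical names, and the demo only records the exit cause where a real NodeManager would presumably begin its shutdown.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical stand-in for a thread the process cannot run without.
final class CriticalWorker extends Thread {
    private final Runnable body;
    private final Consumer<Throwable> onExit;

    CriticalWorker(String name, Runnable body, Consumer<Throwable> onExit) {
        super(name);
        this.body = body;
        this.onExit = onExit;
    }

    @Override
    public void run() {
        Throwable cause = null;
        try {
            body.run();
        } catch (Throwable t) {
            cause = t; // e.g. an NPE caused by a missing local directory
        }
        // Always report the exit; cause is null if the body returned normally.
        onExit.accept(cause);
    }
}

final class LocalizerShutdownSketch {
    // Runs the body on a CriticalWorker and returns the cause passed to onExit.
    static Throwable runAndAwaitExit(Runnable body) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch exited = new CountDownLatch(1);
        CriticalWorker worker = new CriticalWorker("public-localizer-sketch", body, cause -> {
            // Assumption: a real NM would trigger its own shutdown path here.
            seen.set(cause);
            exited.countDown();
        });
        worker.start();
        try {
            if (!exited.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not exit in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable cause = runAndAwaitExit(() -> {
            throw new NullPointerException("simulated failure");
        });
        System.out.println("worker exited, cause=" + cause);
    }
}
```

Reporting the exit from inside `run()` covers both a clean return and a crash, which matches the issue's argument that any exit of this thread leaves the node in an abnormal state.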