[ 
https://issues.apache.org/jira/browse/YARN-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17907334#comment-17907334
 ] 

ASF GitHub Bot commented on YARN-11746:
---------------------------------------

zhengchenyu opened a new pull request, #7239:
URL: https://github.com/apache/hadoop/pull/7239

   ### Description of PR
   
   I found that `PublicLocalizer` exited because the /tmp directory was
   deleted by mistake. This threw an NPE, which in turn caused a Spark job
   to get stuck.
   
   [YARN-9968](https://issues.apache.org/jira/browse/YARN-9968) has resolved
   the NPE problem. Even so, I think that when the `PublicLocalizer` thread
   exits, the NM should be shut down, because there is no point in keeping
   an abnormal NM running.
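
The idea can be illustrated with a minimal, self-contained sketch. This is not the actual patch and does not use the real YARN classes: `SketchNodeManager`, `SketchPublicLocalizer`, and `requestShutdown` are hypothetical names invented for illustration. The sketch only shows the pattern of requesting a shutdown from a `finally` block, so the request is made however the worker thread exits.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the NodeManager; NOT the real YARN class.
class SketchNodeManager {
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    // Records the shutdown request once; repeated calls are no-ops.
    void requestShutdown(String reason) {
        if (shutdownRequested.compareAndSet(false, true)) {
            System.err.println("Shutdown requested: " + reason);
        }
    }

    boolean isShutdownRequested() {
        return shutdownRequested.get();
    }
}

// Hypothetical stand-in for the PublicLocalizer thread.
class SketchPublicLocalizer extends Thread {
    private final SketchNodeManager nodeManager;
    private final Runnable work;

    SketchPublicLocalizer(SketchNodeManager nodeManager, Runnable work) {
        super("sketch-public-localizer");
        this.nodeManager = nodeManager;
        this.work = work;
    }

    @Override
    public void run() {
        try {
            work.run();
        } catch (RuntimeException e) {
            // For example, an NPE caused by a missing local directory.
            System.err.println("Localizer failed: " + e);
        } finally {
            // Runs on every exit path, normal or exceptional.
            nodeManager.requestShutdown("localizer thread exited");
        }
    }
}

class LocalizerExitDemo {
    public static void main(String[] args) throws InterruptedException {
        SketchNodeManager nm = new SketchNodeManager();
        Runnable failingWork = () -> {
            throw new NullPointerException("simulated failure");
        };
        SketchPublicLocalizer localizer = new SketchPublicLocalizer(nm, failingWork);
        localizer.start();
        localizer.join();
        System.out.println("shutdown requested: " + nm.isShutdownRequested());
    }
}
```

In this sketch the shutdown is only recorded as a flag. How the real NodeManager is notified and stopped is outside what the sketch assumes.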
   
   ### How was this patch tested?
   
   Manual testing.
   
   ### For code changes:
   
   - [x] When `PublicLocalizer` exits, shut down the NodeManager.
   




> NodeManager should shutdown when PublicLocalizer is exiting.
> ------------------------------------------------------------
>
>                 Key: YARN-11746
>                 URL: https://issues.apache.org/jira/browse/YARN-11746
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Chenyu Zheng
>            Assignee: Chenyu Zheng
>            Priority: Major
>
> I found that PublicLocalizer exited because the /tmp directory was
> deleted by mistake. This threw an NPE, which in turn caused a Spark job
> to get stuck.
> YARN-9968 has resolved the NPE problem. Even so, I think that when the
> `Public Localizer` thread exits, the NM should be shut down, because
> there is no point in keeping an abnormal NM running.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
