[ 
https://issues.apache.org/jira/browse/YARN-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598852#comment-16598852
 ] 

Eric Badger commented on YARN-8729:
-----------------------------------

It's been a while since I worked on that patch, but I think the idea was that we 
shouldn't say that the service is running until it's finished starting up. 
Setting isStopped to false before the whole NM startup has finished would 
introduce a race condition where the NM says it's running, but it isn't fully 
up yet. Looking at the code, I'm not sure if there is a functional reason, 
since the statusUpdater thread loops on isStopped until it's false. 

> Node status updater thread could be lost after it restarted
> -----------------------------------------------------------
>
>                 Key: YARN-8729
>                 URL: https://issues.apache.org/jira/browse/YARN-8729
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8729.001.patch
>
>
> Today I found a lost NM whose node status updater thread no longer existed 
> after the thread was restarted. In 
> {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}, the 
> isStopped flag is not set to false before {{statusUpdater.start()}} is 
> executed, so if the thread starts immediately and finds isStopped==true, it 
> will exit without logging anything.
> Key code in 
> {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}:
> {code:java}
>  statusUpdater.join();
>  registerWithRM();
>  statusUpdater = new Thread(statusUpdaterRunnable, "Node Status Updater");
>  statusUpdater.start();
>  this.isStopped = false;   // this line should be moved before statusUpdater.start();
>  LOG.info("NodeStatusUpdater thread is reRegistered and restarted");
> {code}
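
The ordering problem described above can be reproduced in isolation. The following is only a minimal sketch, not the actual NodeStatusUpdaterImpl code: the class UpdaterRestartSketch, its statusLoop and restart methods, the three-heartbeat limit, and the CountDownLatch used to force the unlucky interleaving are all assumptions made for illustration. Only the isStopped flag and the "Node Status Updater" thread name come from the issue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class UpdaterRestartSketch {
    // Stand-in for the isStopped flag; true because the old thread was stopped.
    private volatile boolean isStopped = true;
    private final AtomicInteger heartbeats = new AtomicInteger();
    // Hypothetical helper: lets the caller wait until the new thread has
    // performed its first isStopped check, forcing the worst-case interleaving.
    private final CountDownLatch firstCheckDone = new CountDownLatch(1);

    // Simplified updater body: exits silently if the flag is still set.
    private void statusLoop() {
        boolean stoppedAtEntry = isStopped;
        firstCheckDone.countDown();
        if (stoppedAtEntry) {
            return; // the thread is lost and nothing is logged
        }
        // Bounded at three iterations so the sketch terminates.
        for (int i = 0; i < 3 && !isStopped; i++) {
            heartbeats.incrementAndGet();
        }
    }

    // Restarts the updater and returns how many heartbeats it sent.
    static int restart(boolean clearFlagBeforeStart) {
        UpdaterRestartSketch s = new UpdaterRestartSketch();
        Thread statusUpdater = new Thread(s::statusLoop, "Node Status Updater");
        try {
            if (clearFlagBeforeStart) {
                s.isStopped = false;      // proposed order: flag first
            }
            statusUpdater.start();
            if (!clearFlagBeforeStart) {
                s.firstCheckDone.await(); // thread already saw isStopped == true
                s.isStopped = false;      // original order: too late
            }
            statusUpdater.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.heartbeats.get();
    }

    public static void main(String[] args) {
        System.out.println("flag cleared after start():  " + restart(false)); // 0
        System.out.println("flag cleared before start(): " + restart(true));  // 3
    }
}
```

In the real code the bad interleaving happens only occasionally. The latch here just makes it deterministic so the effect of the two orderings can be compared.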


