[ https://issues.apache.org/jira/browse/YARN-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598852#comment-16598852 ]
Eric Badger commented on YARN-8729:
-----------------------------------

It's been a while since I worked on that patch, but I think the idea was that we shouldn't say the service is running until it has finished starting up. Setting isStopped to false before the whole NM startup has finished would introduce a race condition where the NM says it's running but isn't fully up yet. Looking at the code, I'm not sure there is a functional reason, since the statusUpdater thread loops on isStopped until it's false.

> Node status updater thread could be lost after it restarted
> -----------------------------------------------------------
>
>                 Key: YARN-8729
>                 URL: https://issues.apache.org/jira/browse/YARN-8729
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8729.001.patch
>
>
> Today I found a lost NM whose node status updater thread no longer existed
> after the thread was restarted. In
> {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}, the
> isStopped flag is not set to false before {{statusUpdater.start()}} is
> executed, so if the thread starts immediately and finds isStopped==true, it
> exits without logging anything.
> Key code in
> {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}:
> {code:java}
>     statusUpdater.join();
>     registerWithRM();
>     statusUpdater = new Thread(statusUpdaterRunnable, "Node Status Updater");
>     statusUpdater.start();
>     this.isStopped = false;   //this line should be moved before statusUpdater.start();
>     LOG.info("NodeStatusUpdater thread is reRegistered and restarted");
> {code}
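
For anyone who wants to see the ordering problem in isolation, here is a minimal sketch, not the actual NodeStatusUpdaterImpl code. The class UpdaterRestartSketch, its runLoop() and restart() methods, and the two latches are all hypothetical names introduced for illustration. It assumes the updater loop exits as soon as it observes isStopped==true, as the report describes, and uses a latch to force the "thread runs immediately" interleaving so the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the restart path; not the actual NodeStatusUpdaterImpl.
class UpdaterRestartSketch {
    // Left true by the previous shutdown of the updater thread.
    private volatile boolean isStopped = true;
    // Latches used only to make the thread interleaving deterministic.
    private final CountDownLatch firstCheckDone = new CountDownLatch(1);
    private final CountDownLatch firstHeartbeat = new CountDownLatch(1);

    // Assumed loop shape: exit silently once isStopped is observed as true.
    private void runLoop() {
        while (true) {
            boolean stopped = isStopped;
            firstCheckDone.countDown();
            if (stopped) {
                return;
            }
            firstHeartbeat.countDown(); // stands in for sending a heartbeat
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    // Restarts the updater and reports whether the new thread got past its first check.
    static boolean restart(boolean clearFlagBeforeStart) {
        UpdaterRestartSketch s = new UpdaterRestartSketch();
        Thread statusUpdater = new Thread(s::runLoop, "Node Status Updater");
        statusUpdater.setDaemon(true);
        try {
            if (clearFlagBeforeStart) {
                s.isStopped = false; // proposed ordering: clear the flag first
            }
            statusUpdater.start();
            s.firstCheckDone.await(); // force the "thread runs immediately" case
            s.isStopped = false;      // reported ordering: too late if not cleared above
            boolean survived = s.firstHeartbeat.await(500, TimeUnit.MILLISECONDS);
            s.isStopped = true;       // shut the sketch down cleanly
            statusUpdater.join(1000);
            return survived;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag cleared after start:  survived=" + restart(false));
        System.out.println("flag cleared before start: survived=" + restart(true));
    }
}
```

With the flag cleared only after start(), the new thread sees the stale value and returns before doing any work; with the flag cleared first, it reaches its first heartbeat. Whether clearing the flag earlier has other consequences for NM startup state is a separate question that this sketch does not model.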