[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541881#comment-14541881 ]
Jason Lowe commented on YARN-41:
--------------------------------

In light of NM restart, one of the problems with having the NM check for active applications and then take different actions is that the NM has a significantly delayed view of the cluster relative to the RM. The RM could have decided to assign new containers (and thus new applications) to the node, but the NM hasn't seen the launch request from the AM yet. This has already caused other issues; see the early discussions in YARN-3535, where containers were killed because the node reconnected with no active applications reported and was handled as a node removed/node added sequence.

> The RM should handle the graceful shutdown of the NM.
> -----------------------------------------------------
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, resourcemanager
> Reporter: Ravi Teja Ch N V
> Assignee: Devaraj K
> Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch,
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch,
> YARN-41-4.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM,
> which is shutdown gracefully.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
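
The race described in the comment can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyResourceManager`, `ToyNodeManager`, `handle_reconnect`, and `simulate_reconnect` are hypothetical names invented for the illustration and are not YARN classes or APIs. The model assumes a naive policy in which the RM trusts the NM's report of active applications on reconnect.

```python
# Hypothetical toy model of the stale-view race; not YARN code.
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    # node id -> set of application ids the RM has allocated containers for
    allocated: dict = field(default_factory=dict)

    def allocate(self, node_id, app_id):
        """RM assigns a container to the node; the NM does not know yet."""
        self.allocated.setdefault(node_id, set()).add(app_id)

    def handle_reconnect(self, node_id, reported_apps):
        """Naive policy (an assumption): trust the NM's report.

        An empty report is treated as node removed followed by node added,
        so everything the RM had allocated on that node is lost.
        Returns the application ids whose containers were killed.
        """
        if reported_apps:
            return []
        return sorted(self.allocated.pop(node_id, set()))


@dataclass
class ToyNodeManager:
    node_id: str
    active_apps: set = field(default_factory=set)

    def receive_launch(self, app_id):
        """The AM's launch request finally reaches the NM."""
        self.active_apps.add(app_id)


def simulate_reconnect(launch_delivered):
    rm = ToyResourceManager()
    nm = ToyNodeManager("node-1")
    rm.allocate(nm.node_id, "app-1")      # RM has already decided
    if launch_delivered:
        nm.receive_launch("app-1")        # NM's view has caught up
    # NM restarts and reconnects, reporting the applications it knows about.
    return rm.handle_reconnect(nm.node_id, nm.active_apps)
```

In this model, `simulate_reconnect(launch_delivered=False)` loses the container for `app-1` even though the RM allocated it, because the NM's report lags the RM's decision. With `launch_delivered=True` nothing is killed.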