[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541881#comment-14541881
 ] 

Jason Lowe commented on YARN-41:
--------------------------------

In light of NM restart, one of the problems with having the NM check for active 
applications and then take different actions is that the NM has a significantly 
delayed view of the cluster relative to the RM.  The RM could have decided to 
assign new containers (and thus new applications) to the node, but the NM 
hasn't seen the launch request from the AM yet.  This has already caused other 
issues; see the early discussions in YARN-3535, where containers were killed 
because the node reconnected with no active applications reported and was 
handled as a node removed/node added sequence.
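
To make the timing problem concrete, here is a minimal toy sketch of the window 
described above. It is not Hadoop code: ToyRm, ToyNm and appsInvisibleToNm are 
hypothetical names invented for illustration, and the model assumes the NM 
learns about an application only when a container launch request reaches it.

```java
import java.util.HashSet;
import java.util.Set;

public class StaleViewSketch {

    // Hypothetical stand-in for the RM's scheduling state for a single node.
    static class ToyRm {
        final Set<String> appsAssignedToNode = new HashSet<>();

        // The RM hands a container on this node to the application's AM.
        void assignContainer(String appId) {
            appsAssignedToNode.add(appId);
        }
    }

    // Hypothetical stand-in for the NM, which (in this toy model) learns of
    // an application only when the AM's launch request arrives.
    static class ToyNm {
        final Set<String> appsKnown = new HashSet<>();

        void launchContainer(String appId) {
            appsKnown.add(appId);
        }

        boolean reportsNoActiveApps() {
            return appsKnown.isEmpty();
        }
    }

    // Applications the RM has placed on the node that the NM cannot yet see.
    static Set<String> appsInvisibleToNm(ToyRm rm, ToyNm nm) {
        Set<String> missing = new HashSet<>(rm.appsAssignedToNode);
        missing.removeAll(nm.appsKnown);
        return missing;
    }

    public static void main(String[] args) {
        ToyRm rm = new ToyRm();
        ToyNm nm = new ToyNm();

        // The RM assigns a container, but the AM has not contacted the NM yet.
        rm.assignContainer("app_1");

        // If the NM reconnects now, it honestly reports no active
        // applications, even though the RM has already committed work to it.
        System.out.println("NM reports no active apps: " + nm.reportsNoActiveApps());
        System.out.println("Apps the NM cannot see yet: " + appsInvisibleToNm(rm, nm));
    }
}
```

In this model, any decision the NM makes based on reportsNoActiveApps() during 
that window is made on stale information, which is the concern raised above.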

> The RM should handle the graceful shutdown of the NM.
> -----------------------------------------------------
>
>                 Key: YARN-41
>                 URL: https://issues.apache.org/jira/browse/YARN-41
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Ravi Teja Ch N V
>            Assignee: Devaraj K
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, the RM should remove and handle an NM 
> that is shut down gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
