[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541881#comment-14541881 ]
Jason Lowe commented on YARN-41:
--------------------------------
In light of NM restart, one of the problems with having the NM check for active
applications and then take different actions is that the NM has a significantly
delayed view of the cluster relative to the RM. The RM could have decided to
assign new containers (and thus new applications) to the node, but the NM
hasn't seen the launch request from the AM yet. This has already caused other
issues; see the early discussions in YARN-3535, where containers were killed
because the node reconnected with no active applications reported and was
handled as a node removed/node added sequence.
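
To make the race concrete, here is a toy sketch of the situation described above. It is not YARN code: `ToyRM`, `ToyNM`, `allocate`, `on_reconnect`, and `on_launch_request` are hypothetical names invented for illustration, and the "trust the NM's report" policy is an assumption about the kind of logic that goes wrong, not the actual RM implementation.

```python
# Toy model only: these classes are hypothetical and are not YARN APIs.
from dataclasses import dataclass, field


@dataclass
class ToyRM:
    # Applications the RM has assigned containers for, keyed by node.
    allocated: dict = field(default_factory=dict)

    def allocate(self, node, app):
        self.allocated.setdefault(node, set()).add(app)

    def on_reconnect(self, node, apps_reported_by_nm):
        """Naive policy: trust the NM's own report of active applications."""
        if not apps_reported_by_nm:
            # Handled as node removed + node added, so everything the RM
            # had already assigned to this node is dropped.
            return sorted(self.allocated.pop(node, set()))
        return []


@dataclass
class ToyNM:
    # The NM only learns about an application when an AM launch request arrives.
    active_apps: set = field(default_factory=set)

    def on_launch_request(self, app):
        self.active_apps.add(app)


rm, nm = ToyRM(), ToyNM()
rm.allocate("node1", "app_1")  # the RM has assigned a container ...
# ... but the AM's launch request has not reached the NM yet, so the NM
# reports no active applications when it reconnects.
killed = rm.on_reconnect("node1", nm.active_apps)
print(killed)
```

In this sketch the RM drops `app_1` even though it assigned that container itself, because the decision was keyed off the NM's stale view rather than the RM's own state.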
> The RM should handle the graceful shutdown of the NM.
> -----------------------------------------------------
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, resourcemanager
> Reporter: Ravi Teja Ch N V
> Assignee: Devaraj K
> Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch,
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch,
> YARN-41-4.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, the RM should remove and handle an NM
> that is shut down gracefully.