[jira] [Commented] (YARN-914) (Umbrella) Support graceful decommission of nodemanager

Parvez (JIRA) Thu, 17 Sep 2015 10:05:21 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803230#comment-14803230
 ]


Parvez commented on YARN-914:
-----------------------------

Hi,

I am facing issues when resizing an AWS EMR cluster that is configured with 
Hadoop 2.6.0.

The resize itself works, but when a node that has running containers is 
decommissioned, the entire EMR cluster stops functioning. On a resize request, 
EMR terminates a Task Node (EC2 instance) at random, without checking whether 
it has containers running on it.

I would expect YARN to move the containers and the job from that node to 
another one here, but it does not seem to be doing so.

Could this be related to the issue described here?

Please answer. Thank you. 

> (Umbrella) Support graceful decommission of nodemanager
> -------------------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf, 
> Gracefully Decommission of NodeManager (v2).pdf, 
> GracefullyDecommissionofNodeManagerv3.pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact to running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need 
> to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map outputs have not been fetched by the reducers of the job, these map tasks 
> will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
