[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266692#comment-14266692
 ] 

Ming Ma commented on YARN-914:
------------------------------

Thanks, Junping. The timeout is definitely necessary.

* Sounds like we need a new NM state, "decommission_in_progress", for when 
the NM is draining its containers. Once RM considers the decommission 
complete, the node will be marked "decommissioned".

* To clarify my earlier comment "all its map output are fetched or until all the 
applications the node touches have completed", the question is when YARN can 
declare that a node's state has been gracefully drained and thus the node is 
gracefully decommissioned (admins can shut down the whole machine without any 
impact on jobs). For MR, the state could be running tasks/containers or mapper 
outputs. Say we have a timeout of 30 minutes for decommission. If it takes 3 
minutes to finish the mappers on the node and another 5 minutes for the job to 
finish, then YARN can declare the node gracefully decommissioned after 8 
minutes instead of waiting the full 30 minutes. RM knows all applications on 
any given NM, so once all applications on a given node have completed, RM can 
mark the node "decommissioned".

* Yes, I meant long running services. If YARN just kills the containers upon 
a decommission request, the impact could vary. Some services might not have 
any state to drain, or the services might handle state migration on their 
own without YARN's help. For such services, maybe we can just use 
ResourceOption's timeout: set the timeout to 0 and NM will just kill the 
containers.

* Given we don't plan to have applications checkpoint and migrate state, it 
doesn't seem necessary to have YARN notify applications upon decommission 
requests. Just to call it out.

* It might be useful to have a new state called "decommissioned_timeout", so 
that admins know whether the node was gracefully decommissioned or the timeout 
expired first.
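
To make the proposed state transitions concrete, here is a rough sketch in 
Python. It is a toy model, not YARN code: the NodeTracker class, its method 
names, and the plain minute counters are made up for illustration. Only the 
three state names come from the points above. It assumes RM marks the node 
"decommissioned" as soon as no application remains on it, and 
"decommissioned_timeout" if the timeout expires first.

```python
from enum import Enum


class NodeState(Enum):
    RUNNING = "running"
    DECOMMISSION_IN_PROGRESS = "decommission_in_progress"
    DECOMMISSIONED = "decommissioned"
    DECOMMISSIONED_TIMEOUT = "decommissioned_timeout"


class NodeTracker:
    """Hypothetical RM-side view of a single NM (illustration only)."""

    def __init__(self, active_apps):
        self.state = NodeState.RUNNING
        self.active_apps = set(active_apps)
        self.deadline = None

    def request_decommission(self, now, timeout_minutes):
        # Start draining; a timeout of 0 expires immediately.
        self.deadline = now + timeout_minutes
        self.state = NodeState.DECOMMISSION_IN_PROGRESS
        self._evaluate(now)

    def app_finished(self, app_id, now):
        self.active_apps.discard(app_id)
        self._evaluate(now)

    def tick(self, now):
        # Periodic check, e.g. driven by NM heartbeats (assumption).
        self._evaluate(now)

    def _evaluate(self, now):
        if self.state is not NodeState.DECOMMISSION_IN_PROGRESS:
            return
        if not self.active_apps:
            # Every application on the node has completed: drained early.
            self.state = NodeState.DECOMMISSIONED
        elif now >= self.deadline:
            # Stands in for NM killing the remaining containers.
            self.active_apps.clear()
            self.state = NodeState.DECOMMISSIONED_TIMEOUT


# The 30-minute example: mappers finish at minute 3, the job at minute 8.
node = NodeTracker(active_apps={"job_1"})
node.request_decommission(now=0, timeout_minutes=30)
node.tick(now=3)                   # job still running, keep draining
node.app_finished("job_1", now=8)
print(node.state.value)            # decommissioned
```

With a timeout of 0 and applications still on the node, this sketch goes 
straight to "decommissioned_timeout". Whether that case should end up as 
plain "decommissioned" instead is left open here.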

Thoughts?

> Support graceful decommission of nodemanager
> --------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact to running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need 
> to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map outputs have not been fetched by the reducers of the job, these map tasks 
> will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
