[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135138#comment-15135138 ]
Daniel Zhi commented on YARN-914:
---------------------------------
I have applied and merged my code changes on top of the latest Hadoop trunk
branch (3.0.0-SNAPSHOT), launched a cluster, and verified that graceful
decommission works as expected. As suggested, I created a sub-JIRA with a doc
that describes the design, along with the patch on top of the latest trunk.
> (Umbrella) Support graceful decommission of nodemanager
> -------------------------------------------------------
>
> Key: YARN-914
> URL: https://issues.apache.org/jira/browse/YARN-914
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: graceful
> Affects Versions: 2.0.4-alpha
> Reporter: Luke Lu
> Assignee: Junping Du
> Attachments: Gracefully Decommission of NodeManager (v1).pdf,
> Gracefully Decommission of NodeManager (v2).pdf,
> GracefullyDecommissionofNodeManagerv3.pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.),
> it's desirable to minimize the impact to running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need to
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their
> map outputs have not been fetched by the job's reducers, these map tasks will
> need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a
> node manager.