[ 
https://issues.apache.org/jira/browse/YARN-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820113#comment-13820113
 ] 

Junping Du commented on YARN-558:
---------------------------------

+1 on the use case in cloud services. I think one feasible (although not 
convenient) way to achieve this now is:
- first, decommission the nodes by putting them on the decommission list and 
calling refreshNodes().
- then, wait for at least one heartbeat() from each node to make sure the 
decommissioned nodes are cleared.
- finally, remove the nodes from the decommission list and call 
refreshNodes() again.
We do need something simpler.
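
For illustration, the three steps above could be scripted roughly as follows. 
This is only a sketch under a few assumptions: the exclude file is the one 
configured via yarn.resourcemanager.nodes.exclude-path, refreshNodes() is 
triggered through the "yarn rmadmin -refreshNodes" CLI, and a fixed sleep 
stands in for "at least one heartbeat from each node". The helper names 
(refresh_nodes, purge_nodes) and their parameters are hypothetical and not 
part of YARN.

```python
import subprocess
import time
from pathlib import Path


def refresh_nodes(cmd=("yarn", "rmadmin", "-refreshNodes")):
    """Ask the ResourceManager to re-read its include/exclude lists."""
    subprocess.run(list(cmd), check=True)


def purge_nodes(exclude_file, hosts, wait_seconds,
                refresh=refresh_nodes, sleep=time.sleep):
    """Hypothetical helper: decommission `hosts`, wait, then drop them
    from the exclude file again so the list does not grow without bound."""
    path = Path(exclude_file)
    existing = path.read_text().split() if path.exists() else []

    # Step 1: put the hosts on the decommission (exclude) list and refresh.
    excluded = existing + [h for h in hosts if h not in existing]
    path.write_text("".join(h + "\n" for h in excluded))
    refresh()

    # Step 2: wait long enough for every node to heartbeat at least once.
    # Assumption: the caller picks a value larger than the NM heartbeat
    # interval; this sketch does not verify that the nodes are cleared.
    sleep(wait_seconds)

    # Step 3: take the hosts off the list again and refresh once more.
    remaining = [h for h in existing if h not in hosts]
    path.write_text("".join(h + "\n" for h in remaining))
    refresh()
```

The refresh and sleep callables are parameters only so the sequence can be 
dry-run without a live cluster. A call such as 
purge_nodes("/path/to/excludes", ["node-a"], wait_seconds=5) would still 
need to run on a host with admin access to the RM.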

> Add ability to completely remove nodemanager from resourcemanager.
> ------------------------------------------------------------------
>
>                 Key: YARN-558
>                 URL: https://issues.apache.org/jira/browse/YARN-558
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Garth Goodson
>            Priority: Minor
>              Labels: feature
>
> I would like to add the ability to completely remove a nodemanager from the 
> resourcemanager's state.
> I run a cloud service where I want to dynamically bring up nodes to act as 
> nodemanagers and then bring them down again when not needed.  These nodes 
> have dynamically assigned IPs, thus the alternative of decommissioning them 
> via an excludes file leads to a large (unbounded) list of decommissioned 
> nodes that may never be commissioned again. I would like the ability to move 
> a node from a decommissioned state to completely removing it from the 
> resource manager.
> I have thought of two ways of implementing this.
> 1) Add an optional timeout between the decommission state -> being removed 
> from the resourcemanager.
> 2) Add an explicit RPC to remove a node that is decommissioned.
> Any additional thoughts/discussion are welcome.



