[ 
https://issues.apache.org/jira/browse/YARN-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900727#comment-16900727
 ] 

Zac Zhou commented on YARN-9721:
--------------------------------

 

[~sunilg]

Thanks a lot for your comments~

Maybe I could use one of the following methods to clean up the inactive list:
 # Add a parameter like "--prune-nodes" to the command "rmadmin 
-refreshNodes", and add a flag named something like "prunable" to RMNodes. When 
"rmadmin -refreshNodes --prune-nodes" is executed, prunable of the RMNodes should 
be set to true, and the RMNodes will be deleted by removalTimer.
 # Add a time period parameter to the yarn configuration. If an RMNode stays in the 
inactive list longer than that time period, delete the RMNode.
 # Add a boolean parameter to the yarn configuration. If the parameter is true, delete 
RMNodes from the inactive list directly.
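
To make the three options easier to compare, here is a rough sketch in plain Java. It is not Hadoop code: InactiveNodePruner, InactiveNode, markPrunable, retentionMs and removeDirectly are hypothetical names used only for illustration, and the map merely stands in for rmContext.getInactiveRMNodes(). The prune() method plays the role that removalTimer would play.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the RM's inactive-node bookkeeping; not Hadoop code. */
public class InactiveNodePruner {

    /** Hypothetical per-node record kept while a node is inactive. */
    static final class InactiveNode {
        final long inactiveSinceMs;  // time the node entered the inactive list
        volatile boolean prunable;   // option 1: set when "--prune-nodes" is given

        InactiveNode(long inactiveSinceMs) {
            this.inactiveSinceMs = inactiveSinceMs;
        }
    }

    private final Map<String, InactiveNode> inactiveNodes = new ConcurrentHashMap<>();
    private final long retentionMs;        // option 2: assumed config value, negative disables it
    private final boolean removeDirectly;  // option 3: assumed config value

    public InactiveNodePruner(long retentionMs, boolean removeDirectly) {
        this.retentionMs = retentionMs;
        this.removeDirectly = removeDirectly;
    }

    /** Called when a node leaves the active set. */
    public void markInactive(String host, long nowMs) {
        if (removeDirectly) {
            return; // option 3: the node is never kept in the inactive list
        }
        inactiveNodes.put(host, new InactiveNode(nowMs));
    }

    /** Option 1: flag a node so that the next prune() pass removes it. */
    public void markPrunable(String host) {
        InactiveNode node = inactiveNodes.get(host);
        if (node != null) {
            node.prunable = true;
        }
    }

    /** One timer pass: removes flagged or expired nodes and returns how many were removed. */
    public int prune(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<String, InactiveNode>> it = inactiveNodes.entrySet().iterator();
        while (it.hasNext()) {
            InactiveNode node = it.next().getValue();
            boolean expired = retentionMs >= 0 && nowMs - node.inactiveSinceMs >= retentionMs;
            if (node.prunable || expired) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public int size() {
        return inactiveNodes.size();
    }
}
```

Option 1 needs an explicit admin action per cleanup, option 2 bounds the size of the list without operator involvement, and option 3 is the simplest but loses the Decommissioned Nodes history entirely.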

[~sunilg], [~leftnoteasy], [~cheersyang], [~tangzhankun] Any ideas?

 

> An easy method to exclude a nodemanager from the yarn cluster cleanly
> ---------------------------------------------------------------------
>
>                 Key: YARN-9721
>                 URL: https://issues.apache.org/jira/browse/YARN-9721
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zac Zhou
>            Priority: Major
>         Attachments: decommission nodes.png
>
>
> If we want to take a nodemanager server offline, nodes.exclude-path
>  and the "rmadmin -refreshNodes" command are used to decommission the server.
>  But this method cannot clean up the node cleanly. Nodemanager servers are 
> still listed under Decommissioned Nodes, as the attachment shows.
>   !decommission nodes.png!
> YARN-4311 enabled a removalTimer to clean up untracked nodes.
>  But the logic of the isUntrackedNode method is too restrictive. If include-path is 
> not used, no servers can meet the criteria. Using an include file would 
> introduce a potential maintenance risk.
> If a yarn cluster is installed on cloud, nodemanager servers are created and 
> deleted frequently. We need a way to exclude a nodemanager from the yarn 
> cluster cleanly. Otherwise, the map of rmContext.getInactiveRMNodes() would 
> keep growing, which would cause a memory issue in the RM.
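
The restriction described above can be illustrated with a simplified paraphrase of the check. This is an assumption about the shape of the logic, not a copy of the Hadoop source: UntrackedNodeCheck and isUntracked are hypothetical names, and hostname-to-IP resolution is left out. A node counts as untracked only when an include list exists and the node appears in neither list, so an empty include list means nothing ever qualifies.

```java
import java.util.Set;

/** Simplified, hypothetical paraphrase of an "untracked node" check; not Hadoop code. */
public class UntrackedNodeCheck {

    /**
     * Returns true only if an include list is configured (non-empty) and the
     * host is in neither the include list nor the exclude list.
     */
    public static boolean isUntracked(String host, Set<String> included, Set<String> excluded) {
        return !included.isEmpty()
            && !included.contains(host)
            && !excluded.contains(host);
    }
}
```

With this shape, a cluster that relies only on nodes.exclude-path never has an untracked node, so removalTimer has nothing to remove.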



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
