[ 
https://issues.apache.org/jira/browse/YARN-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030777#comment-18030777
 ] 

ASF GitHub Bot commented on YARN-11421:
---------------------------------------

github-actions[bot] closed pull request #5905: [YARN-11421] Graceful 
Decommission ignores launched containers and gets deactivated before timeout
URL: https://github.com/apache/hadoop/pull/5905




> Graceful Decommission ignores launched containers and gets deactivated before 
> timeout
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-11421
>                 URL: https://issues.apache.org/jira/browse/YARN-11421
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.2.1, 3.3.1, 3.3.4
>            Reporter: Abhishek Dixit
>            Priority: Major
>              Labels: pull-request-available
>
> During graceful decommission, a node gets deactivated before the timeout 
> even though there are launched containers on that node.
> We have observed cases where the graceful decommission signal is sent to a 
> node at the same time as containers are launched on its NodeManager. In such 
> cases the ResourceManager moves the node from the Decommissioning state to 
> the Decommissioned state, because launched containers are not checked in 
> DecommissioningNodesWatcher.
> We suggest waiting for 
> yarn.resourcemanager.decommissioning-nodes-watcher.delay-ms to elapse 
> before marking the node ready to be decommissioned. There is no delay if it 
> is set to 0. The expire interval should not be configured higher than 
> RM_AM_EXPIRY_INTERVAL_MS.
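
The proposal quoted above could be sketched roughly as follows. This is a minimal illustration only, not the code from pull request #5905 and not the real DecommissioningNodesWatcher: the class DecommissionDelaySketch, its Status enum, and both method names are hypothetical. It also assumes that the "expire interval" in the description refers to the new delay, and that an oversized value is capped rather than rejected.

```java
/**
 * Hypothetical sketch of the delay check described in YARN-11421.
 * None of these names come from Hadoop; they exist only for illustration.
 */
public class DecommissionDelaySketch {

    /** Hypothetical states; the real watcher defines its own status values. */
    enum Status { WAIT_CONTAINER, WAIT_DELAY, READY }

    /**
     * Delay actually applied before a node may be marked ready.
     * Assumption: a value of 0 (or less) disables the delay, and a value
     * above the AM expiry interval is capped at that interval.
     */
    static long effectiveDelayMs(long configuredDelayMs, long amExpiryMs) {
        if (configuredDelayMs <= 0) {
            return 0L;
        }
        return Math.min(configuredDelayMs, amExpiryMs);
    }

    /**
     * Decide whether a decommissioning node is ready.
     *
     * @param knownContainers   containers the RM currently knows about on the node
     * @param elapsedMs         time since the node entered the decommissioning state
     * @param configuredDelayMs value of the proposed delay-ms setting
     * @param amExpiryMs        value of RM_AM_EXPIRY_INTERVAL_MS
     */
    static Status check(int knownContainers, long elapsedMs,
                        long configuredDelayMs, long amExpiryMs) {
        if (knownContainers > 0) {
            // Containers are still known on the node, so keep waiting.
            return Status.WAIT_CONTAINER;
        }
        if (elapsedMs < effectiveDelayMs(configuredDelayMs, amExpiryMs)) {
            // No containers are known yet, but a launch may still be in
            // flight; hold the node until the delay has elapsed.
            return Status.WAIT_DELAY;
        }
        return Status.READY;
    }
}
```

With a delay of 20000 ms, a node that reports no containers 10000 ms after the decommission signal stays in WAIT_DELAY instead of being marked ready, which leaves time for a concurrently launched container to be reported. With the delay set to 0, the same node is ready immediately.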



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
