[
https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970078#comment-14970078
]
Wangda Tan commented on YARN-3223:
----------------------------------
[~brookz],
Thanks for working on this JIRA. I took a look at the patch; some comments:
I think the general approach is fine: while a node is in the decommissioning
state, SchedulerNode keeps setting its total resource to its used resource, so
its available resource will be 0.
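To make the invariant concrete, here is a minimal sketch of the idea, not the
patch itself. The {{Node}} class and its fields are hypothetical stand-ins (they
are not the real SchedulerNode API), and only memory is tracked for brevity.

```java
// Hypothetical sketch: pin total to used while a node is decommissioning,
// so that available (total - used) stays at 0. Not the real SchedulerNode.
class DecommissioningNodeSketch {

    static final class Node {
        long totalMb;
        long usedMb;
        boolean decommissioning;

        Node(long totalMb) {
            this.totalMb = totalMb;
        }

        long availableMb() {
            return totalMb - usedMb;
        }

        // Re-applied after every change to usedMb while decommissioning.
        private void syncTotalIfDecommissioning() {
            if (decommissioning) {
                totalMb = usedMb;
            }
        }

        void startDecommission() {
            decommissioning = true;
            syncTotalIfDecommissioning();
        }

        void allocate(long mb) {
            if (mb > availableMb()) {
                throw new IllegalStateException("not enough available resource");
            }
            usedMb += mb;
            syncTotalIfDecommissioning();
        }

        void release(long mb) {
            usedMb -= mb;
            syncTotalIfDecommissioning();
        }
    }

    public static void main(String[] args) {
        Node node = new Node(8192);
        node.allocate(3072);
        node.startDecommission();
        System.out.println("available after decommission start: " + node.availableMb());
        node.release(1024); // a container finishes; total shrinks with used
        System.out.println("total=" + node.totalMb + " used=" + node.usedMb
                + " available=" + node.availableMb());
    }
}
```

In this sketch any new allocation on a decommissioning node fails because
available is already 0, while container completion only shrinks total and used
together.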
But I think there are some other places that need attention:
- I suggest using {{CapacityScheduler#updateNodeAndQueueResource}} to update
resources, since we need to update the queue's resource and the cluster metrics
as well.
- When async scheduling is enabled, we need to make sure a decommissioning node's
total resource is updated so that no new containers are allocated on these nodes.
And after this patch, I think we need to add the total resource of decommissioning
nodes to the cluster metrics (equal to sum(decommissioning-node.used-resource)).
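The proposed metric could be computed roughly as below. This is only a sketch:
{{NodeUsage}} and {{decommissioningUsedMb}} are hypothetical names, not existing
YARN classes or metrics, and only memory is summed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the suggested cluster metric:
// sum(decommissioning-node.used-resource). Names are illustrative only.
class DecommissioningMetricsSketch {

    static final class NodeUsage {
        final String host;
        final boolean decommissioning;
        final long usedMb;

        NodeUsage(String host, boolean decommissioning, long usedMb) {
            this.host = host;
            this.decommissioning = decommissioning;
            this.usedMb = usedMb;
        }
    }

    // Sums used memory over nodes that are currently decommissioning.
    static long decommissioningUsedMb(List<NodeUsage> nodes) {
        long sum = 0;
        for (NodeUsage node : nodes) {
            if (node.decommissioning) {
                sum += node.usedMb;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<NodeUsage> nodes = Arrays.asList(
                new NodeUsage("nm1", true, 2048),
                new NodeUsage("nm2", false, 4096),
                new NodeUsage("nm3", true, 1024));
        System.out.println("decommissioning used MB: " + decommissioningUsedMb(nodes));
    }
}
```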
> Resource update during NM graceful decommission
> -----------------------------------------------
>
> Key: YARN-3223
> URL: https://issues.apache.org/jira/browse/YARN-3223
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Affects Versions: 2.7.1
> Reporter: Junping Du
> Assignee: Brook Zhou
> Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch
>
>
> During NM graceful decommission, we should handle resource updates properly,
> including: make RMNode keep track of the old resource for possible rollback,
> keep available resource at 0, and update used resource when
> containers finish.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)