[
https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068420#comment-15068420
]
Junping Du commented on YARN-3223:
----------------------------------
Hi [~brookz], thanks for updating the patch. The current approach sounds OK to
me. The one issue is that there is a time window between completedContainer()
and the RMNodeResourceUpdateEvent being handled. If a scheduling pass happens
within this window, a new container could still get allocated on this node.
Even worse, if a scheduling pass happens after the RMNodeResourceUpdateEvent is
sent out but before it has propagated to SchedulerNode, the total resource will
end up lower than the used resource and the available resource will be
negative.
IMO, a safer way is this: in addition to your existing RMNodeResourceUpdateEvent
update, in completedContainer() for decommissioning nodes we can hold off on
adding back availableResource in SchedulerNode, but continue to deduct
usedResource. At this moment, SchedulerNode's total resource will be higher
than usedResource + availableResource, but that will soon be corrected once the
RMNodeResourceUpdateEvent arrives. How does this sound?
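To make the two cases concrete, here is a toy Java sketch. It is not the actual
SchedulerNode code: the class ToySchedulerNode and all of its methods are made
up for illustration, resources are reduced to a single memory figure, and it
assumes the resource update recomputes available as total minus used.

```java
// Toy model for illustration only; this is NOT Hadoop's SchedulerNode.
// All names are hypothetical and resources are a single memory figure in MB.
final class ToySchedulerNode {
    private long totalMb;
    private long usedMb;
    private long availableMb;
    private boolean decommissioning;

    ToySchedulerNode(long totalMb) {
        this.totalMb = totalMb;
        this.availableMb = totalMb;
    }

    void setDecommissioning(boolean decommissioning) {
        this.decommissioning = decommissioning;
    }

    void allocate(long mb) {
        usedMb += mb;
        availableMb -= mb;
    }

    // Proposed handling: always deduct used, but for a decommissioning node
    // do not give the freed amount back to available.
    void completedContainer(long mb) {
        usedMb -= mb;
        if (!decommissioning) {
            availableMb += mb;
        }
    }

    // Stand-in for the resource update event finally reaching the node.
    // Assumption: available is recomputed as total minus used.
    void applyResourceUpdate(long newTotalMb) {
        totalMb = newTotalMb;
        availableMb = totalMb - usedMb;
    }

    long getTotalMb() { return totalMb; }
    long getUsedMb() { return usedMb; }
    long getAvailableMb() { return availableMb; }

    public static void main(String[] args) {
        // Without the hold-back: a scheduling pass sneaks into the window.
        ToySchedulerNode naive = new ToySchedulerNode(8192);
        naive.allocate(4096);
        naive.allocate(4096);
        naive.completedContainer(4096);   // freed memory becomes available again
        naive.allocate(4096);             // scheduling pass inside the window
        naive.applyResourceUpdate(4096);  // stale total computed at completion time
        System.out.println("naive available = " + naive.getAvailableMb());

        // With the hold-back: nothing is available during the window.
        ToySchedulerNode guarded = new ToySchedulerNode(8192);
        guarded.allocate(4096);
        guarded.allocate(4096);
        guarded.setDecommissioning(true);
        guarded.completedContainer(4096); // used drops, available stays put
        System.out.println("guarded available in window = " + guarded.getAvailableMb());
        guarded.applyResourceUpdate(guarded.getUsedMb());
        System.out.println("guarded available after update = " + guarded.getAvailableMb());
    }
}
```

In the first run the late update leaves available negative, which is the bad
state described above. In the second run the node never advertises free memory
during the window, so a scheduling pass has nothing to allocate.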
> Resource update during NM graceful decommission
> -----------------------------------------------
>
> Key: YARN-3223
> URL: https://issues.apache.org/jira/browse/YARN-3223
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: graceful, nodemanager, resourcemanager
> Affects Versions: 2.7.1
> Reporter: Junping Du
> Assignee: Brook Zhou
> Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch,
> YARN-3223-v2.patch, YARN-3223-v3.patch
>
>
> During NM graceful decommission, we should handle resource updates properly,
> including: making RMNode keep track of the old resource for possible rollback,
> keeping the available resource at 0, and updating the used resource when a
> container finishes.