[ https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068420#comment-15068420 ]
Junping Du commented on YARN-3223:
----------------------------------

Hi [~brookz], thanks for updating the patch. The current approach sounds OK to me. The only issue is that there is a time window between completedContainer() and the handling of RMNodeResourceUpdateEvent. If a scheduling attempt happens within this window, a new container could still get allocated on this node. Even worse, if a scheduling attempt happens after RMNodeResourceUpdateEvent is sent out but before it has propagated to SchedulerNode, the total resource will end up lower than the used resource and the available resource will be negative.

IMO, a safer way is: in addition to your existing RMNodeResourceUpdateEvent update, in completedContainer() for decommissioning nodes we can hold off on adding back availableResource in SchedulerNode but continue to deduct usedResource. At that moment, SchedulerNode's total resource will not match usedResource + availableResource, but this will be corrected soon after the RMNodeResourceUpdateEvent arrives. How does this sound?

> Resource update during NM graceful decommission
> -----------------------------------------------
>
>                 Key: YARN-3223
>                 URL: https://issues.apache.org/jira/browse/YARN-3223
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: graceful, nodemanager, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Junping Du
>            Assignee: Brook Zhou
>         Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch,
> YARN-3223-v2.patch, YARN-3223-v3.patch
>
>
> During NM graceful decommission, we should handle resource updates properly,
> including: make RMNode keep track of the old resource for possible rollback,
> keep the available resource at 0, and update the used resource when a
> container finishes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
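
For illustration only, the handling proposed in the comment above (deduct the used resource on container completion, but do not give the freed amount back to the available resource while the node is decommissioning) can be sketched with a toy model. This is not the Hadoop YARN API and not the code in the attached patches: the class SchedulerNodeSketch, its methods startDecommissioning() and applyResourceUpdate(), and the use of memory in MB as the only resource are all assumptions made for this sketch. Only the name completedContainer() is borrowed from the discussion.

```java
// Toy model only: hypothetical class, NOT the real YARN SchedulerNode.
class SchedulerNodeSketch {
    private long totalMb;
    private long usedMb;
    private long availableMb;
    private boolean decommissioning;

    SchedulerNodeSketch(long totalMb) {
        this.totalMb = totalMb;
        this.availableMb = totalMb;
    }

    // Scheduler side: succeeds only if the request fits in availableMb.
    boolean allocate(long mb) {
        if (mb > availableMb) {
            return false;
        }
        usedMb += mb;
        availableMb -= mb;
        return true;
    }

    // Assumed entry point for graceful decommission: shrink total to what is
    // in use so that nothing new can be placed on the node.
    void startDecommissioning() {
        decommissioning = true;
        totalMb = usedMb;
        availableMb = 0;
    }

    // Proposed handling: always deduct usedMb, but while decommissioning do
    // not add the freed amount back to availableMb.
    void completedContainer(long mb) {
        usedMb -= mb;
        if (!decommissioning) {
            availableMb += mb;
        }
    }

    // Stand-in for the later arrival of the resource update event.
    void applyResourceUpdate(long newTotalMb) {
        totalMb = newTotalMb;
        availableMb = totalMb - usedMb;
    }

    long getTotalMb() { return totalMb; }
    long getUsedMb() { return usedMb; }
    long getAvailableMb() { return availableMb; }

    public static void main(String[] args) {
        SchedulerNodeSketch node = new SchedulerNodeSketch(8192);
        node.allocate(2048);
        node.allocate(4096);
        node.startDecommissioning();
        node.completedContainer(2048);
        // Window before the update event: a scheduling attempt is refused.
        System.out.println("allocated in window: " + node.allocate(1024));
        node.applyResourceUpdate(node.getUsedMb());
        System.out.println("total=" + node.getTotalMb()
                + " used=" + node.getUsedMb()
                + " available=" + node.getAvailableMb());
    }
}
```

In this model the window between completedContainer() and the resource update leaves total, used, and available temporarily inconsistent, but because the available amount never rises above 0, a scheduling attempt in that window cannot place a new container on the node.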