Junping Du commented on YARN-3223:

Hi [~brookz], thanks for updating the patch. The current approach sounds OK to 
me. The only issue is that there is a time window between completedContainer() 
and the handling of RMNodeResourceUpdateEvent. If a scheduling attempt happens 
within this window, a new container could still get allocated on this node. 
Even worse, if a scheduling attempt happens after RMNodeResourceUpdateEvent is 
sent out but before it has propagated to SchedulerNode, the total resource will 
be lower than the used resource and the available resource will be negative.
IMO, a safer way is: besides your existing RMNodeResourceUpdateEvent update, in 
completedContainer() for decommissioning nodes, we can hold off on adding back 
availableResource in SchedulerNode, but continue to deduct usedResource. At 
this moment, SchedulerNode's total resource will be higher than usedResource + 
availableResource, but that will soon be corrected once 
RMNodeResourceUpdateEvent arrives. How does this sound?
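
To make the proposal concrete, here is a rough sketch of the accounting I have 
in mind. It is a simplified, hypothetical model and not the real SchedulerNode 
code: the class DecommissionSketch, the nested Node class, and all of its 
methods are made up for illustration, and resources are reduced to a single int 
(say, memory in GB). It also assumes that a resource update sets total and 
recomputes available as total minus used.

```java
// Hypothetical, simplified model of the proposal; NOT the real YARN API.
class DecommissionSketch {

    /** Tracks one resource dimension (e.g. memory in GB) for a single node. */
    static class Node {
        private int total;
        private int used;
        private int available;
        private boolean decommissioning;

        Node(int total) {
            this.total = total;
            this.available = total;
        }

        /** Allocates a container only if enough available resource is left. */
        boolean allocate(int size) {
            if (size > available) {
                return false;
            }
            used += size;
            available -= size;
            return true;
        }

        void markDecommissioning() {
            decommissioning = true;
        }

        /**
         * Always deducts used resource. For a decommissioning node the freed
         * amount is NOT added back to available, so nothing new can be
         * scheduled before the resource update event arrives.
         */
        void completedContainer(int size) {
            used -= size;
            if (!decommissioning) {
                available += size;
            }
        }

        /** Stands in for handling RMNodeResourceUpdateEvent (assumed logic). */
        void applyResourceUpdate(int newTotal) {
            total = newTotal;
            available = newTotal - used;
        }

        int total() { return total; }
        int used() { return used; }
        int available() { return available; }
    }

    public static void main(String[] args) {
        Node node = new Node(8);
        node.allocate(4);
        node.allocate(2);

        // Decommissioning starts: shrink total to what is currently used.
        node.markDecommissioning();
        node.applyResourceUpdate(node.used());

        // A container finishes before the next update event is handled.
        node.completedContainer(2);
        System.out.println("total=" + node.total() + " used=" + node.used()
            + " available=" + node.available());

        // A scheduling attempt inside the window is rejected.
        System.out.println("allocated in window: " + node.allocate(1));

        // The update event arrives and total catches up with used.
        node.applyResourceUpdate(node.used());
        System.out.println("total=" + node.total() + " used=" + node.used()
            + " available=" + node.available());
    }
}
```

In this model available never goes above 0 or below 0 on a decommissioning 
node, at the cost of total being briefly out of sync with used + available.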

> Resource update during NM graceful decommission
> -----------------------------------------------
>                 Key: YARN-3223
>                 URL: https://issues.apache.org/jira/browse/YARN-3223
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: graceful, nodemanager, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Junping Du
>            Assignee: Brook Zhou
>         Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch, 
> YARN-3223-v2.patch, YARN-3223-v3.patch
> During NM graceful decommission, we should handle resource updates properly, 
> including: make RMNode keep track of the old resource for possible rollback, 
> keep available resource at 0, and update used resource when a
> container finishes.

This message was sent by Atlassian JIRA