Brook Zhou commented on YARN-3223:

Thanks [~djp] for the feedback. 

The scenarios you mentioned are indeed problematic. I think the proposal would 
require changes to SchedulerNode and add more complexity there. It could mean 
too much overhead from maintaining extra variables, and it still would not 
solve the issues entirely, because the system remains only eventually 
consistent. 

Since CapacityScheduler.nodeUpdate is already synchronized, if we stop using 
the asynchronous RMNodeResourceUpdateEvent and instead directly modify the 
decommissioning SchedulerNode using updateNodeAndQueueResource, we can 
guarantee the SchedulerNode's consistency. 
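
To illustrate the idea, here is a rough, self-contained sketch. It is not the 
actual YARN code: ToyNode and ToyScheduler are made-up stand-ins, and the 
method bodies are assumptions that only borrow the names nodeUpdate and 
updateNodeAndQueueResource. The point it tries to show is that when the total 
is adjusted inside the same synchronized call that processes the heartbeat, 
no other thread can observe the node between the "used" change and the 
"total" change. 

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a scheduler-side node record (not the real SchedulerNode).
final class ToyNode {
    int totalMb;
    int usedMb;

    ToyNode(int totalMb, int usedMb) {
        this.totalMb = totalMb;
        this.usedMb = usedMb;
    }

    int availableMb() {
        return totalMb - usedMb;
    }
}

// Hypothetical stand-in for the scheduler; every public method holds the same lock.
final class ToyScheduler {
    private final Map<String, ToyNode> nodes = new HashMap<>();
    private int clusterTotalMb;

    synchronized void addNode(String id, int totalMb, int usedMb) {
        nodes.put(id, new ToyNode(totalMb, usedMb));
        clusterTotalMb += totalMb;
    }

    // Heartbeat processing. Because this method is synchronized, the resource
    // adjustment below happens atomically with the container-release bookkeeping.
    synchronized void nodeUpdate(String id, int releasedMb, boolean decommissioning) {
        ToyNode node = nodes.get(id);
        if (node == null) {
            return; // unknown node: nothing to do in this sketch
        }
        node.usedMb -= releasedMb;
        if (decommissioning) {
            // Shrink total to what is in use, so available stays at 0.
            updateNodeAndQueueResource(node, node.usedMb);
        }
    }

    // Only called while the scheduler lock is held (assumption of this sketch).
    private void updateNodeAndQueueResource(ToyNode node, int newTotalMb) {
        clusterTotalMb += newTotalMb - node.totalMb;
        node.totalMb = newTotalMb;
    }

    synchronized int availableMb(String id) {
        return nodes.get(id).availableMb();
    }

    synchronized int clusterTotalMb() {
        return clusterTotalMb;
    }
}
```

With an asynchronous event, the "total" change would instead be applied some 
time after the heartbeat was processed, which is the window where the 
scheduler could see a stale available amount. 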

> Resource update during NM graceful decommission
> -----------------------------------------------
>                 Key: YARN-3223
>                 URL: https://issues.apache.org/jira/browse/YARN-3223
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: graceful, nodemanager, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Junping Du
>            Assignee: Brook Zhou
>         Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch, 
> YARN-3223-v2.patch, YARN-3223-v3.patch
> During NM graceful decommission, we should handle resource updates properly, 
> including: make RMNode keep track of the old resource for possible rollback, 
> keep available resource at 0, and update used resource when a container 
> finishes.

This message was sent by Atlassian JIRA
