Brook Zhou commented on YARN-3223:

Thanks [~leftnoteasy] and [~djp] for the review.

bq. Suggest to use CapacityScheduler#updateNodeAndQueueResource to update 
resources, we need to update queue's resource, cluster metrics as well.
That makes sense. I'm currently setting SchedulerNode's usedResource equal 
to totalResource, and keeping totalResource the same. If we use that function, 
it means totalResource should be set equal to usedResource, and on recommission 
we should just revert to the original totalResource? I like your way 

bq. When async scheduling is enabled, we need to make sure the decommissioning 
node's total resource is updated so no new container will be allocated on these nodes.
Even if async scheduling is enabled, we will update the total resource on the 
NODE_UPDATE event to equal the current usedResource, so the async scheduling 
thread will not allocate containers to the node.
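
To make the idea concrete, here is a minimal, self-contained sketch of the bookkeeping I have in mind. It is not the actual SchedulerNode or CapacityScheduler code: the class NodeResourceSketch, its method names, and the use of plain megabyte integers instead of Resource objects are all assumptions made for illustration.

```java
// Hypothetical model only; none of these names exist in YARN.
final class NodeResourceSketch {
    // Original capacity, kept so a recommission can roll back to it.
    private final int originalTotalMb;
    private int totalMb;
    private int usedMb;
    private boolean decommissioning;

    NodeResourceSketch(int totalMb) {
        this.originalTotalMb = totalMb;
        this.totalMb = totalMb;
    }

    int totalMb() { return totalMb; }
    int usedMb() { return usedMb; }
    int availableMb() { return totalMb - usedMb; }

    // Any scheduling thread (sync or async) would go through this check,
    // so a node with zero available resource receives no new containers.
    boolean allocate(int mb) {
        if (mb <= 0 || mb > availableMb()) {
            return false;
        }
        usedMb += mb;
        return true;
    }

    // Stand-in for handling a node update: while decommissioning,
    // clamp total to used so that available stays at 0.
    void onNodeUpdate() {
        if (decommissioning) {
            totalMb = usedMb;
        }
    }

    void startDecommission() {
        decommissioning = true;
        onNodeUpdate();
    }

    // A finished container lowers used; total follows it down
    // while the node is still decommissioning.
    void release(int mb) {
        usedMb = Math.max(0, usedMb - mb);
        onNodeUpdate();
    }

    // Roll back to the capacity the node had before decommissioning.
    void recommission() {
        decommissioning = false;
        totalMb = originalTotalMb;
    }
}
```

With this shape, the node's available resource is 0 for the whole decommissioning window, and a recommission only needs the saved original total to undo the change.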

bq.  RMNode itself (RMNode.getState()) already includes the necessary info, 
so the boolean parameter sounds redundant
Agreed. I will let the scheduler decide the current state directly using that 

bq.  I think we need separated test case to cover resource update during NM 
Yes, that is definitely going to be added. I just wanted to see if my general 
ideas were okay with the community. Thanks!

> Resource update during NM graceful decommission
> -----------------------------------------------
>                 Key: YARN-3223
>                 URL: https://issues.apache.org/jira/browse/YARN-3223
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Junping Du
>            Assignee: Brook Zhou
>         Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch
> During NM graceful decommission, we should handle resource updates properly, 
> including: make RMNode keep track of the old resource for possible rollback, 
> keep available resource at 0, and update used resource when a container 
> finishes.
