Tao Yang created YARN-9432:
------------------------------

             Summary: Excess reserved containers may exist for a long time 
after its request has been cancelled or satisfied when multi-nodes enabled
                 Key: YARN-9432
                 URL: https://issues.apache.org/jira/browse/YARN-9432
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
            Reporter: Tao Yang
            Assignee: Tao Yang


A reserved container can become excess once its request has been cancelled 
or satisfied. Excess reserved containers need to be unreserved quickly so 
that their resources are released for others.

When multi-nodes is disabled, excess reserved containers are released 
quickly on the next node heartbeat. The call stack is 
CapacityScheduler#nodeUpdate --> CapacityScheduler#allocateContainersToNode 
--> CapacityScheduler#allocateContainerOnSingleNode. 

When multi-nodes is enabled, however, excess reserved containers can be 
released only during the allocation process. The key part of the call stack 
is LeafQueue#assignContainers --> LeafQueue#allocateFromReservedContainer. 
As a result, an excess reserved container may not be released until its 
queue has a pending request and gets a chance to allocate. In the worst 
case, if the queue never receives another pending request, the excess 
reserved containers are never released and keep holding resources.
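
To make the difference between the two paths concrete, here is a toy model 
in Java. It is only a sketch and does not use the real CapacityScheduler 
code: ToyQueue, onNodeHeartbeat and onMultiNodeAllocation are hypothetical 
names, and the assumption is simply that the single-node path re-checks 
reservations on every heartbeat while the multi-nodes path re-checks them 
only when the queue has something pending to allocate.

```java
/** Toy model only; every name here is hypothetical, not a YARN API. */
public class ReservationLeakSketch {

    /** A queue reduced to two counters: pending requests and excess reservations. */
    static class ToyQueue {
        int pendingRequests;
        int excessReserved;

        ToyQueue(int pendingRequests, int excessReserved) {
            this.pendingRequests = pendingRequests;
            this.excessReserved = excessReserved;
        }

        /** Stand-in for the single-node path: each heartbeat re-checks reservations. */
        void onNodeHeartbeat() {
            excessReserved = 0;
        }

        /**
         * Stand-in for the multi-nodes path: reservations are re-checked only
         * as part of allocation, which needs a pending request in the queue.
         */
        void onMultiNodeAllocation() {
            if (pendingRequests > 0) {
                excessReserved = 0;
            }
        }
    }

    public static void main(String[] args) {
        ToyQueue single = new ToyQueue(0, 2);
        single.onNodeHeartbeat();
        System.out.println("single-node excess reserved: " + single.excessReserved);

        ToyQueue multi = new ToyQueue(0, 2);
        for (int i = 0; i < 1000; i++) {
            multi.onMultiNodeAllocation(); // never releases: nothing is pending
        }
        System.out.println("multi-nodes excess reserved: " + multi.excessReserved);
    }
}
```

In this model the multi-nodes queue keeps its two excess reservations no 
matter how many scheduling rounds run, which is the worst case described 
above.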

To solve this problem, I propose to directly kill excess reserved 
containers when the request is satisfied (in FiCaSchedulerApp#apply) or 
when the allocation number of resource-requests/scheduling-requests is 
updated to 0 (in SchedulerApplicationAttempt#updateResourceRequests / 
SchedulerApplicationAttempt#updateSchedulingRequests).
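
The idea can be sketched with a small stand-alone Java model. This is not 
the actual patch and does not call any YARN class: ToyApp, 
onContainerAllocated, updatePending and releaseExcessReservations are 
hypothetical stand-ins for the hooks named above, and reservations are 
reduced to a per-node counter.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; every name here is hypothetical, not a YARN API. */
public class EagerUnreserveSketch {

    static class ToyApp {
        private int pendingContainers;
        // nodeId -> number of containers reserved on that node
        private final Map<String, Integer> reservedByNode = new HashMap<>();

        ToyApp(int pendingContainers) {
            this.pendingContainers = pendingContainers;
        }

        void reserve(String nodeId) {
            reservedByNode.merge(nodeId, 1, Integer::sum);
        }

        /** Stand-in for the allocation-commit hook: one container was allocated. */
        void onContainerAllocated() {
            if (pendingContainers > 0) {
                pendingContainers--;
            }
            releaseExcessReservations();
        }

        /** Stand-in for the request-update hook: the app changed its ask. */
        void updatePending(int newCount) {
            pendingContainers = Math.max(0, newCount);
            releaseExcessReservations();
        }

        /** Once nothing is pending, every remaining reservation is excess. */
        private void releaseExcessReservations() {
            if (pendingContainers == 0) {
                reservedByNode.clear();
            }
        }

        int reservedCount() {
            return reservedByNode.values().stream().mapToInt(Integer::intValue).sum();
        }
    }

    public static void main(String[] args) {
        ToyApp satisfied = new ToyApp(1);
        satisfied.reserve("node1");
        satisfied.reserve("node2");
        satisfied.onContainerAllocated(); // request satisfied elsewhere
        System.out.println("after allocation: " + satisfied.reservedCount());

        ToyApp cancelled = new ToyApp(2);
        cancelled.reserve("node1");
        cancelled.updatePending(0); // request cancelled
        System.out.println("after cancel: " + cancelled.reservedCount());
    }
}
```

The point of the sketch is that the release is triggered by the event that 
makes the reservation excess, so it does not depend on a later heartbeat or 
on the queue having further pending requests.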

Please feel free to give your suggestions. Thanks.



