[ https://issues.apache.org/jira/browse/YARN-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maysam Yabandeh updated YARN-1076:
----------------------------------

    Description:
LeafQueue#assignToQueue rejects newly available containers if potentialNewCapacity > absoluteMaxCapacity:
{code:java}
  private synchronized boolean assignToQueue(Resource clusterResource,
      Resource required) {
    // Check how much of the cluster's absolute capacity we are currently using...
    float potentialNewCapacity =
        Resources.divide(
            resourceCalculator, clusterResource,
            Resources.add(usedResources, required),
            clusterResource);
    if (potentialNewCapacity > absoluteMaxCapacity) {
      //...
      return false;
    }
    return true;
  }
{code}
The usedResources value, which is used to compute potentialNewCapacity, includes both actual and reserved containers. So a prior reservation can cause the RM to reject newly available containers, despite the starvation report.

  was:
LeafQueue#assignContainers rejects newly available containers if #needContainers returns false:
{code:java}
  if (!needContainers(application, priority, required)) {
    continue;
  }
{code}
When the application has already reserved all the required containers, #needContainers returns false as long as no starvation is reported:
{code:java}
  return (((starvation + requiredContainers) - reservedContainers) > 0);
{code}
where starvation is computed from the number of attempts to re-reserve a resource. On the other hand, a resource is re-reserved via #assignContainersOnNode only if it passes the #needContainers precondition:
{code:java}
  // Do we need containers at this 'priority'?
  if (!needContainers(application, priority, required)) {
    continue;
  }
  //.
  //.
  //.
  // Try to schedule
  CSAssignment assignment =
      assignContainersOnNode(clusterResource, node, application, priority,
          null);
{code}
In other words, once needContainers returns false due to a reservation, it keeps rejecting newly available resources, since no re-reservation is ever attempted and starvation therefore never increases.
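
The circular dependency described in the previous description can be illustrated with a minimal, self-contained sketch. This is not the LeafQueue code: the class NeedContainersSketch, the method countSkippedPasses, and the integer starvation counter that grows by one per re-reservation attempt are all hypothetical simplifications. Only the boolean expression in needContainers is taken from the snippet quoted above.

```java
// Hypothetical simulation of the stall; not the actual LeafQueue implementation.
class NeedContainersSketch {

    // Same expression as the needContainers snippet quoted in the description.
    static boolean needContainers(int starvation, int requiredContainers,
                                  int reservedContainers) {
        return ((starvation + requiredContainers) - reservedContainers) > 0;
    }

    // Runs a number of scheduling passes and counts how many of them skip
    // the application. Assumption: starvation increases only when a
    // re-reservation is actually attempted.
    static int countSkippedPasses(int requiredContainers, int reservedContainers,
                                  int passes) {
        int starvation = 0;
        int skipped = 0;
        for (int i = 0; i < passes; i++) {
            if (!needContainers(starvation, requiredContainers, reservedContainers)) {
                skipped++;
                continue; // precondition failed: no re-reservation, starvation unchanged
            }
            starvation++; // stands in for a re-reservation attempt
        }
        return skipped;
    }

    public static void main(String[] args) {
        // All required containers are already reserved.
        System.out.println(countSkippedPasses(1, 1, 10));
        // One required container is still unreserved.
        System.out.println(countSkippedPasses(2, 1, 10));
    }
}
```

Under these assumptions, an application whose reservations cover all of its required containers is skipped on every pass, because the only code path that could raise starvation sits behind the check that keeps failing.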
> RM gets stuck with a reservation, ignoring new containers
> ---------------------------------------------------------
>
>                 Key: YARN-1076
>                 URL: https://issues.apache.org/jira/browse/YARN-1076
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Maysam Yabandeh
>            Priority: Minor
>         Attachments: YARN-1076.patch
>
>
> LeafQueue#assignToQueue rejects newly available containers if
> potentialNewCapacity > absoluteMaxCapacity:
> {code:java}
>   private synchronized boolean assignToQueue(Resource clusterResource,
>       Resource required) {
>     // Check how much of the cluster's absolute capacity we are currently using...
>     float potentialNewCapacity =
>         Resources.divide(
>             resourceCalculator, clusterResource,
>             Resources.add(usedResources, required),
>             clusterResource);
>     if (potentialNewCapacity > absoluteMaxCapacity) {
>       //...
>       return false;
>     }
>     return true;
>   }
> {code}
> The usedResources value, which is used to compute potentialNewCapacity,
> includes both actual and reserved containers. So a prior reservation
> can cause the RM to reject newly available containers, despite the
> starvation report.
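
To make the effect of counting reservations in usedResources concrete, here is a minimal sketch that reduces resources to a single scalar (for example, memory in GB). The class AssignToQueueSketch, its parameter names, and the split of used resources into allocated and reserved are assumptions made for illustration; the real method operates on Resource objects through Resources.divide and a resourceCalculator.

```java
// Hypothetical scalar model of the capacity check; not the actual LeafQueue code.
class AssignToQueueSketch {

    // Returns true if the queue may accept a container of size `required`.
    // Assumption: used resources are the sum of allocated and reserved
    // resources, as stated in the description.
    static boolean assignToQueue(float clusterTotal, float allocated,
                                 float reserved, float required,
                                 float absoluteMaxCapacity) {
        float used = allocated + reserved;
        float potentialNewCapacity = (used + required) / clusterTotal;
        return potentialNewCapacity <= absoluteMaxCapacity;
    }

    public static void main(String[] args) {
        // 100 units in the cluster, 60 allocated, request for 20 more.
        // With a 30-unit reservation counted as used, the request is rejected.
        System.out.println(assignToQueue(100f, 60f, 30f, 20f, 1.0f));
        // The same request without the reservation fits.
        System.out.println(assignToQueue(100f, 60f, 0f, 20f, 1.0f));
    }
}
```

In this model the 20-unit request would fit in the 40 units that are physically free, but the reserved 30 units push the computed capacity above the maximum, so the request is turned down.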