[
https://issues.apache.org/jira/browse/YARN-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maysam Yabandeh updated YARN-1076:
----------------------------------
Description:
LeafQueue#assignToQueue rejects newly available containers if
potentialNewCapacity > absoluteMaxCapacity:
{code:java}
private synchronized boolean assignToQueue(Resource clusterResource,
    Resource required) {
  // Check how much of the cluster's absolute capacity we are currently using...
  float potentialNewCapacity =
      Resources.divide(
          resourceCalculator, clusterResource,
          Resources.add(usedResources, required),
          clusterResource);
  if (potentialNewCapacity > absoluteMaxCapacity) {
    //...
    return false;
  }
  return true;
}
{code}
The usedResources value, which is used to compute potentialNewCapacity, is
composed of both actual and reserved containers. So a prior reservation can
cause the RM to reject newly available containers, despite the starvation report.
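The effect can be illustrated with a small standalone sketch. It is not the Hadoop code and not the attached patch: the class name QueueCapacitySketch, the scalar memory figures, and the 50% limit are all assumptions chosen for illustration, and resources are reduced to a single long instead of Resource objects. It only shows how counting a reservation in the used amount can tip the check over the limit.

```java
public class QueueCapacitySketch {

    /**
     * Simplified stand-in for the check quoted above: resources are plain
     * megabyte counts (an assumption), not Hadoop Resource objects.
     */
    public static boolean assignToQueue(long clusterMb, long usedMb,
                                        long requiredMb, float absoluteMaxCapacity) {
        float potentialNewCapacity = (float) (usedMb + requiredMb) / clusterMb;
        return potentialNewCapacity <= absoluteMaxCapacity;
    }

    public static void main(String[] args) {
        long clusterMb = 100_000;          // hypothetical cluster size
        float absoluteMaxCapacity = 0.5f;  // hypothetical queue limit (50%)
        long allocatedMb = 40_000;         // containers actually running
        long reservedMb = 8_000;           // an earlier reservation
        long requiredMb = 8_000;           // container that just became available

        // Used amount includes the reservation, as the description states.
        boolean withReservation = assignToQueue(
            clusterMb, allocatedMb + reservedMb, requiredMb, absoluteMaxCapacity);

        // For comparison only: the same check with the reservation left out.
        boolean withoutReservation = assignToQueue(
            clusterMb, allocatedMb, requiredMb, absoluteMaxCapacity);

        System.out.println("accepted with reservation counted: " + withReservation);
        System.out.println("accepted with reservation ignored: " + withoutReservation);
    }
}
```

With these made-up numbers the first check computes 0.56 and rejects the container, while the second computes 0.48 and accepts it.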
was:
LeafQueue#assignContainers rejects newly available containers if
#needContainers returns false:
{code:java}
if (!needContainers(application, priority, required)) {
  continue;
}
{code}
When the application has already reserved all the required containers,
#needContainers returns false as long as no starvation is reported:
{code:java}
return (((starvation + requiredContainers) - reservedContainers) > 0);
{code}
where starvation is computed from the number of attempts to re-reserve a resource.
On the other hand, a resource is re-reserved via #assignContainersOnNode only
if it passed the #needContainers precondition:
{code:java}
// Do we need containers at this 'priority'?
if (!needContainers(application, priority, required)) {
  continue;
}
//.
//.
//.
// Try to schedule
CSAssignment assignment =
    assignContainersOnNode(clusterResource, node, application,
        priority,
        null);
{code}
In other words, once needContainers returns false due to a reservation, it
keeps rejecting newly available resources, since no reservation is ever
attempted.
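The argument of this earlier description can be traced with a toy simulation. Everything except the needContainers expression is an assumption: the class name NeedContainersSketch, the simulate loop, and in particular the starvation formula, which is only a hypothetical stand-in that grows with the number of re-reservation attempts.

```java
public class NeedContainersSketch {

    /** Same expression as quoted in the description above. */
    public static boolean needContainers(int starvation, int requiredContainers,
                                         int reservedContainers) {
        return ((starvation + requiredContainers) - reservedContainers) > 0;
    }

    /** Hypothetical stand-in: starvation grows with re-reservation attempts. */
    public static int starvation(int reReservations, int reservedContainers) {
        return reservedContainers == 0 ? 0 : reReservations / reservedContainers;
    }

    /**
     * Offers the application a node `offers` times and returns how many
     * re-reservation attempts were counted along the way.
     */
    public static int simulate(int offers, int requiredContainers,
                               int reservedContainers) {
        int reReservations = 0;
        for (int i = 0; i < offers; i++) {
            int s = starvation(reReservations, reservedContainers);
            if (!needContainers(s, requiredContainers, reservedContainers)) {
                continue; // offer skipped, so no re-reservation is attempted
            }
            reReservations++; // only reached when needContainers passes
        }
        return reReservations;
    }

    public static void main(String[] args) {
        // Everything required is already reserved: the counter never moves.
        System.out.println(simulate(1000, 1, 1));
        // One more container required than reserved: attempts are counted.
        System.out.println(simulate(1000, 2, 1));
    }
}
```

Under these assumptions, the fully reserved case never counts a re-reservation, so starvation stays at zero and every offer is skipped.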
> RM gets stuck with a reservation, ignoring new containers
> ---------------------------------------------------------
>
> Key: YARN-1076
> URL: https://issues.apache.org/jira/browse/YARN-1076
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Maysam Yabandeh
> Priority: Minor
> Attachments: YARN-1076.patch
>
>
> LeafQueue#assignToQueue rejects newly available containers if
> potentialNewCapacity > absoluteMaxCapacity:
> {code:java}
> private synchronized boolean assignToQueue(Resource clusterResource,
>     Resource required) {
>   // Check how much of the cluster's absolute capacity we are currently using...
>   float potentialNewCapacity =
>       Resources.divide(
>           resourceCalculator, clusterResource,
>           Resources.add(usedResources, required),
>           clusterResource);
>   if (potentialNewCapacity > absoluteMaxCapacity) {
>     //...
>     return false;
>   }
>   return true;
> }
> {code}
> The usedResources value, which is used to compute potentialNewCapacity, is
> composed of both actual and reserved containers. So a prior reservation
> can cause the RM to reject newly available containers, despite the starvation
> report.