[
https://issues.apache.org/jira/browse/YARN-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744683#comment-13744683
]
Maysam Yabandeh commented on YARN-1076:
---------------------------------------
Hi [~ojoshi]. I am observing the problem in a unit test that uses
MiniYarnCluster. The explanation, however, is based solely on a code
walkthrough. I did not submit the test case because the problem does not
always show up, due to the non-determinism in MiniYarnCluster.
Anyway, I see that you have already covered that in the objectives of YARN-957:
| Say 2048MB is reserved on nm1 but nm2 comes back with 2048MB available
memory. In this case if the original request was made without any locality then
scheduler should unreserve memory on nm1 and allocate requested 2048MB
container on nm2.
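For illustration only, the objective quoted above could be sketched roughly as follows. This is a toy model, not scheduler code: Node, onNodeUpdate, and noLocalityConstraint are hypothetical names that do not exist in YARN, and memory is the only resource considered.

```java
// Toy model of "unreserve on nm1, allocate on nm2"; all names are hypothetical.
final class UnreserveSketch {

    static final class Node {
        final String name;
        int availableMb;
        int reservedMb;

        Node(String name, int availableMb, int reservedMb) {
            this.name = name;
            this.availableMb = availableMb;
            this.reservedMb = reservedMb;
        }
    }

    /**
     * Called when 'updated' reports free memory while a reservation of
     * 'requestMb' is held on 'reservedOn'. Returns the name of the node that
     * receives the container, or null if the reservation is left in place.
     */
    static String onNodeUpdate(Node reservedOn, Node updated,
                               int requestMb, boolean noLocalityConstraint) {
        boolean canMove = noLocalityConstraint
                && updated != reservedOn
                && reservedOn.reservedMb >= requestMb
                && updated.availableMb >= requestMb;
        if (!canMove) {
            return null;                      // keep waiting on the reserved node
        }
        reservedOn.reservedMb -= requestMb;   // unreserve on the original node
        updated.availableMb -= requestMb;     // allocate on the node with free memory
        return updated.name;
    }

    public static void main(String[] args) {
        Node nm1 = new Node("nm1", 0, 2048);
        Node nm2 = new Node("nm2", 2048, 0);
        System.out.println(onNodeUpdate(nm1, nm2, 2048, true));
    }
}
```

With the values from the quote (2048MB reserved on nm1, 2048MB free on nm2, no locality constraint), the sketch moves the allocation to nm2 and clears the reservation on nm1.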
> RM gets stuck with a reservation, ignoring new containers
> ---------------------------------------------------------
>
> Key: YARN-1076
> URL: https://issues.apache.org/jira/browse/YARN-1076
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Maysam Yabandeh
> Priority: Minor
>
> LeafQueue#assignContainers rejects newly available containers if
> #needContainers returns false:
> {code:java}
> if (!needContainers(application, priority, required)) {
>   continue;
> }
> {code}
> When the application has already reserved all the required containers,
> #needContainers returns false as long as no starvation is reported:
> {code:java}
> return (((starvation + requiredContainers) - reservedContainers) > 0);
> {code}
> where starvation is computed from the number of attempts to re-reserve a
> resource. On the other hand, a resource is re-reserved via
> #assignContainersOnNode only if it has passed the #needContainers precondition:
> {code:java}
> // Do we need containers at this 'priority'?
> if (!needContainers(application, priority, required)) {
>   continue;
> }
> //.
> //.
> //.
>
> // Try to schedule
> CSAssignment assignment =
>     assignContainersOnNode(clusterResource, node, application,
>         priority, null);
> {code}
> In other words, once #needContainers returns false due to a reservation, it
> keeps rejecting newly available resources: no re-reservation is ever
> attempted, so starvation never increases.
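
The stall in the description above can be reproduced with a small standalone model. This is a sketch under stated assumptions, not the LeafQueue implementation: ReservationStallDemo, offerNode, and the reReservations counter are hypothetical, and starvation is simplified to equal the re-reservation count (the real computation is more involved). Only the needContainers expression is taken from the description.

```java
// Simplified model of the stall; not the actual LeafQueue code.
final class ReservationStallDemo {

    private final int requiredContainers;
    private final int reservedContainers;
    // Hypothetical counter: grows only when a re-reservation is attempted.
    private int reReservations = 0;

    ReservationStallDemo(int requiredContainers, int reservedContainers) {
        this.requiredContainers = requiredContainers;
        this.reservedContainers = reservedContainers;
    }

    // Assumption for this sketch: starvation is just the re-reservation count.
    int starvation() {
        return reReservations;
    }

    // Same expression as the one quoted in the description.
    boolean needContainers() {
        return ((starvation() + requiredContainers) - reservedContainers) > 0;
    }

    // Models one node offering free resources to the application.
    boolean offerNode() {
        if (!needContainers()) {
            return false;   // the 'continue' branch: the offer is rejected
        }
        reReservations++;   // stands in for assignContainersOnNode re-reserving
        return true;
    }

    public static void main(String[] args) {
        // One container required, one container already reserved.
        ReservationStallDemo app = new ReservationStallDemo(1, 1);
        int accepted = 0;
        for (int i = 0; i < 1000; i++) {
            if (app.offerNode()) {
                accepted++;
            }
        }
        System.out.println("offers accepted: " + accepted + " of 1000");
    }
}
```

Running main prints "offers accepted: 0 of 1000": the counter that feeds starvation is only updated behind the needContainers check, so the check can never start passing again.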