[ 
https://issues.apache.org/jira/browse/YARN-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated YARN-1076:
----------------------------------

    Description: 
LeafQueue#assignToQueue rejects newly available containers if 
potentialNewCapacity > absoluteMaxCapacity:
{code:java}
  private synchronized boolean assignToQueue(Resource clusterResource, 
      Resource required) {
    // Check how much of the cluster's absolute capacity we are currently using...
    float potentialNewCapacity = 
        Resources.divide(
            resourceCalculator, clusterResource, 
            Resources.add(usedResources, required), 
            clusterResource);
    if (potentialNewCapacity > absoluteMaxCapacity) {
      //... 
      return false;
    }
    return true;
  }
{code}

usedResources, which is used to compute potentialNewCapacity, includes both 
actual and reserved containers. So a prior reservation can cause the RM to 
reject newly available containers, despite the starvation report.
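
A minimal sketch of the arithmetic, assuming a single memory dimension in MB 
instead of Resource objects. The class name, the numbers, and the long-based 
signature are hypothetical and only illustrate how a reservation counted in 
usedResources can push potentialNewCapacity over absoluteMaxCapacity:

```java
public class ReservationCapacitySketch {

    // Hypothetical simplification of the check quoted above: resources are
    // plain MB counts, and usedMb stands in for usedResources.
    static boolean assignToQueue(long clusterMb, long usedMb, long requiredMb,
                                 float absoluteMaxCapacity) {
        float potentialNewCapacity = (float) (usedMb + requiredMb) / clusterMb;
        return potentialNewCapacity <= absoluteMaxCapacity;
    }

    public static void main(String[] args) {
        long clusterMb = 100_000;           // assumed cluster size
        float absoluteMaxCapacity = 0.5f;   // assumed queue limit: 50% of the cluster
        long allocatedMb = 44_000;          // containers actually running
        long reservedMb = 4_000;            // one container reserved earlier
        long requiredMb = 4_000;            // the newly available container

        // usedResources as described: actual plus reserved containers.
        boolean withReservation = assignToQueue(
            clusterMb, allocatedMb + reservedMb, requiredMb, absoluteMaxCapacity);

        // The same request if the reservation were not counted.
        boolean withoutReservation = assignToQueue(
            clusterMb, allocatedMb, requiredMb, absoluteMaxCapacity);

        System.out.println("accepted with reservation counted: " + withReservation);
        System.out.println("accepted without reservation:      " + withoutReservation);
    }
}
```

With these assumed numbers the reservation is counted once in usedMb and the 
request is added on top of it, so the check fails at 0.52 even though the 
queue would sit at 0.48 after the container replaces the reservation.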


  was:
LeafQueue#assignContainers rejects newly available containers if 
#needContainers returns false:
{code:java}
          if (!needContainers(application, priority, required)) {
            continue;
          }
{code}

When the application has already reserved all the required containers, 
#needContainers returns false as long as no starvation is reported:
{code:java}
return (((starvation + requiredContainers) - reservedContainers) > 0);
{code}

where starvation is computed from the number of attempts to re-reserve a 
resource. On the other hand, a resource is re-reserved via 
#assignContainersOnNode only if it passes the #needContainers precondition:
{code:java}
          // Do we need containers at this 'priority'?
          if (!needContainers(application, priority, required)) {
            continue;
          }

          //.
          //.
          //.
          
          // Try to schedule
          CSAssignment assignment =  
            assignContainersOnNode(clusterResource, node, application, 
                priority, null);
{code}

In other words, once needContainers returns false due to a reservation, it 
keeps rejecting newly available resources: no re-reservation is ever 
attempted, so starvation never increases.
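
The cycle described above can be sketched as follows. This is an illustration 
only: the class and the countReReservations helper are hypothetical, and 
starvation is assumed to simply equal the number of re-reservation attempts, 
which is a simplification of whatever the scheduler actually computes.

```java
public class NeedContainersSketch {

    // Stand-in for the check quoted above, with starvation passed in.
    static boolean needContainers(float starvation, int requiredContainers,
                                  int reservedContainers) {
        return ((starvation + requiredContainers) - reservedContainers) > 0;
    }

    // Hypothetical helper: runs a number of scheduling passes and counts how
    // many times a re-reservation is attempted.
    static int countReReservations(int passes, int requiredContainers,
                                   int reservedContainers) {
        int reReservations = 0;
        for (int i = 0; i < passes; i++) {
            // Assumption: starvation grows only with re-reservation attempts.
            float starvation = reReservations;
            if (!needContainers(starvation, requiredContainers, reservedContainers)) {
                continue; // skipped, so starvation cannot grow on this pass
            }
            reReservations++; // stands in for assignContainersOnNode re-reserving
        }
        return reReservations;
    }

    public static void main(String[] args) {
        // Everything required is already reserved: the check never passes.
        System.out.println(countReReservations(10, 1, 1));
        // One container still unreserved: the check passes on every pass.
        System.out.println(countReReservations(10, 2, 1));
    }
}
```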

    
> RM gets stuck with a reservation, ignoring new containers
> ---------------------------------------------------------
>
>                 Key: YARN-1076
>                 URL: https://issues.apache.org/jira/browse/YARN-1076
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Maysam Yabandeh
>            Priority: Minor
>         Attachments: YARN-1076.patch
>
>
> LeafQueue#assignToQueue rejects newly available containers if 
> potentialNewCapacity > absoluteMaxCapacity:
> {code:java}
>   private synchronized boolean assignToQueue(Resource clusterResource, 
>       Resource required) {
>     // Check how much of the cluster's absolute capacity we are currently using...
>     float potentialNewCapacity = 
>         Resources.divide(
>             resourceCalculator, clusterResource, 
>             Resources.add(usedResources, required), 
>             clusterResource);
>     if (potentialNewCapacity > absoluteMaxCapacity) {
>       //... 
>       return false;
>     }
>     return true;
>   }
> {code}
> usedResources, which is used to compute potentialNewCapacity, includes both 
> actual and reserved containers. So a prior reservation can cause the RM to 
> reject newly available containers, despite the starvation report.
