[ 
https://issues.apache.org/jira/browse/YARN-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110218#comment-15110218
 ] 

Naganarasimha G R commented on YARN-4618:
-----------------------------------------

[~hex108], what is the maximum number of containers run in your cluster? I remember 
it being a very large value. Have you faced a problem like this anywhere?

> RM Stops allocating containers if large number of pending containers
> --------------------------------------------------------------------
>
>                 Key: YARN-4618
>                 URL: https://issues.apache.org/jira/browse/YARN-4618
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>
> In one of the tests, it was found that when the RM has a very large number of 
> pending container requests to serve, it stops assigning containers.
> Root total = 6 lakh (600,000) containers = 
> Queue 1 = 3 lakh containers = 1328800000 MB
> Queue 2 = 3+ lakh containers = 1428800000 MB
> Each container request is for 4 GB. 
> {{ParentQueue#assignContainers}} is as below
> {noformat}
>     // Check if this queue need more resource, simply skip allocation if this
>     // queue doesn't need more resources.
>     if (!super.hasPendingResourceRequest(node.getPartition(),
>         clusterResource, schedulingMode)) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Skip this queue=" + getQueuePath()
>             + ", because it doesn't need more resource, schedulingMode="
>             + schedulingMode.name() + " node-partition="
>             + node.getPartition());
>       }
>       return CSAssignment.NULL_ASSIGNMENT;
>     }
> {noformat}
> When the pending resource exceeds MAX VALUE, it becomes *negative* 
> ({{-167XXXXXXX MB}}) and NULL_ASSIGNMENT is always returned.
> Tool used for testing: SLS.
> When checking for a pending resource request, we should first check whether any 
> pending containers (from getMetrics()) remain to be served. If pending containers 
> are available, return true; otherwise fall through to the other check for 
> increase requests.
> Thoughts?
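
The overflow described above can be reproduced in isolation. The following is only a sketch, not the scheduler's code: it assumes pending memory is accumulated in a 32-bit int, and the class and method names (PendingResourceSketch, sumAsInt, sumAsLong, hasPending) are hypothetical. It adds the two queue totals quoted in the description and then applies a count-first check along the lines proposed. The wrapped value it produces need not match the one reported, since the description gives Queue 2 only as "3+ lakh".

```java
public class PendingResourceSketch {

    // Assumption: pending memory (MB) is summed in a 32-bit int, so it can wrap.
    static int sumAsInt(int queue1Mb, int queue2Mb) {
        return queue1Mb + queue2Mb;
    }

    // Same sum widened to 64 bits, shown for comparison only.
    static long sumAsLong(int queue1Mb, int queue2Mb) {
        return (long) queue1Mb + queue2Mb;
    }

    // Hypothetical check: look at the pending container count first,
    // and only then fall back to the (possibly wrapped) memory figure.
    static boolean hasPending(int pendingContainers, int pendingMb) {
        if (pendingContainers > 0) {
            return true;
        }
        return pendingMb > 0;
    }

    public static void main(String[] args) {
        int queue1Mb = 1_328_800_000; // Queue 1 figure from the description
        int queue2Mb = 1_428_800_000; // Queue 2 figure from the description

        int wrapped = sumAsInt(queue1Mb, queue2Mb);
        System.out.println("int sum           = " + wrapped);
        System.out.println("long sum          = " + sumAsLong(queue1Mb, queue2Mb));
        // A memory-only test treats the wrapped negative total as "nothing pending".
        System.out.println("memory-only check = " + (wrapped > 0));
        // Checking the container count first is unaffected by the wrapped total.
        System.out.println("count-first check = " + hasPending(600_000, wrapped));
    }
}
```

With these inputs the int sum is negative, so a memory-only test reports nothing pending, while the count-first test still returns true for 600,000 pending containers.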



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
