[ 
https://issues.apache.org/jira/browse/YARN-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110933#comment-15110933
 ] 

Sunil G commented on YARN-4618:
-------------------------------

Thanks [~bibinchundatt].
As [~Naganarasimha Garla] mentioned, I also feel we can change the data type 
to long. In the long run, this looks like the cleaner approach. For now we could 
add a max check here to solve it, but given your test case, the same overflow 
may pop up in other places.
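
To make the two options concrete, here is a minimal standalone sketch (not the actual YARN code). The class and method names are hypothetical, and it assumes the pending memory is currently summed as an int number of MB. It uses the two queue figures from the description purely as sample inputs, so the wrapped value need not match the one seen in the test.

```java
public class PendingOverflowSketch {

    // Hypothetical helper: int arithmetic silently wraps past Integer.MAX_VALUE.
    static int sumAsInt(int a, int b) {
        return a + b;
    }

    // Option 1 (change the data type): widen to long before adding.
    static long sumAsLong(int a, int b) {
        return (long) a + b;
    }

    // Option 2 (max check): add in long, then clamp to Integer.MAX_VALUE.
    // Assumes both inputs are non-negative.
    static int sumClamped(int a, int b) {
        return (int) Math.min((long) a + b, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int queue1Mb = 1328800000; // Queue 1 figure from the description
        int queue2Mb = 1428800000; // Queue 2 figure from the description

        System.out.println(sumAsInt(queue1Mb, queue2Mb));   // negative after wrap-around
        System.out.println(sumAsLong(queue1Mb, queue2Mb));  // 2757600000
        System.out.println(sumClamped(queue1Mb, queue2Mb)); // 2147483647
    }
}
```

The clamp keeps the sign correct at this one call site, but any other place that adds the values as int would still wrap, which is why the long change looks cleaner.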

> RM Stops allocating containers if large number of pending containers
> --------------------------------------------------------------------
>
>                 Key: YARN-4618
>                 URL: https://issues.apache.org/jira/browse/YARN-4618
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>
> In one of our tests we found that when the RM has a very large number of 
> pending container requests to serve, it stops assigning containers.
> The simulated cluster has 100 TB.
> Root total = 600k containers = 
> Queue 1 = 300k containers = 1328800000 MB
> Queue 2 = 300k containers = 1428800000 MB
> Each container request is for 4 GB.
> {{ParentQueue#assignContainers}} is as below:
> {noformat}
>     // Check if this queue need more resource, simply skip allocation if this
>     // queue doesn't need more resources.
>     if (!super.hasPendingResourceRequest(node.getPartition(),
>         clusterResource, schedulingMode)) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Skip this queue=" + getQueuePath()
>             + ", because it doesn't need more resource, schedulingMode="
>             + schedulingMode.name() + " node-partition=" + node.getPartition());
>       }
>       return CSAssignment.NULL_ASSIGNMENT;
>     }
> {noformat}
> When the pending resource exceeds MAX VALUE, it overflows and becomes 
> *negative* ({{-167XXXXXXX MB}}), so NULL_ASSIGNMENT is always returned.
> The tool used for testing was SLS.
> When checking for a pending resource request, we should first check whether 
> there are any pending containers (from getMetrics()) to be served. If pending 
> containers are available, return true; otherwise fall back to the other check 
> for increase requests.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
