[
https://issues.apache.org/jira/browse/YARN-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110215#comment-15110215
]
Naganarasimha G R commented on YARN-4618:
-----------------------------------------
Good catch [~bibinchundatt]!
I think {{org.apache.hadoop.yarn.api.records.Resource}} should have used *long*
for memory at least.
In a normal scenario we might not see such a high number of containers, but
in the future each container may well ask for more memory (100GB or more),
and then this will be easy to reproduce.
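
A minimal sketch of the overflow, assuming the pending memory (in MB) of the two queues is summed into an int. The class and method names are hypothetical and are not YARN code; the inputs are the per-queue figures quoted in the description, so the wrapped value here need not match the one reported there exactly.

```java
// Hypothetical demo, not YARN code: summing pending memory (MB) as int vs. long.
final class PendingMemoryOverflowDemo {

    // Mimics an int-typed memory field: the sum silently wraps past Integer.MAX_VALUE.
    static int sumAsInt(int... pendingMb) {
        int total = 0;
        for (int mb : pendingMb) {
            total += mb;
        }
        return total;
    }

    // Same sum accumulated in a long: no wrap-around for these magnitudes.
    static long sumAsLong(int... pendingMb) {
        long total = 0L;
        for (int mb : pendingMb) {
            total += mb;
        }
        return total;
    }

    public static void main(String[] args) {
        int queue1Mb = 1_328_800_000; // Queue 1 figure from the description
        int queue2Mb = 1_428_800_000; // Queue 2 figure from the description
        System.out.println("int sum  = " + sumAsInt(queue1Mb, queue2Mb));
        System.out.println("long sum = " + sumAsLong(queue1Mb, queue2Mb));
    }
}
```

With these inputs the int sum comes out negative while the long sum stays positive, which is why a "pending > 0" style check on an int total would report nothing pending.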
> RM Stops allocating containers if large number of pending containers
> --------------------------------------------------------------------
>
> Key: YARN-4618
> URL: https://issues.apache.org/jira/browse/YARN-4618
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Priority: Critical
>
> In one of our tests we found that when the RM has a very large number of
> pending container requests to serve, it stops assigning containers.
> Root total = 6 lakh (600,000) containers =
> Queue 1 = 3 lakh containers = 1328800000 MB
> Queue 2 = 3+ lakh containers = 1428800000 MB
> Each container request is for 4GB.
> {{ParentQueue#assignContainers}} is as below
> {noformat}
> // Check if this queue need more resource, simply skip allocation if this
> // queue doesn't need more resources.
> if (!super.hasPendingResourceRequest(node.getPartition(),
>     clusterResource, schedulingMode)) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Skip this queue=" + getQueuePath()
>         + ", because it doesn't need more resource, schedulingMode="
>         + schedulingMode.name() + " node-partition=" +
>         node.getPartition());
>   }
>   return CSAssignment.NULL_ASSIGNMENT;
> }
> {noformat}
> When the pending resource exceeds the int MAX_VALUE, it overflows and becomes
> *negative* ({{- 167XXXXXXX MB}}), so NULL_ASSIGNMENT is always returned.
> Tool used for testing: SLS.
> When checking for a pending resource request, we should first check whether
> there are any pending containers (from getMetrics()) to be served. If there
> are, return true; otherwise fall back to the other check for increase requests.
> Thoughts?
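
The check order proposed in the description could be sketched roughly as below. This is only an illustration under stated assumptions: the class, the method signature, and both parameters are hypothetical stand-ins (the pending-container count is assumed to come from the queue metrics), and the fallback is a simplified placeholder for the existing resource-based check.

```java
// Hypothetical sketch of the proposed check order; not the actual YARN implementation.
final class PendingCheckSketch {

    // pendingContainers: number of pending containers (assumed to be read from queue metrics).
    // pendingMemoryMb:   aggregated pending memory in MB, which may have overflowed to a
    //                    negative value when stored in an int.
    static boolean hasPendingResourceRequest(int pendingContainers, int pendingMemoryMb) {
        // First, trust the container count: it does not overflow at these scales.
        if (pendingContainers > 0) {
            return true;
        }
        // Otherwise fall back to the resource-based check (placeholder for the
        // existing logic that also covers increase requests).
        return pendingMemoryMb > 0;
    }
}
```

With this ordering, a queue with 600,000 pending containers is still reported as having pending requests even if its aggregated memory figure has wrapped to a negative number.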