[
https://issues.apache.org/jira/browse/YARN-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705667#comment-14705667
]
Chang Li commented on YARN-4045:
--------------------------------
Hi [~wangda], for the first case, should we check the availableResource of the
root queue whenever a node is removed? If the available memory is negative, we
could then unreserve resources until the root queue's available memory
becomes positive.
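
The check proposed above could look roughly like the sketch below. It is a toy model, not CapacityScheduler code: the class RootQueueSketch, its fields, and the removeNode method are hypothetical names invented for illustration. It assumes that reserved memory is counted as part of used memory, and it stops releasing reservations as soon as available memory is no longer negative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy model of the root queue; not the YARN CapacityScheduler API. */
public class RootQueueSketch {
    private long clusterMB;
    private long usedMB; // assumption: includes memory held by reservations
    private final Deque<Long> reservedMB = new ArrayDeque<>();

    public RootQueueSketch(long clusterMB, long usedMB) {
        this.clusterMB = clusterMB;
        this.usedMB = usedMB;
    }

    /** Records a reservation and counts it against used memory. */
    public void reserve(long mb) {
        reservedMB.push(mb);
        usedMB += mb;
    }

    public long availableMB() {
        return clusterMB - usedMB;
    }

    /**
     * Shrinks the cluster by the removed node's memory, then releases
     * reservations (most recent first) while available memory is negative.
     * Returns the number of reservations released.
     */
    public int removeNode(long nodeMB) {
        clusterMB -= nodeMB;
        int released = 0;
        while (availableMB() < 0 && !reservedMB.isEmpty()) {
            usedMB -= reservedMB.pop();
            released++;
        }
        return released;
    }

    public static void main(String[] args) {
        RootQueueSketch root = new RootQueueSketch(100_000, 90_000);
        root.reserve(4_000);
        root.reserve(4_000);
        int released = root.removeNode(5_000);
        System.out.println("released=" + released + " availableMB=" + root.availableMB());
    }
}
```

In this model, available memory can still end up negative if running containers alone exceed the shrunken cluster, because only reservations are released.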
> Negative availableMB is being reported for root queue.
> ------------------------------------------------------
>
> Key: YARN-4045
> URL: https://issues.apache.org/jira/browse/YARN-4045
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.1
> Reporter: Rushabh S Shah
>
> We recently deployed 2.7 in one of our clusters.
> We are seeing negative availableMB being reported for queue=root.
> This is from the JMX output:
> {noformat}
> <clusterMetrics>
> ...
> <availableMB>-163328</availableMB>
> ...
> </clusterMetrics>
> {noformat}
> The following is the RM log:
> {noformat}
> 2015-08-10 14:42:28,280 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:28,404 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:35,548 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:35,549 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,088 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,089 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,338 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,339 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,757 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:39,758 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:43,056 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:43,070 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:44,486 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:44,487 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:44,886 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:44,886 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743
> absoluteUsedCapacity=1.0032743 used=<memory:5334016, vCores:6212>
> cluster=<memory:5316608, vCores:28320>
> 2015-08-10 14:42:47,401 [ResourceManager Event Processor] INFO
> capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854
> absoluteUsedCapacity=1.0029854 used=<memory:5332480, vCores:6202>
> cluster=<memory:5316608, vCores:28320>
> {noformat}
> bq. used=<memory:5332480, vCores:6202> cluster=<memory:5316608, vCores:28320>
> For the root queue, usedCapacity is greater than totalCapacity.
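
The figures in the quoted log line can be checked with a few lines of arithmetic. The sketch below assumes that available memory is simply cluster memory minus used memory and that usedCapacity is used divided by cluster; the class and method names are hypothetical. With the logged values, the difference is -15872 MB, which is not the same as the -163328 shown in the metrics output above; the report does not say whether the two were captured at the same moment.

```java
/** Hypothetical helper for checking the logged numbers; not YARN code. */
public class RootQueueHeadroom {
    /** Assumed definition: memory left in the cluster after subtracting usage. */
    static long availableMB(long clusterMB, long usedMB) {
        return clusterMB - usedMB;
    }

    /** Assumed definition: fraction of cluster memory in use. */
    static double usedCapacity(long clusterMB, long usedMB) {
        return (double) usedMB / clusterMB;
    }

    public static void main(String[] args) {
        long clusterMB = 5_316_608L; // cluster=<memory:5316608, ...> from the log
        long usedMB = 5_332_480L;    // used=<memory:5332480, ...> from the log
        System.out.println("availableMB = " + availableMB(clusterMB, usedMB));
        System.out.println("usedCapacity = " + usedCapacity(clusterMB, usedMB));
    }
}
```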
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)