[
https://issues.apache.org/jira/browse/YARN-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949573#comment-16949573
]
Eric Payne commented on YARN-9882:
----------------------------------
[~gaurav.suman], for backward compatibility, the metrics outside the sections
labelled "...ByPartition" reflect only the resource usage of the default
partition. The same metrics are reported for each partition in the
"...ByPartition" sections. To get the total resources across all partitions,
sum the metrics of every partition.
The history of this is in YARN-6467 and the other JIRAs referenced there.
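As a rough illustration of the summation described above, here is a minimal Python sketch. It assumes a queue's per-partition usage has already been parsed into a dictionary. The key names ("resourceUsagesByPartition", "partitionName", "used") and the sample values are assumptions made for illustration, modelled on the "...ByPartition" naming, and should be checked against the actual output of your cluster.

```python
from collections import defaultdict

# Hypothetical sample: one queue's per-partition usage. Key names and values
# are illustrative assumptions, not output captured from a real cluster.
sample_queue = {
    "queueName": "low-priority",
    "resourceUsagesByPartition": [
        # The empty partition name stands for the default partition.
        {"partitionName": "", "used": {"memory": 0, "vCores": 0}},
        {"partitionName": "low", "used": {"memory": 4096, "vCores": 4}},
    ],
}


def total_across_partitions(queue, field="used"):
    """Sum one resource field (e.g. "used") over every partition of a queue."""
    totals = defaultdict(int)
    for partition in queue.get("resourceUsagesByPartition", []):
        for resource, amount in partition.get(field, {}).items():
            totals[resource] += amount
    return dict(totals)


print(total_across_partitions(sample_queue))
```

With this sample, the default partition contributes nothing, so the total equals the usage of the "low" partition. That matches the situation in this issue, where the queue only runs on labelled nodes.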
There are currently problems with the accuracy of all of these metrics.
[~rmanikandan] is working on them in the following JIRAs:
YARN-6492
YARN-9767
YARN-9773
> QueueMetrics not coming in Capacity Scheduler with Node Label Configuration
> ---------------------------------------------------------------------------
>
> Key: YARN-9882
> URL: https://issues.apache.org/jira/browse/YARN-9882
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, metrics, scheduler
> Reporter: Gaurav Suman
> Priority: Major
>
> I have a capacity scheduler setup with two queues, "low-priority" and
> "regular-priority", and two node labels, "low" and "regular". The
> low-priority queue has 100% access to the "low" node label, and the
> regular-priority queue has 100% access to the "regular" node label.
> The YARN UI capacity scheduler configuration:
> [https://i.stack.imgur.com/gOARn.png]
> When I look at the QueueMetrics emitted for the "low-priority" and
> "regular-priority" queues at http://rm-ip:port/jmx, availableMB,
> availableVCores, and pendingMB (0) initially show correct values. But when I
> submit a job to either queue, resource metrics such as pendingMB,
> pendingVCores, availableMB, and availableVCores are not updated: pendingMB
> and pendingVCores always remain 0, and availableMB and availableVCores do
> not change. Only application metrics such as AppsRunning and
> ActiveApplications are updated, and they correctly show 1. I have not been
> able to find out why the resource metrics are not updated after job
> submission.
> The issue occurs only when node labels are enabled. When node labels are
> disabled and only queues are used, everything works fine.
> The capacity scheduler configuration (capacity-scheduler.xml):
> {code:xml}
> <configuration>
> <property>
> <name>yarn.scheduler.capacity.maximum-applications</name>
> <value>5000</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
> <value>0.2</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.resource-calculator</name>
> <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
> <description>
> The ResourceCalculator implementation to be used to compare
> Resources in the scheduler.
> The default i.e. DefaultResourceCalculator only uses Memory while
> DominantResourceCalculator uses dominant-resource to compare
> multi-dimensional resources such as Memory, CPU etc.
> </description>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.queues</name>
> <value>low-priority,regular-priority</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
> <value>*</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels.regular.capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels.regular.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels.low.capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels.low.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.default.state</name>
> <value>RUNNING</value>
> <description>
> The state of the default queue. State can be one of RUNNING or
> STOPPED.
> </description>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
> <value>*</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
> <value>*</value>
> <description>
> The ACL of who can administer jobs on the default queue.
> </description>
> </property>
> <property>
> <name>yarn.scheduler.capacity.node-locality-delay</name>
> <value>40</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
> <value>false</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.capacity</name>
> <value>50</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.ordering-policy</name>
> <value>fair</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels</name>
> <value>low</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.default-node-label-expression</name>
> <value>low</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels.low.capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels.low.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.low-priority.default.state</name>
> <value>RUNNING</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.capacity</name>
> <value>50</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.ordering-policy</name>
> <value>fair</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels</name>
> <value>regular</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.default-node-label-expression</name>
> <value>regular</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels.regular.capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels.regular.maximum-capacity</name>
> <value>100</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.regular-priority.default.state</name>
> <value>RUNNING</value>
> </property>
> </configuration>
> {code}
>