Gaurav Suman created YARN-9882:
----------------------------------
Summary: QueueMetrics not coming in Capacity Scheduler with Node
Label Configuration
Key: YARN-9882
URL: https://issues.apache.org/jira/browse/YARN-9882
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler, metrics, scheduler
Reporter: Gaurav Suman
I have a Capacity Scheduler setup with two queues, "low-priority" and
"regular-priority", and two node labels, "low" and "regular". The low-priority
queue has 100% access to the "low" node label, and the regular-priority queue
has 100% access to the "regular" node label.
The Capacity Scheduler configuration as shown in the YARN UI:
[https://i.stack.imgur.com/gOARn.png]
Before any job is submitted, the QueueMetrics emitted for the "low-priority"
and "regular-priority" queues at http://rm-ip:port/jmx show correct values for
AvailableMB and AvailableVCores, with PendingMB=0 and so on. After I submit a
job to either queue, only AppsRunning and ActiveApplications are updated; both
correctly show 1. PendingMB and PendingVCores always remain 0, and AvailableMB
and AvailableVCores do not change. I have not been able to find out why these
metrics are not updated after job submission.
The issue occurs only when node labels are enabled. When node labels are
disabled and only queues are used, everything works as expected.
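The before/after comparison described above can be scripted. The following is a minimal sketch, not the procedure used to produce this report. It assumes that the /jmx endpoint returns JSON with a top-level "beans" list, that QueueMetrics beans are named Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=<queue>, and that the metric attributes are spelled as listed in WATCHED. All helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Metric attributes to compare; spelling assumed to match the JMX output.
WATCHED = ("AppsRunning", "ActiveApplications", "PendingMB", "PendingVCores",
           "AvailableMB", "AvailableVCores")


def queue_bean_name(queue_path):
    """Build the assumed QueueMetrics bean name for e.g. 'root.low-priority'."""
    parts = queue_path.split(".")
    keys = ",".join(f"q{i}={p}" for i, p in enumerate(parts))
    return f"Hadoop:service=ResourceManager,name=QueueMetrics,{keys}"


def extract_queue_metrics(jmx_doc, queue_path):
    """Pick the watched metrics for one queue out of a parsed /jmx document.

    Returns None when no bean with the expected name is present.
    """
    wanted = queue_bean_name(queue_path)
    for bean in jmx_doc.get("beans", []):
        if bean.get("name") == wanted:
            return {key: bean.get(key) for key in WATCHED}
    return None


def changed_metrics(before, after):
    """Return {metric: (old, new)} for every watched metric that changed."""
    return {key: (before[key], after[key])
            for key in before if before[key] != after[key]}


def fetch_jmx(base_url):
    """Fetch and parse the /jmx document; base_url is e.g. 'http://rm-ip:port'."""
    with urlopen(f"{base_url}/jmx") as resp:
        return json.load(resp)
```

Calling extract_queue_metrics(fetch_jmx("http://rm-ip:port"), "root.low-priority") once before and once after submitting a job, and passing both results to changed_metrics, lists which of the watched metrics actually moved.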
The Capacity Scheduler configuration (capacity-scheduler.xml):
{code:xml}
<configuration>
<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>5000</value>
</property>
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.2</value>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
<description>
The ResourceCalculator implementation to be used to compare
Resources in the scheduler.
The default i.e. DefaultResourceCalculator only uses Memory while
DominantResourceCalculator uses dominant-resource to compare
multi-dimensional resources such as Memory, CPU etc.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>low-priority,regular-priority</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.regular.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.regular.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.low.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.accessible-node-labels.low.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.state</name>
<value>RUNNING</value>
<description>
The state of the default queue. State can be one of RUNNING or STOPPED.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
<value>*</value>
<description>
The ACL of who can administer jobs on the default queue.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.node-locality-delay</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.ordering-policy</name>
<value>fair</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels</name>
<value>low</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.default-node-label-expression</name>
<value>low</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels.low.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.accessible-node-labels.low.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.low-priority.default.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.ordering-policy</name>
<value>fair</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels</name>
<value>regular</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.default-node-label-expression</name>
<value>regular</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels.regular.capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.accessible-node-labels.regular.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.regular-priority.default.state</name>
<value>RUNNING</value>
</property>
</configuration>
{code}