Muhammad Samir Khan created YARN-9596:
-----------------------------------------
Summary: QueueMetrics has incorrect metrics when labelled
partitions are involved
Key: YARN-9596
URL: https://issues.apache.org/jira/browse/YARN-9596
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler
Reporter: Muhammad Samir Khan
Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot
2019-06-03 at 4.44.15 PM.png
After YARN-6467, QueueMetrics should only be tracking metrics for the default
partition. However, the metrics are incorrect when labelled partitions are
involved.
Steps to reproduce
==============
# Configure capacity-scheduler.xml with label configuration
# Add the label "test" to the cluster and set the label on node1 to "test"
# Note down "totalMB" at
<resourcemanager.webapp.address:port>/ws/v1/cluster/metrics
# Start the first job on the test queue.
# Start the second job on the default queue (the problem does not reproduce if
the order of the two jobs is swapped).
# While the two applications are running, the "totalMB" at
<resourcemanager.webapp.address:port>/ws/v1/cluster/metrics will go down by the
amount of MB used by the first job (screenshots attached).
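The before/after comparison of "totalMB" in the steps above can be scripted. The following is a minimal sketch, not part of the original report: the class name TotalMbProbe, the regex-based parsing, and the "rm-host:8088" placeholder address are assumptions for illustration, and it assumes the endpoint returns JSON containing a numeric "totalMB" field and that Java 11+ is available.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): reads "totalMB" from the RM metrics endpoint.
public class TotalMbProbe {
    // Matches the numeric "totalMB" field in the JSON body of /ws/v1/cluster/metrics.
    private static final Pattern TOTAL_MB = Pattern.compile("\"totalMB\"\\s*:\\s*(-?\\d+)");

    /** Extracts totalMB from a metrics JSON body; throws if the field is absent. */
    static long parseTotalMb(String json) {
        Matcher m = TOTAL_MB.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no totalMB field in response");
        }
        return Long.parseLong(m.group(1));
    }

    /** Fetches the metrics document from the given RM web address and returns totalMB. */
    static long fetchTotalMb(String rmWebAddress) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://" + rmWebAddress + "/ws/v1/cluster/metrics"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return parseTotalMb(response.body());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // The address is a placeholder; substitute resourcemanager.webapp.address:port.
            System.out.println("usage: TotalMbProbe <rm-host:port>");
            return;
        }
        long before = fetchTotalMb(args[0]);
        System.out.println("totalMB before jobs: " + before);
        System.out.println("Start the two jobs, then press Enter.");
        System.in.read();
        long during = fetchTotalMb(args[0]);
        System.out.println("totalMB while jobs run: " + during
                + " (delta " + (during - before) + ")");
    }
}
```

If the bug reproduces, the reported delta should be negative and roughly equal to the memory used by the first job.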
Alternatively:
In
TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
add the following lines at the end of the test, before rm1.close():
CSQueue rootQueue = cs.getRootQueue();
assertEquals(10*GB,
    rootQueue.getMetrics().getAvailableMB() +
    rootQueue.getMetrics().getAllocatedMB());
There are two nodes of 10GB each, and only one of them has a non-default label,
so the default partition totals 10GB. The test also fails if the check is made
against 20*GB.
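The two candidate totals (10GB for the default partition, 20GB for the whole cluster) can be worked out with a toy model. This is an illustrative sketch only, not YARN code: the class PartitionTotals, the node names, and the use of "" for the default partition are assumptions, and GB is taken as 1024 so that values are in MB like getAvailableMB() and getAllocatedMB().

```java
import java.util.Map;

// Toy model (not YARN code): computes total memory per partition from node labels.
public class PartitionTotals {
    static final long GB = 1024L; // MB per GB (assumed to match the test's units)

    /** Total MB of nodes carrying the given label; "" stands for the default partition. */
    static long totalMbForPartition(Map<String, String> labelByNode,
                                    long mbPerNode, String partition) {
        long matchingNodes = labelByNode.values().stream()
                .filter(partition::equals)
                .count();
        return matchingNodes * mbPerNode;
    }

    public static void main(String[] args) {
        // Two 10GB nodes; only node1 carries a non-default label (names are hypothetical).
        Map<String, String> labelByNode = Map.of("node1", "test", "node2", "");
        long defaultTotal = totalMbForPartition(labelByNode, 10 * GB, "");
        long clusterTotal = labelByNode.size() * 10 * GB;
        System.out.println("default partition total MB: " + defaultTotal);
        System.out.println("whole cluster total MB:     " + clusterTotal);
    }
}
```

If QueueMetrics tracked only the default partition, available plus allocated MB would equal the first value; the report states that the sum matches neither value.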