Muhammad Samir Khan created YARN-9596:
-----------------------------------------

             Summary: QueueMetrics has incorrect metrics when labelled 
partitions are involved
                 Key: YARN-9596
                 URL: https://issues.apache.org/jira/browse/YARN-9596
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacity scheduler
            Reporter: Muhammad Samir Khan
         Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot 
2019-06-03 at 4.44.15 PM.png

After YARN-6467, QueueMetrics should only be tracking metrics for the default 
partition. However, the metrics are incorrect when labelled partitions are 
involved.

Steps to reproduce

==============
 # Configure capacity-scheduler.xml with label configuration
 # Add label "test" to cluster and replace label on node1 to be "test"
 # Note down "totalMB" at 
<resourcemanager.webapp.address:port>/ws/v1/cluster/metrics
 # Start first job on test queue.
 # Start second job on default queue (the problem does not reproduce if the 
order of the two jobs is swapped).
 # While the two applications are running, the "totalMB" at 
<resourcemanager.webapp.address:port>/ws/v1/cluster/metrics will go down by the 
amount of MB used by the first job (screenshots attached).
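
One way to watch for the drop described in the steps above is to poll the cluster metrics endpoint and print each "totalMB" reading next to the first one. The following is only a minimal sketch: the class name TotalMbProbe and its helper methods are hypothetical, it assumes Java 11+ and an unsecured ResourceManager reachable over plain HTTP, and it pulls "totalMB" out of the JSON body with a regular expression rather than a real JSON parser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: polls /ws/v1/cluster/metrics.
class TotalMbProbe {
    // Matches the "totalMB" field of the clusterMetrics JSON object.
    private static final Pattern TOTAL_MB =
        Pattern.compile("\"totalMB\"\\s*:\\s*(-?\\d+)");

    // Extracts the totalMB value from a metrics response body.
    static long parseTotalMb(String json) {
        Matcher m = TOTAL_MB.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("totalMB not found in response");
        }
        return Long.parseLong(m.group(1));
    }

    // Fetches the metrics document from the RM web address (host:port).
    static long fetchTotalMb(String rmWebappAddress) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://" + rmWebappAddress + "/ws/v1/cluster/metrics"))
            .header("Accept", "application/json")
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return parseTotalMb(response.body());
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: TotalMbProbe <rm-host:port>");
            return;
        }
        // First reading is the baseline noted before the jobs start.
        long baseline = fetchTotalMb(args[0]);
        System.out.println("baseline totalMB=" + baseline);
        // Poll every 5 seconds for about a minute while the jobs run.
        for (int i = 0; i < 12; i++) {
            Thread.sleep(5000);
            long current = fetchTotalMb(args[0]);
            System.out.println("totalMB=" + current
                + " delta=" + (current - baseline));
        }
    }
}
```

Start it before submitting the two jobs; if the bug reproduces, a negative delta should appear once both applications are running.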

Alternately:

In 
TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
 add the following lines at the end of the test, before rm1.close():

CSQueue rootQueue = cs.getRootQueue();
assertEquals(10*GB,
    rootQueue.getMetrics().getAvailableMB()
        + rootQueue.getMetrics().getAllocatedMB());

There are two nodes of 10GB each and only one of them has a non-default label. 
The test will also fail against a 20*GB check.



