[ https://issues.apache.org/jira/browse/YARN-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892081#comment-16892081 ]
Muhammad Samir Khan commented on YARN-9596:
-------------------------------------------
The findbugs warnings come from branch-3.0 itself (pre-patch).
The unit test failures also happen in branch-3.0. They just surface a little
later there because the assert statement comes later in the test in
branch-3.0. Some of the tests fail when I run all tests in
TestNodeLabelContainerAllocation but pass when I run them individually.
> QueueMetrics has incorrect metrics when labelled partitions are involved
> ------------------------------------------------------------------------
>
> Key: YARN-9596
> URL: https://issues.apache.org/jira/browse/YARN-9596
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 2.8.0, 3.3.0
> Reporter: Muhammad Samir Khan
> Assignee: Muhammad Samir Khan
> Priority: Major
> Attachments: Screen Shot 2019-06-03 at 4.41.45 PM.png, Screen Shot
> 2019-06-03 at 4.44.15 PM.png, YARN-9596-branch-3.0.004.patch,
> YARN-9596.001.patch, YARN-9596.002.patch, YARN-9596.003.patch
>
>
> After YARN-6467, QueueMetrics should only be tracking metrics for the default
> partition. However, the metrics are incorrect when labelled partitions are
> involved.
> Steps to reproduce
> ==============
> # Configure capacity-scheduler.xml with label configuration
> # Add label "test" to cluster and replace label on node1 to be "test"
> # Note down "totalMB" at
> <resourcemanager.webapp.address:port>/ws/v1/cluster/metrics
> # Start first job on test queue.
> # Start second job on default queue (the issue does not reproduce if the
> order of the two jobs is swapped).
> # While the two applications are running, the "totalMB" at
> <resourcemanager.webapp.address:port>/ws/v1/cluster/metrics will go down by
> the amount of MB used by the first job (screenshots attached).
> Alternately:
> In
> TestNodeLabelContainerAllocation.testQueueMetricsWithLabelsOnDefaultLabelNode(),
> add the following lines at the end of the test, before rm1.close():
> CSQueue rootQueue = cs.getRootQueue();
> assertEquals(10*GB,
> rootQueue.getMetrics().getAvailableMB() +
> rootQueue.getMetrics().getAllocatedMB());
> There are two nodes of 10GB each, and only one of them has a non-default
> label. The test also fails against a 20*GB check.
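
The expectation behind the assertion quoted above can be modelled in
isolation. This is only a minimal sketch and not YARN code: the class
DefaultPartitionMetrics, its methods, and the job sizes are hypothetical, and
the empty string is assumed to denote the default partition. It takes from the
description only the setup (two 10GB nodes, one labelled "test") and the rule
that metrics should be tracked for the default partition alone.

```java
public class PartitionMetricsSketch {
    static final int GB = 1024; // MB per GB

    /** Hypothetical stand-in for queue metrics that track only the default partition. */
    static final class DefaultPartitionMetrics {
        // Assumption for this sketch: an empty label means the default partition.
        private static final String DEFAULT_PARTITION = "";
        private int availableMB;
        private int allocatedMB;

        /** Registers a node; only default-partition memory is counted. */
        void addNode(String label, int memoryMB) {
            if (DEFAULT_PARTITION.equals(label)) {
                availableMB += memoryMB;
            }
        }

        /** Records an allocation; allocations on labelled partitions are ignored. */
        void allocate(String label, int memoryMB) {
            if (!DEFAULT_PARTITION.equals(label)) {
                return;
            }
            availableMB -= memoryMB;
            allocatedMB += memoryMB;
        }

        int getAvailableMB() { return availableMB; }
        int getAllocatedMB() { return allocatedMB; }
    }

    /** Two 10GB nodes, one labelled "test"; job sizes are arbitrary. */
    static DefaultPartitionMetrics runScenario() {
        DefaultPartitionMetrics m = new DefaultPartitionMetrics();
        m.addNode("test", 10 * GB);
        m.addNode("", 10 * GB);
        m.allocate("test", 2 * GB); // first job, labelled partition
        m.allocate("", 1 * GB);     // second job, default partition
        return m;
    }

    public static void main(String[] args) {
        DefaultPartitionMetrics m = runScenario();
        int total = m.getAvailableMB() + m.getAllocatedMB();
        System.out.println("available + allocated == 10*GB: " + (total == 10 * GB));
    }
}
```

In this model the sum of available and allocated memory stays at 10*GB no
matter how much the first job allocates on the "test" partition, which is the
behaviour the quoted assertEquals checks for.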