[ https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819455#comment-16819455 ]
Karthik Palaniappan commented on YARN-9088:
-------------------------------------------
You'd also need to change how usedCapacity (from YARN-6195) is calculated. It
has similar logic that applies only to the default partition.
> Non-exclusive labels break QueueMetrics
> ---------------------------------------
>
> Key: YARN-9088
> URL: https://issues.apache.org/jira/browse/YARN-9088
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, resourcemanager
> Affects Versions: 2.8.5
> Reporter: Brandon Scheller
> Priority: Major
> Labels: metrics, nodelabel
>
> QueueMetrics are broken (random/negative values) when non-exclusive labels
> are being used and unlabeled containers run on labeled nodes.
> This is caused by the change in the patch here:
> https://issues.apache.org/jira/browse/YARN-6467
> It assumes that a container's label will be the same as the node's label that
> it is running on.
> If you look within the patch, metrics are sometimes updated using
> request.getNodeLabelExpression() and sometimes using node.getPartition().
> This means that in the case where the node is labeled while the container
> request isn't, these metrics only get updated when referring to the default
> queue. This stops metrics from balancing out and results in incorrect and
> negative values in QueueMetrics.
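
The imbalance described above can be illustrated with a toy model. This is
only a sketch, not the actual YARN code: the class QueueMetricsSketch, its
methods, and the label "x" are hypothetical, and which code path uses the
node's partition versus the request's label expression is an assumption made
for illustration. The only behaviour taken from the report is that updates
are recorded solely for the default partition and that the two paths may use
different keys.

```java
public class QueueMetricsSketch {
    // The default (unlabeled) partition is represented by the empty string here.
    static final String DEFAULT_PARTITION = "";

    private long allocatedMB = 0;

    // Assumption: updates are recorded only when the key is the default partition.
    void allocate(String partition, long mb) {
        if (DEFAULT_PARTITION.equals(partition)) {
            allocatedMB += mb;
        }
    }

    void release(String partition, long mb) {
        if (DEFAULT_PARTITION.equals(partition)) {
            allocatedMB -= mb;
        }
    }

    long getAllocatedMB() {
        return allocatedMB;
    }

    // Simulates one unlabeled container running on a node with a
    // non-exclusive label, then being released.
    static long simulate(boolean consistentKey) {
        QueueMetricsSketch metrics = new QueueMetricsSketch();
        String requestLabel = DEFAULT_PARTITION; // container request has no label
        String nodePartition = "x";              // hypothetical non-exclusive label

        // Hypothetical allocation path: keyed by the node's partition unless
        // consistentKey is set, so the increment is skipped for partition "x".
        metrics.allocate(consistentKey ? requestLabel : nodePartition, 1024);

        // Hypothetical release path: keyed by the request's label expression,
        // so the decrement is always applied.
        metrics.release(requestLabel, 1024);

        return metrics.getAllocatedMB();
    }

    public static void main(String[] args) {
        System.out.println("mismatched keys: " + simulate(false)); // -1024
        System.out.println("consistent keys: " + simulate(true));  // 0
    }
}
```

With mismatched keys the increment is dropped but the decrement is applied,
so the counter goes negative. Using the same key on both paths keeps the
counter balanced, which is the property any fix would need to restore.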