[
https://issues.apache.org/jira/browse/YARN-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363475#comment-17363475
]
Andras Gyori commented on YARN-10813:
-------------------------------------
Thank you [~snemeth] for your suggestions. It seems that this fix is just the
tip of the iceberg. The problem is that:
* A user can configure a node label for a queue.
* However, unless the capacity of ROOT for that label is manually set to 100,
e.g. yarn.scheduler.capacity.root.accessible-node-labels.test_label.capacity
= 100, the absoluteCapacity will always be 0.
* This issue affects both percentage and absolute resource modes, and node
labels with absolute resources were not even tested (see
TestAbsoluteResourceConfiguration).
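To illustrate why a missing root value zeroes out everything below it, here is a
minimal sketch. It assumes that a queue's absolute capacity for a label is the
product of the per-label capacity fractions along the path from root to that
queue. The class and method names are hypothetical and are not taken from the
Hadoop code base.

```java
public class AbsoluteCapacitySketch {
    /**
     * Multiplies per-label capacities (in percent) along the path from root
     * to a queue. This is an assumed simplification of how the scheduler
     * derives absoluteCapacity, not the actual implementation.
     */
    static float absoluteCapacity(float[] capacitiesPercent) {
        float absolute = 1.0f;
        for (float capacity : capacitiesPercent) {
            absolute *= capacity / 100.0f;
        }
        return absolute;
    }

    public static void main(String[] args) {
        // Root has no capacity configured for the label, so it is read as 0;
        // a child configured with 50% still ends up with 0.
        System.out.println(absoluteCapacity(new float[] {0f, 50f}));
        // Root capacity for the label set to 100 manually; the same child
        // now gets half of the labeled partition.
        System.out.println(absoluteCapacity(new float[] {100f, 50f}));
    }
}
```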
One possible solution is for the capacity to default to 100 for every node
label that has no explicitly configured capacity.
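A rough sketch of what such a default could look like follows. It is not the
attached patch: the configuration is modelled as a plain Map rather than
CapacitySchedulerConfiguration, the class and method names are hypothetical, and
it assumes the default of 100 applies to the root queue only, with other queues
keeping 0.

```java
import java.util.HashMap;
import java.util.Map;

public class LabeledCapacitySketch {
    static final String PREFIX = "yarn.scheduler.capacity.";

    /** Builds the per-label capacity property name for a queue path. */
    static String capacityKey(String queue, String label) {
        return PREFIX + queue + ".accessible-node-labels." + label + ".capacity";
    }

    /**
     * Returns the configured capacity for the label if present. Otherwise
     * falls back to 100 for root and 0 for any other queue (assumption).
     */
    static float labeledCapacity(Map<String, String> conf, String queue,
                                 String label) {
        String raw = conf.get(capacityKey(queue, label));
        if (raw != null) {
            return Float.parseFloat(raw.trim());
        }
        return "root".equals(queue) ? 100.0f : 0.0f;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Only the child queue has a capacity configured for the label.
        conf.put(capacityKey("root.a", "test_label"), "50");

        System.out.println(labeledCapacity(conf, "root", "test_label"));
        System.out.println(labeledCapacity(conf, "root.a", "test_label"));
        System.out.println(labeledCapacity(conf, "root.b", "test_label"));
    }
}
```

With a fallback like this, the user would no longer need to set the root
capacity for each label by hand, while an explicitly configured value would
still take precedence.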
> Root queue capacity is not set when using node labels
> -----------------------------------------------------
>
> Key: YARN-10813
> URL: https://issues.apache.org/jira/browse/YARN-10813
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Andras Gyori
> Assignee: Andras Gyori
> Priority: Major
> Attachments: YARN-10813.001.patch
>
>
> CapacitySchedulerConfiguration#getNonLabeledQueueCapacity handles root in the
> following way:
> {code:java}
> if (absoluteResourceConfigured || configuredWeightAsCapacity(
> configuredCapacity)) {
> // Return capacity in percentage as 0 for non-root queues and 100 for
> // root.From AbstractCSQueue, absolute resource will be parsed and
> // updated. Once nodes are added/removed in cluster, capacity in
> // percentage will also be re-calculated.
> return queue.equals("root") ? 100.0f : 0f;
> }
> {code}
> CapacitySchedulerConfiguration#internalGetLabeledQueueCapacity on the other
> hand does not take root queue into consideration:
> {code:java}
> if (absoluteResourceConfigured || configuredWeightAsCapacity(
> configuredCapacity)) {
> // Return capacity in percentage as 0 for non-root queues and 100 for
> // root.From AbstractCSQueue, absolute resource, and weight will be
> // parsed and updated separately. Once nodes are added/removed in
> // cluster, capacity is percentage will also be re-calculated.
> return defaultValue;
> }
> float capacity = getFloat(capacityPropertyName, defaultValue);
> {code}
> As a result, the labeled root capacity stays 0: it is never updated in
> AbstractCSQueue#derivedCapacityFromAbsoluteConfigurations, because root is
> never in absolute mode.