[
https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805189#comment-17805189
]
ASF GitHub Bot commented on YARN-11641:
---------------------------------------
tomicooler opened a new pull request, #6435:
URL: https://github.com/apache/hadoop/pull/6435
### Description of PR
WIP until the other 2 tickets are merged; I'll rebase this PR.
Details in the Jira:
[YARN-11641](https://issues.apache.org/jira/browse/YARN-11641)
Note: it is not possible to rely on the capacityVectors (at least not for
the root queue, which is always in percentage mode with 100%). So I decided to
go with the `checkConfigTypeIsAbsoluteResource` approach.
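
A minimal sketch of the idea behind a configuration-based check, not the actual Hadoop implementation: instead of inspecting the parsed resource (which is zero for `[memory=0, vcores=0]`), it looks at the shape of the raw capacity string. The class name, method name, and the regular expression below are hypothetical and simplified; the pattern used by `checkConfigTypeIsAbsoluteResource` may differ.

```java
import java.util.regex.Pattern;

class AbsoluteConfigSketch {
    // Hypothetical, simplified pattern: a bracketed, comma-separated list of
    // key=value pairs such as "[memory=1024, vcores=1]".
    private static final Pattern BRACKETED_RESOURCE = Pattern.compile(
        "^\\[\\s*\\w+\\s*=\\s*\\w+(\\s*,\\s*\\w+\\s*=\\s*\\w+)*\\s*\\]$");

    // Returns true when the raw configured capacity is written in the
    // bracketed (absolute) form, regardless of whether the values are zero.
    static boolean looksLikeAbsoluteResource(String rawCapacity) {
        return rawCapacity != null
            && BRACKETED_RESOURCE.matcher(rawCapacity.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeAbsoluteResource("[memory=0, vcores=0]"));   // true
        System.out.println(looksLikeAbsoluteResource("[memory=1024,vcores=1]")); // true
        System.out.println(looksLikeAbsoluteResource("100"));                    // false
        System.out.println(looksLikeAbsoluteResource("1w"));                     // false
    }
}
```

Because the decision depends only on how the value was written, a queue configured with all-zero absolute resources is still classified as absolute.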
### How was this patch tested?
Tested manually and added a unit test.
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'YARN-11641 Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> Can't update a queue hierarchy in absolute mode when the configured
> capacities are zero
> ---------------------------------------------------------------------------------------
>
> Key: YARN-11641
> URL: https://issues.apache.org/jira/browse/YARN-11641
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.4.0
> Reporter: Tamas Domok
> Assignee: Tamas Domok
> Priority: Major
> Attachments: hierarchy.png
>
>
> h2. Error symptoms
> It is not possible to modify a queue hierarchy in absolute mode when the
> parent or every child queue of the parent has 0 min resource configured.
> {noformat}
> 2024-01-05 15:38:59,016 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager:
> Initialized queue: root.a.c
> 2024-01-05 15:38:59,016 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: Exception
> thrown when modifying configuration.
> java.io.IOException: Failed to re-init queues : Parent=root.a: When absolute
> minResource is used, we must make sure both parent and child all use absolute
> minResource
> {noformat}
> h2. Reproduction
> capacity-scheduler.xml
> {code:xml}
> <?xml version="1.0"?>
> <configuration>
> <property>
> <name>yarn.scheduler.capacity.root.queues</name>
> <value>default,a</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.capacity</name>
> <value>[memory=40960, vcores=16]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.default.capacity</name>
> <value>[memory=1024, vcores=1]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
> <value>[memory=1024, vcores=1]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.capacity</name>
> <value>[memory=0, vcores=0]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
> <value>[memory=39936, vcores=15]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.queues</name>
> <value>b,c</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.b.capacity</name>
> <value>[memory=0, vcores=0]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.b.maximum-capacity</name>
> <value>[memory=39936, vcores=15]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.c.capacity</name>
> <value>[memory=0, vcores=0]</value>
> </property>
> <property>
> <name>yarn.scheduler.capacity.root.a.c.maximum-capacity</name>
> <value>[memory=39936, vcores=15]</value>
> </property>
> </configuration>
> {code}
> !hierarchy.png!
> updatequeue.xml
> {code:xml}
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <sched-conf>
> <update-queue>
> <queue-name>root.a</queue-name>
> <params>
> <entry>
> <key>capacity</key>
> <value>[memory=1024,vcores=1]</value>
> </entry>
> <entry>
> <key>maximum-capacity</key>
> <value>[memory=39936,vcores=15]</value>
> </entry>
> </params>
> </update-queue>
> </sched-conf>
> {code}
> {code}
> $ curl -X PUT -H 'Content-Type: application/xml' -d @updatequeue.xml \
>     http://localhost:8088/ws/v1/cluster/scheduler-conf\?user.name\=yarn
> Failed to re-init queues : Parent=root.a: When absolute minResource is used,
> we must make sure both parent and child all use absolute minResource
> {code}
> h2. Root cause
> setChildQueues is called during reinit, where:
> {code:java}
> void setChildQueues(Collection<CSQueue> childQueues) throws IOException {
>   writeLock.lock();
>   try {
>     boolean isLegacyQueueMode =
>         queueContext.getConfiguration().isLegacyQueueMode();
>     if (isLegacyQueueMode) {
>       QueueCapacityType childrenCapacityType =
>           getCapacityConfigurationTypeForQueues(childQueues);
>       QueueCapacityType parentCapacityType =
>           getCapacityConfigurationTypeForQueues(ImmutableList.of(this));
>       if (childrenCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE
>           || parentCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE) {
>         // We don't allow any mixed absolute + {weight, percentage} between
>         // children and parent
>         if (childrenCapacityType != parentCapacityType &&
>             !this.getQueuePath()
>                 .equals(CapacitySchedulerConfiguration.ROOT)) {
>           throw new IOException("Parent=" + this.getQueuePath()
>               + ": When absolute minResource is used, we must make sure both "
>               + "parent and child all use absolute minResource");
>         }
> {code}
> The parent's or the children's capacity type is detected as PERCENTAGE,
> because getCapacityConfigurationTypeForQueues fails to detect absolute mode
> when the configured min resource is zero (it equals Resources.none()), here:
> {code:java}
> if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
>     .equals(Resources.none())) {
>   absoluteMinResSet = true;
> {code}
> (It only happens in legacy queue mode.)
> h2. Possible fixes
> Possible fix in AbstractParentQueue.getCapacityConfigurationTypeForQueues
> using the capacityVector:
> {code:java}
> for (CSQueue queue : queues) {
>   for (String nodeLabel : queueCapacities.getExistingNodeLabels()) {
>     Set<QueueCapacityVector.ResourceUnitCapacityType> definedCapacityTypes =
>         queue.getConfiguredCapacityVector(nodeLabel).getDefinedCapacityTypes();
>     if (definedCapacityTypes.size() == 1) {
>       QueueCapacityVector.ResourceUnitCapacityType next =
>           definedCapacityTypes.iterator().next();
>       if (Objects.requireNonNull(next) == PERCENTAGE) {
>         percentageIsSet = true;
>         diagMsg.append("{Queue=").append(queue.getQueuePath())
>             .append(", label=").append(nodeLabel)
>             .append(" uses percentage mode}. ");
>       } else if (next == QueueCapacityVector.ResourceUnitCapacityType.ABSOLUTE) {
>         absoluteMinResSet = true;
>         diagMsg.append("{Queue=").append(queue.getQueuePath())
>             .append(", label=").append(nodeLabel)
>             .append(" uses absolute mode}. ");
>       } else if (next == QueueCapacityVector.ResourceUnitCapacityType.WEIGHT) {
>         weightIsSet = true;
>         diagMsg.append("{Queue=").append(queue.getQueuePath())
>             .append(", label=").append(nodeLabel)
>             .append(" uses weight mode}. ");
>       }
>     } else if (definedCapacityTypes.size() > 1) {
>       mixedIsSet = true;
>       diagMsg.append("{Queue=").append(queue.getQueuePath())
>           .append(", label=").append(nodeLabel)
>           .append(" uses mixed mode}. ");
>     }
>   }
> }
> {code}
> Pre capacityVector, we could utilise checkConfigTypeIsAbsoluteResource, e.g.:
> {code:java}
> - if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
> -     .equals(Resources.none())) {
> + if (checkConfigTypeIsAbsoluteResource(queue.getQueuePath(), nodeLabel)) {
> {code}
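
The failure described in the issue can be modelled in miniature. The sketch below is a simplified, self-contained illustration and not the Hadoop code: `QueueStub`, `typeByValue`, `typeByConfig`, and `isMixed` are hypothetical names, only two capacity types are modelled, and the bracket test stands in for whatever `checkConfigTypeIsAbsoluteResource` actually does. It shows that value-based detection reports a parent/child mismatch for the reproduction above, while detection based on the raw configuration does not.

```java
import java.util.List;

class MixedModeCheckSketch {
    enum CapacityType { PERCENTAGE, ABSOLUTE_RESOURCE }

    // Hypothetical stand-in for a queue: the raw capacity string from the
    // configuration plus the parsed minimum resource.
    static final class QueueStub {
        final String path;
        final String rawCapacity;
        final long memory;
        final int vcores;

        QueueStub(String path, String rawCapacity, long memory, int vcores) {
            this.path = path;
            this.rawCapacity = rawCapacity;
            this.memory = memory;
            this.vcores = vcores;
        }
    }

    // Value-based detection: absolute mode is only recognised when some queue
    // has a non-zero minimum resource (analogous to comparing against an
    // all-zero resource).
    static CapacityType typeByValue(List<QueueStub> queues) {
        for (QueueStub q : queues) {
            if (q.memory != 0 || q.vcores != 0) {
                return CapacityType.ABSOLUTE_RESOURCE;
            }
        }
        return CapacityType.PERCENTAGE;
    }

    // Config-based detection: absolute mode is recognised from the bracketed
    // form of the raw value, even when every value is zero.
    static CapacityType typeByConfig(List<QueueStub> queues) {
        for (QueueStub q : queues) {
            String raw = q.rawCapacity.trim();
            if (raw.startsWith("[") && raw.endsWith("]")) {
                return CapacityType.ABSOLUTE_RESOURCE;
            }
        }
        return CapacityType.PERCENTAGE;
    }

    // Simplified form of the parent/child consistency rule: if either side is
    // absolute, both must be.
    static boolean isMixed(CapacityType parent, CapacityType children) {
        boolean anyAbsolute = parent == CapacityType.ABSOLUTE_RESOURCE
            || children == CapacityType.ABSOLUTE_RESOURCE;
        return anyAbsolute && parent != children;
    }

    public static void main(String[] args) {
        // root.a after the update request; root.a.b and root.a.c unchanged.
        QueueStub a = new QueueStub("root.a", "[memory=1024,vcores=1]", 1024, 1);
        List<QueueStub> children = List.of(
            new QueueStub("root.a.b", "[memory=0, vcores=0]", 0, 0),
            new QueueStub("root.a.c", "[memory=0, vcores=0]", 0, 0));

        // Value-based: parent is absolute, children look like percentage.
        System.out.println(
            isMixed(typeByValue(List.of(a)), typeByValue(children)));   // true
        // Config-based: both sides are absolute, so the update is accepted.
        System.out.println(
            isMixed(typeByConfig(List.of(a)), typeByConfig(children))); // false
    }
}
```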