Tamas Domok created YARN-11641:
----------------------------------

Summary: Can't update a queue hierarchy in absolute mode when the configured capacities are zero
Key: YARN-11641
URL: https://issues.apache.org/jira/browse/YARN-11641
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 3.4.0
Reporter: Tamas Domok
Assignee: Tamas Domok

h2. Error symptoms

It is not possible to modify a queue hierarchy in absolute mode when the parent or every child queue of the parent has 0 min resource configured.

{noformat}
2024-01-05 15:38:59,016 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager: Initialized queue: root.a.c
2024-01-05 15:38:59,016 ERROR org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: Exception thrown when modifying configuration.
java.io.IOException: Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{noformat}

h2. Reproduction

capacity-scheduler.xml
{code:xml}
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,a</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>[memory=40960, vcores=16]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.queues</name>
    <value>b,c</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
</configuration>
{code}

updatequeue.xml
{code:xml}
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sched-conf>
  <update-queue>
    <queue-name>root.a</queue-name>
    <params>
      <entry>
        <key>capacity</key>
        <value>[memory=1024,vcores=1]</value>
      </entry>
      <entry>
        <key>maximum-capacity</key>
        <value>[memory=39936,vcores=15]</value>
      </entry>
    </params>
  </update-queue>
</sched-conf>
{code}

{code}
$ curl -X PUT -H 'Content-Type: application/xml' -d @updatequeue.xml http://localhost:8088/ws/v1/cluster/scheduler-conf\?user.name\=yarn
Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{code}

h2. Root cause

setChildQueues is called during reinit, where:

{code:java}
  void setChildQueues(Collection<CSQueue> childQueues) throws IOException {
    writeLock.lock();
    try {
      boolean isLegacyQueueMode = queueContext.getConfiguration().isLegacyQueueMode();
      if (isLegacyQueueMode) {
        QueueCapacityType childrenCapacityType =
            getCapacityConfigurationTypeForQueues(childQueues);
        QueueCapacityType parentCapacityType =
            getCapacityConfigurationTypeForQueues(ImmutableList.of(this));

        if (childrenCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE
            || parentCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE) {
          // We don't allow any mixed absolute + {weight, percentage} between
          // children and parent
          if (childrenCapacityType != parentCapacityType && !this.getQueuePath()
              .equals(CapacitySchedulerConfiguration.ROOT)) {
            throw new IOException("Parent=" + this.getQueuePath()
                + ": When absolute minResource is used, we must make sure both "
                + "parent and child all use absolute minResource");
          }
{code}

The parent or childrenCapacityType will be considered as PERCENTAGE, because getCapacityConfigurationTypeForQueues fails to detect the absolute mode, here:

{code:java}
        if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
            .equals(Resources.none())) {
          absoluteMinResSet = true;
{code}

h2. Possible fixes

Possible fix in AbstractParentQueue.getCapacityConfigurationTypeForQueues using the capacityVector:

{code:java}
    for (CSQueue queue : queues) {
      for (String nodeLabel : queueCapacities.getExistingNodeLabels()) {
        Set<QueueCapacityVector.ResourceUnitCapacityType> definedCapacityTypes =
            queue.getConfiguredCapacityVector(nodeLabel).getDefinedCapacityTypes();
        if (definedCapacityTypes.size() == 1) {
          QueueCapacityVector.ResourceUnitCapacityType next = definedCapacityTypes.iterator().next();
          if (Objects.requireNonNull(next) == PERCENTAGE) {
            percentageIsSet = true;
            diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
                .append(" uses percentage mode}. ");
          } else if (next == QueueCapacityVector.ResourceUnitCapacityType.ABSOLUTE) {
            absoluteMinResSet = true;
            diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
                .append(" uses absolute mode}. ");
          } else if (next == QueueCapacityVector.ResourceUnitCapacityType.WEIGHT) {
            weightIsSet = true;
            diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
                .append(" uses weight mode}. ");
          }
        } else if (definedCapacityTypes.size() > 1) {
          mixedIsSet = true;
          diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
              .append(" uses mixed mode}. ");
        }
      }
    }
{code}

Pre capacityVector, we could utilise checkConfigTypeIsAbsoluteResource, e.g.:

{code:java}
-        if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
-            .equals(Resources.none())) {
+        if (checkConfigTypeIsAbsoluteResource(queue.getQueuePath(), nodeLabel)) {
{code}
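
To illustrate the root cause in isolation, here is a minimal, self-contained sketch that does not use any Hadoop classes. Everything in it (CapacityTypeSketch, Res, detectByValue, detectBySyntax, and the regular expression) is a hypothetical stand-in, and the syntax-based check is only an assumption about how an absolute-mode test could look, not a copy of checkConfigTypeIsAbsoluteResource. It shows why comparing the configured min resource against a "none" value cannot tell `[memory=0, vcores=0]` apart from an unset absolute resource, while inspecting how the capacity was written can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in types; none of these are Hadoop classes.
final class CapacityTypeSketch {

  enum CapacityType { PERCENTAGE, ABSOLUTE_RESOURCE }

  // Simplified resource with value equality, standing in for a "none" comparison.
  static final class Res {
    static final Res NONE = new Res(0, 0);
    final long memory;
    final int vcores;

    Res(long memory, int vcores) {
      this.memory = memory;
      this.vcores = vcores;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Res && ((Res) o).memory == memory && ((Res) o).vcores == vcores;
    }

    @Override
    public int hashCode() {
      return Long.hashCode(memory) * 31 + vcores;
    }
  }

  // Assumed shape of an absolute capacity value, e.g. "[memory=1024, vcores=1]".
  private static final Pattern ABSOLUTE =
      Pattern.compile("^\\[\\s*memory=(\\d+)\\s*,\\s*vcores=(\\d+)\\s*\\]$");

  // Parses an absolute value; anything else yields NONE (no absolute min resource).
  static Res parseMinResource(String configured) {
    Matcher m = ABSOLUTE.matcher(configured.trim());
    if (!m.matches()) {
      return Res.NONE;
    }
    return new Res(Long.parseLong(m.group(1)), Integer.parseInt(m.group(2)));
  }

  // Value-based detection: mirrors the "min resource != none" idea from the report.
  static CapacityType detectByValue(String configured) {
    return parseMinResource(configured).equals(Res.NONE)
        ? CapacityType.PERCENTAGE
        : CapacityType.ABSOLUTE_RESOURCE;
  }

  // Syntax-based detection: decides from how the capacity was written, not its value.
  static CapacityType detectBySyntax(String configured) {
    return ABSOLUTE.matcher(configured.trim()).matches()
        ? CapacityType.ABSOLUTE_RESOURCE
        : CapacityType.PERCENTAGE;
  }

  public static void main(String[] args) {
    String zero = "[memory=0, vcores=0]";
    System.out.println(detectByValue(zero));   // PERCENTAGE: zero is misdetected
    System.out.println(detectBySyntax(zero));  // ABSOLUTE_RESOURCE
    System.out.println(detectByValue("[memory=1024, vcores=1]")); // ABSOLUTE_RESOURCE
  }
}
```

In this sketch the value-based check classifies root.a (zero capacity) differently from a child with a non-zero absolute capacity, which is the kind of parent/child mismatch that makes setChildQueues throw; the syntax-based check classifies both the same way.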