Siddharth Ahuja created YARN-10528:
--------------------------------------
Summary: maxAMShare should only be accepted for leaf queues, not
parent queues
Key: YARN-10528
URL: https://issues.apache.org/jira/browse/YARN-10528
Project: Hadoop YARN
Issue Type: Bug
Reporter: Siddharth Ahuja
Based on the [Hadoop
documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html],
the {{maxAMShare}} property can only be used for *leaf queues*.
This is similar to the {{reservation}} setting.
However, existing code only ensures that the reservation setting is not
accepted for "parent" queues (see
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L226
and
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/allocation/AllocationFileQueueParser.java#L233)
but it has no equivalent check for {{maxAMShare}}. As a result, it is currently
possible to have an allocation like the one below:
{code}
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<allocations>
  <queue name="root">
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
    <aclSubmitApps>*</aclSubmitApps>
    <aclAdministerApps>*</aclAdministerApps>
    <queue name="default">
      <weight>1.0</weight>
      <schedulingPolicy>drf</schedulingPolicy>
    </queue>
    <queue name="users" type="parent">
      <weight>1.0</weight>
      <schedulingPolicy>drf</schedulingPolicy>
      <maxAMShare>1.0</maxAMShare>
    </queue>
  </queue>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queuePlacementPolicy>
    <rule name="specified" create="true"/>
    <rule name="nestedUserQueue" create="true">
      <rule name="default" create="true" queue="users"/>
    </rule>
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
{code}
where {{maxAMShare}} is 1.0f, meaning that it is possible to allocate 100% of the
queue's resources to Application Masters. Notice above that root.users is a
parent queue, yet it still accepts {{maxAMShare}}. This is contrary to the
documentation and is misleading, because child queues like root.users.<user>
do not inherit this setting at all; they still use the default of 0.5 instead
of 1.0. See the attached screenshot for an example.
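Until such a check exists in the parser, an allocation file can be screened for this misconfiguration outside of YARN. The following is only a minimal sketch, not the Hadoop implementation: the function names are hypothetical, and it assumes a queue counts as a parent if it carries type="parent" or contains nested <queue> elements.

```python
import xml.etree.ElementTree as ET


def is_parent_queue(queue):
    # Assumption: a queue is a parent if it is marked type="parent"
    # or if it has at least one nested <queue> element.
    return queue.get("type") == "parent" or queue.find("queue") is not None


def find_parent_queues_with_max_am_share(allocations_xml):
    """Return the dotted paths of parent queues that set <maxAMShare>."""
    root = ET.fromstring(allocations_xml)
    offenders = []

    def visit(queue, prefix):
        name = queue.get("name")
        path = f"{prefix}.{name}" if prefix else name
        # find() only looks at direct children, so a child queue's
        # maxAMShare is not attributed to its parent.
        if is_parent_queue(queue) and queue.find("maxAMShare") is not None:
            offenders.append(path)
        for child in queue.findall("queue"):
            visit(child, path)

    for top_level in root.findall("queue"):
        visit(top_level, "")
    return offenders
```

Run against the allocation above, this should report root.users, since that is the only parent queue carrying {{maxAMShare}}.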