[ 
https://issues.apache.org/jira/browse/YARN-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854763#comment-16854763
 ] 

Suma Shivaprasad commented on YARN-9569:
----------------------------------------

+1. Patch LGTM

> Auto-created leaf queues do not honor cluster-wide min/max memory/vcores
> ------------------------------------------------------------------------
>
>                 Key: YARN-9569
>                 URL: https://issues.apache.org/jira/browse/YARN-9569
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>    Affects Versions: 3.2.0
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Major
>         Attachments: YARN-9569.001.patch, YARN-9569.002.patch
>
>
> Auto-created leaf queues do not honor cluster-wide settings for maximum 
> memory/vcores allocation.
> To reproduce:
>  # Set auto-create-child-queue.enabled=true for a parent queue.
>  # Set leaf-queue-template.maximum-allocation-mb=16384.
>  # Set yarn.resource-types.memory-mb.maximum-allocation=16384 in 
> resource-types.xml
>  # Launch a YARN app with a container requesting 16 GB RAM.
>  
> This scenario should work, but instead you get an error similar to this:
> {code:java}
> java.lang.IllegalArgumentException: Queue maximum allocation cannot be larger 
> than the cluster setting for queue root.auto.test max allocation per queue: 
> <memory:16384, vCores:1> cluster setting: <memory:8192, vCores:4>   {code}
>  
> This seems to be caused by this code in 
> ManagedParentQueue.getLeafQueueConfigs:
> {code:java}
> CapacitySchedulerConfiguration leafQueueConfigTemplate = new
>     CapacitySchedulerConfiguration(new Configuration(false), false);{code}
>  
> This initializes a new leaf queue configuration that does not read 
> resource-types.xml (or any other config). Later, this 
> CapacitySchedulerConfiguration instance calls 
> ResourceUtils.fetchMaximumAllocationFromConfig() from its 
> getMaximumAllocationPerQueue() method and passes itself as the configuration 
> to use. Since the resource types are not present, ResourceUtils falls back to 
> compiled-in defaults of 8GB RAM, 4 cores.
>  
> I was able to work around this with a custom AutoCreatedQueueManagementPolicy 
> implementation which does something like this in init() and reinitialize():
> {code:java}
> for (Map.Entry<String, String> entry : this.scheduler.getConfiguration()) {
>   if (entry.getKey().startsWith("yarn.resource-types")) {
>     parentQueue.getLeafQueueTemplate().getLeafQueueConfigs()
>         .set(entry.getKey(), entry.getValue());
>   }
> }
> {code}
> However, this is obviously a very hacky way to solve the problem.
> I can submit a proper patch if someone can provide some direction as to the 
> best way to proceed.
>  
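
For readers following the description above, here is a minimal sketch of the fallback and of the prefix-copy workaround. It is a simplified model built on plain java.util maps, not Hadoop code: the class LeafQueueConfigSketch and its methods are hypothetical names, and the only values taken from the issue are the yarn.resource-types prefix, the memory-mb.maximum-allocation key, and the 8 GB compiled-in default.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the behaviour described in YARN-9569; not Hadoop code. */
public class LeafQueueConfigSketch {
    // Prefix and key as quoted in the issue description.
    static final String PREFIX = "yarn.resource-types";
    static final String MAX_MEMORY_KEY =
        "yarn.resource-types.memory-mb.maximum-allocation";
    // Compiled-in default reported in the issue (8 GB).
    static final long DEFAULT_MAX_MEMORY_MB = 8192L;

    /** Returns the configured maximum, or the default when the key is absent. */
    static long maxMemoryMb(Map<String, String> conf) {
        String value = conf.get(MAX_MEMORY_KEY);
        return value == null ? DEFAULT_MAX_MEMORY_MB : Long.parseLong(value);
    }

    /** Copies every resource-type setting from one configuration to another. */
    static void copyResourceTypeSettings(Map<String, String> from,
                                         Map<String, String> to) {
        for (Map.Entry<String, String> entry : from.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                to.put(entry.getKey(), entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the cluster-wide scheduler configuration.
        Map<String, String> cluster = new HashMap<>();
        cluster.put(MAX_MEMORY_KEY, "16384");
        cluster.put("yarn.scheduler.capacity.root.queues", "auto");

        // Stand-in for the empty leaf queue template configuration.
        Map<String, String> leaf = new HashMap<>();
        System.out.println(maxMemoryMb(leaf)); // prints 8192 (default)

        copyResourceTypeSettings(cluster, leaf);
        System.out.println(maxMemoryMb(leaf)); // prints 16384
    }
}
```

In this model the empty template reports 8192 MB until the resource-type keys are copied in, which mirrors why a 16384 MB queue maximum is rejected against an apparent 8192 MB cluster limit.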



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
