[
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526900#comment-16526900
]
Mike Billau commented on YARN-8468:
-----------------------------------
Hi [~yufeigu], [~bsteinbach], and team - sorry for the delay.
The motivation behind this case came from one of my customers who has a very
large cluster and many different users. They are using FairScheduler and have
many different rules set up. Overall they are using
"yarn.scheduler.maximum-allocation-mb" to limit the size of containers that
their users create - this is to gently encourage the users to write "better"
jobs and not just request massive containers. This is working fine, except once
in a while they actually DO need to create massive containers for enterprise
jobs. Originally we were looking for ways to "exclude" these specific
enterprise jobs from this maximum-allocation-mb, but since this property is set
globally and applies to all queues, there was no way to do this. If we could
set this property on a per-queue basis, we could achieve this.
Additionally, it looks like you CAN already set this maximum-allocation-mb
setting on a per-queue basis for the CapacityScheduler, so this ticket would
bring the FairScheduler to feature parity with it. Under queue properties on
the CapacityScheduler doc page, we read:
"The per queue maximum limit of memory to allocate to each container request at
the Resource Manager. This setting overrides the cluster configuration
yarn.scheduler.maximum-allocation-mb. This value must be smaller than or equal
to the cluster maximum."
Hopefully that is enough justification - please let me know if you guys need
anything else! I don't have voting power but I agree that the naming scheme is
not friendly to newcomers.
> Limit container sizes per queue in FairScheduler
> ------------------------------------------------
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 3.1.0
> Reporter: Antal Bálint Steinbach
> Assignee: Antal Bálint Steinbach
> Priority: Critical
> Labels: patch
> Attachments: YARN-8468.000.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb"
> to limit the overall size of a container. This applies globally to all
> containers, cannot be limited by queue, and is not scheduler dependent.
>
> The goal of this ticket is to allow this value to be set on a per queue basis.
>
> The use case: User has two pools, one for ad hoc jobs and one for enterprise
> apps. User wants to limit ad hoc jobs to small containers but allow
> enterprise apps to request as many resources as needed. With this change,
> yarn.scheduler.maximum-allocation-mb would set the default maximum container
> size for all queues, and the "maxContainerResources" queue config value
> would set the maximum per queue.
>
> Suggested solution:
>
> All the infrastructure is already in the code. We need to do the following:
> * add the setting to the queue properties for all queue types (parent and
> leaf), this will cover dynamically created queues.
> * if we set it on the root we override the scheduler setting and we should
> not allow that.
> * make sure that queue resource cap can not be larger than scheduler max
> resource cap in the config.
> * implement getMaximumResourceCapability(String queueName) in the
> FairScheduler
> * implement getMaximumResourceCapability() in both FSParentQueue and
> FSLeafQueue as follows
> * expose the setting in the queue information in the RM web UI.
> * expose the setting in the metrics etc for the queue.
> * write JUnit tests.
> * update the scheduler documentation.
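
The lookup rules in the suggested solution above can be sketched in isolation.
This is a minimal standalone illustration, not the attached patch and not
Hadoop API: the class QueueMaxAllocationSketch and its methods are
hypothetical names, memory is the only resource modelled, and the assumption
that a queue without its own cap inherits the nearest ancestor's cap is mine
(the ticket only says the setting should cover dynamically created queues).
Rejecting an oversized cap, rather than clamping it, is also just one possible
reading of "can not be larger than scheduler max".

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Standalone sketch of per-queue maximum container memory resolution.
 * Nothing here is Hadoop API; every name is hypothetical.
 */
public class QueueMaxAllocationSketch {

    /** Stand-in for the cluster-wide yarn.scheduler.maximum-allocation-mb. */
    private final long schedulerMaxMb;

    /** Explicit per-queue caps in MB, keyed by full queue name ("root.adhoc"). */
    private final Map<String, Long> queueMaxMb = new HashMap<>();

    public QueueMaxAllocationSketch(long schedulerMaxMb) {
        if (schedulerMaxMb <= 0) {
            throw new IllegalArgumentException("scheduler maximum must be positive");
        }
        this.schedulerMaxMb = schedulerMaxMb;
    }

    /** Registers a per-queue cap, enforcing the two restrictions from the ticket. */
    public void setQueueMax(String queueName, long maxMb) {
        // Setting the value on root would override the scheduler setting.
        if ("root".equals(queueName)) {
            throw new IllegalArgumentException("root may not override the scheduler maximum");
        }
        // A queue cap may not exceed the scheduler-wide cap (assumed: reject, not clamp).
        if (maxMb <= 0 || maxMb > schedulerMaxMb) {
            throw new IllegalArgumentException("queue maximum must be in (0, " + schedulerMaxMb + "]");
        }
        queueMaxMb.put(queueName, maxMb);
    }

    /**
     * Resolves the cap for a queue: its own value if set, otherwise the nearest
     * ancestor's value (assumed inheritance), otherwise the scheduler maximum.
     */
    public long getMaximumAllocationMb(String queueName) {
        String name = queueName;
        while (name != null) {
            Long cap = queueMaxMb.get(name);
            if (cap != null) {
                return cap;
            }
            int dot = name.lastIndexOf('.');
            name = dot < 0 ? null : name.substring(0, dot);
        }
        return schedulerMaxMb;
    }

    public static void main(String[] args) {
        QueueMaxAllocationSketch sketch = new QueueMaxAllocationSketch(65536);
        sketch.setQueueMax("root.adhoc", 4096);
        // A dynamically created child of root.adhoc inherits the 4096 MB cap.
        System.out.println(sketch.getMaximumAllocationMb("root.adhoc.alice"));
        // A queue with no cap anywhere in its ancestry uses the scheduler maximum.
        System.out.println(sketch.getMaximumAllocationMb("root.enterprise"));
    }
}
```

In this sketch the ad hoc pool is held to small containers while the
enterprise pool keeps the cluster-wide limit, which matches the use case in
the description.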
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)