[
https://issues.apache.org/jira/browse/YARN-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242686#comment-17242686
]
Eric Payne commented on YARN-10496:
-----------------------------------
Thanks [~wangda] for putting this proposal together. I have a couple of
comments.
First, I think option #1 would be the way to go. With option #1, it's clear
whether you want percentages or weights, but with option #2, you lose the
ability to check whether or not the percentages add up to 100%. For people
coming from a FS perspective, this may not seem like a loss, but for admins
used to CS, it is important for the CS bringup to check if you misconfigured
the properties.
Also, with option #1, my guess is that the code will be more straightforward
because once the weights are mapped to relative percentages, the calculations
for user headroom, AM limit, etc. should remain the same.
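To illustrate what I mean, here is a rough sketch of the two modes. It is not
CS code: the function names and the tolerance are made up for illustration,
and I am assuming that weights are normalized among the siblings under one
parent.

```python
def weights_to_percentages(weights):
    """Weight mode: map sibling queue weights to relative percentages.

    `weights` maps a queue name to its weight. The result always sums to
    100, so there is nothing left to validate after the mapping.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("sibling weights must sum to a positive value")
    return {queue: 100.0 * w / total for queue, w in weights.items()}


def check_percentages(percentages, tolerance=1e-6):
    """Percentage mode: reject a config whose siblings do not sum to 100."""
    total = sum(percentages.values())
    if abs(total - 100.0) > tolerance:
        raise ValueError(
            f"sibling capacities sum to {total}, expected 100")


# Weight mode: 1:3 becomes 25% / 75% of the parent.
mapped = weights_to_percentages({"alice": 1, "bob": 3})

# Percentage mode: a misconfiguration (50 + 40) is caught at startup.
try:
    check_percentages({"alice": 50, "bob": 40})
    misconfig_detected = False
except ValueError:
    misconfig_detected = True
```

Once the weights are mapped, everything downstream only sees percentages,
which is why I would expect the existing calculations to stay the same.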
For design option #1, I have a couple of concerns:
- From the design doc, one proposal is to define max capacity for weighted
queues in terms of percentage of the cluster rather than percentage of the
immediate parent. I would oppose this since max capacity in CS has always been
relative to the immediate parent.
- Proposal #1 recommends supporting a different percentage/weight/value for
each resource type (memory/vcores/GPUs/etc.). I feel like that is a major
change and could affect the way that the DRC works in the CS, so I feel that if
we decide to implement that feature, we should separate it out into its own
design, and possibly even separate it from this effort.
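On the first concern, a small worked example of the difference between the two
interpretations of max capacity. This is only a sketch under my assumption
that the absolute limit is the product of the max-capacity percentages along
the queue path; the function name and the queue values are hypothetical.

```python
def absolute_max_capacity(path_max_percents):
    """Relative interpretation: multiply max-capacity percentages
    down the queue path (root first) to get a percentage of the cluster."""
    fraction = 1.0
    for pct in path_max_percents:
        fraction *= pct / 100.0
    return fraction * 100.0


# Hypothetical path: root (100) -> root.user (50) -> root.user.alice (40)
relative_to_parent = absolute_max_capacity([100, 50, 40])

# Cluster interpretation from the design doc: the same "40" on
# root.user.alice would be read directly as a percentage of the cluster.
relative_to_cluster = 40.0
```

With the relative interpretation, alice's limit can never exceed what
root.user is allowed to use. With the cluster interpretation, the same setting
names a larger share than the parent's own limit, so the two readings of one
config value diverge.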
> [Umbrella] Support Flexible Auto Queue Creation in Capacity Scheduler
> ---------------------------------------------------------------------
>
> Key: YARN-10496
> URL: https://issues.apache.org/jira/browse/YARN-10496
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: capacity scheduler
> Reporter: Wangda Tan
> Priority: Major
>
> CapacityScheduler today doesn’t support auto queue creation that is
> flexible enough. The current constraints:
> * Only leaf queues can be auto-created
> * A parent can only have either static queues or dynamic ones. This causes
> multiple constraints. For example:
> * It isn’t possible to have a VIP user like Alice with a static queue
> root.user.alice with 50% capacity while the other user queues (under
> root.user) are created dynamically and they share the remaining 50% of
> resources.
>
> * In comparison, FairScheduler allows the following scenarios, which Capacity
> Scheduler doesn’t:
> ** This implies that there is no possibility to have both dynamically
> created and static queues at the same time under root
> * A new queue needs to be created under an existing parent, while the parent
> already has static queues
> * Nested queue mapping policy, like in the following example:
> |<rule name="nestedUserQueue" create="true">
> <rule name="primaryGroup" create="true" />
> </rule>|
> * Here two levels of queues may need to be created
> If an application belongs to user _alice_ (who has the primary_group of
> _engineering_), the scheduler checks whether _root.engineering_ exists; if it
> doesn’t, it’ll be created. Then the scheduler checks whether
> _root.engineering.alice_ exists, and creates it if it doesn't.
>
> When we try to move users from FairScheduler to CapacityScheduler, we face
> feature gaps which block users from migrating from FS to CS.