[ 
https://issues.apache.org/jira/browse/YARN-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242686#comment-17242686
 ] 

Eric Payne commented on YARN-10496:
-----------------------------------

Thanks [~wangda] for putting this proposal together. I have a couple of 
comments.

First, I think option #1 would be the way to go. With option #1, it's clear 
whether you want percentages or weights, but with option #2, you lose the 
ability to check whether or not the percentages add up to 100%. For people 
coming from a FS perspective, this may not seem like a loss, but for admins 
used to CS, it is important that CS bringup checks for misconfigured 
properties.
Also, with option #1, my guess is that the code will be more straightforward 
because once the weights are mapped to relative percentages, the calculations 
for user headroom, AM limit, etc. should remain the same.
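
A minimal sketch of the two behaviours contrasted above, using hypothetical 
helper names that are not part of the CS code base: mapping sibling weights to 
relative percentages, and the sum-to-100% check that only makes sense when the 
configuration is known to be in percentages.

```python
def weights_to_percentages(weights):
    """Map sibling queue weights to percentages of the parent (hypothetical helper)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("sibling weights must sum to a positive value")
    return {queue: 100.0 * weight / total for queue, weight in weights.items()}


def check_percentages(percentages, tolerance=1e-6):
    """Reject a percentage-mode config whose sibling capacities do not sum to 100."""
    total = sum(percentages.values())
    if abs(total - 100.0) > tolerance:
        raise ValueError(f"sibling capacities sum to {total}, expected 100")


# Weight mode: any positive weights are valid and are normalized.
relative = weights_to_percentages({"alice": 1, "bob": 3})

# Percentage mode: the same downstream math applies, but a typo is caught early.
check_percentages(relative)
```

Once the weights are normalized, anything downstream that consumes percentages 
does not need to know which mode the admin configured.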

For design option #1, I have a couple of concerns:
- From the design doc, one proposal is to define max capacity for weighted 
queues in terms of percentage of the cluster rather than percentage of the 
immediate parent. I would oppose this since max capacity in CS has always been 
relative to the immediate parent.
- Proposal #1 recommends supporting a different percentage/weight/value for 
each resource type (memory/vcores/GPUs/etc.). I feel like that is a major 
change and could affect the way that the DRC works in the CS, so I feel that if 
we decide to implement that feature, we should separate it out into its own 
design, and possibly even separate it from this effort.
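
To illustrate the first concern: when max capacity is relative to the immediate 
parent, a queue's cluster-wide limit is the product of the max capacity 
fractions along its path from the root. A rough sketch, assuming a hypothetical 
helper and made-up queue values:

```python
def absolute_max_capacity(path_max_percentages):
    """Cluster-wide max capacity (percent) for a queue, given the max capacity
    of each queue on its path from the root, each relative to its parent."""
    fraction = 1.0
    for pct in path_max_percentages:
        if not 0.0 <= pct <= 100.0:
            raise ValueError(f"max capacity {pct} is outside 0..100")
        fraction *= pct / 100.0
    return fraction * 100.0


# Hypothetical path root -> root.user -> root.user.alice with max capacities
# of 100%, 50% and 40%, each relative to the immediate parent.
alice_limit = absolute_max_capacity([100.0, 50.0, 40.0])
```

Defining the weighted-queue value directly as a percentage of the cluster would 
bypass this multiplication, so the same setting would mean different things 
depending on the queue's mode.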


> [Umbrella] Support Flexible Auto Queue Creation in Capacity Scheduler
> ---------------------------------------------------------------------
>
>                 Key: YARN-10496
>                 URL: https://issues.apache.org/jira/browse/YARN-10496
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Priority: Major
>
> CapacityScheduler today doesn’t support an auto queue creation which is 
> flexible enough. The current constraints: 
>  * Only leaf queues can be auto-created
>  * A parent can only have either static queues or dynamic ones. This causes 
> multiple constraints. For example:
>  * It isn’t possible to have a VIP user like Alice with a static queue 
> root.user.alice with 50% capacity while the other user queues (under 
> root.user) are created dynamically and they share the remaining 50% of 
> resources.
>  
>  * In comparison, FairScheduler allows the following scenarios, Capacity 
> Scheduler doesn’t:
>  ** This implies that there is no possibility to have both dynamically 
> created and static queues at the same time under root
>  * A new queue needs to be created under an existing parent, while the parent 
> already has static queues
>  * Nested queue mapping policy, like in the following example: 
> |<rule name="nestedUserQueue" create="true">
>         <rule name="primaryGroup" create="true" />
> </rule>|
>  * Here two levels of queues may need to be created 
> If an application belongs to user _alice_ (who has the primary_group of 
> _engineering_), the scheduler checks whether _root.engineering_ exists; if it 
> doesn’t, it’ll be created. Then the scheduler checks whether 
> _root.engineering.alice_ exists, and creates it if it doesn't.
>  
> When we try to move users from FairScheduler to CapacityScheduler, we face 
> feature gaps that block users from migrating from FS to CS.


