[ 
https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355783#comment-17355783
 ] 

Andras Gyori commented on YARN-10796:
-------------------------------------

Thanks [~pbacsko] for the patch and for creating a unit test for UsersManager.
I have the same suggestion as [~bteke]. Apart from this, I have one concern
regarding this change: it is going to be functionally different from the
current behaviour. I hate to suggest yet another configuration property,
because YARN is already heavily bloated, but:
 * This is going to change how zero capacity queues work. Some users may not
want zero capacity queues to allocate resources at all.
 * I also found that there can be zero capacity static queues as well (see
ParentQueue#allowZeroCapacitySum).

That being said, a user would probably stop a queue to indicate that it is
temporarily not accepting any new submissions. This is all speculation, and I
would not introduce yet another property unless it is necessary. What is
your opinion?

> Capacity Scheduler: dynamic queue cannot scale out properly if its capacity 
> is 0%
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-10796
>                 URL: https://issues.apache.org/jira/browse/YARN-10796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-10796-001.patch, YARN-10796-002.patch
>
>
> If we have a dynamic queue (AutoCreatedLeafQueue) with capacity = 0%, then it 
> cannot properly scale even if its max-capacity and the parent's max-capacity 
> would allow it.
> Example:
> {noformat}
> Cluster Capacity:  16 GB / 16cpu (2 nodes, each with 8 GB / 8 cpu )
> Container allocation size: 1G / 1 vcore
> root.dynamic 
>     Effective Capacity:      <memory: 8192, vCores: 8> ( 50.0%)
>     Effective Max Capacity:  <memory:16384, vCores:16> (100.0%) 
>     Template:
>         Capacity:               40%
>         Max Capacity:           100%
>         User Limit Factor:      4
>  {noformat}
> leaf-queue-template.capacity = 40%
> leaf-queue-template.maximum-capacity = 100%
> leaf-queue-template.maximum-am-resource-percent = 50%
> leaf-queue-template.minimum-user-limit-percent = 100%
> leaf-queue-template.user-limit-factor = 4
> "root.dynamic" has a maximum capacity of 100% and a capacity of 50%.
> Let's assume there are running containers in these dynamic queues (MR sleep 
> jobs):
>  root.dynamic.user1 = 1 AM + 3 containers (capacity = 40%)
>  root.dynamic.user2 = 1 AM + 3 containers (capacity = 40%)
>  root.dynamic.user3 = 1 AM + 15 containers (capacity = 0%)
> This scenario will result in an underutilized cluster. There will be approx. 
> 18% unused capacity. On the other hand, it's still possible to submit a new 
> application to root.dynamic.user1 or root.dynamic.user2, and reaching 100% 
> utilization is possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
