[ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355783#comment-17355783 ]
Andras Gyori commented on YARN-10796:
-------------------------------------

Thanks [~pbacsko] for the patch and for creating a unit test for UsersManager. I have the same suggestion as [~bteke].

Apart from this, I have one concern regarding this change: it is going to be functionally different from the previous behavior. I hate to suggest yet another configuration property, because YARN is already heavily bloated, but:
* This is going to change how zero capacity queues work. It might not be acceptable for all users to allow zero capacity queues to allocate resources at all.
* I also found that there can be zero capacity static queues as well (see ParentQueue#allowZeroCapacitySum).

That being said, a user would probably stop a queue to indicate that it is temporarily not accepting any new submissions. These are all speculations, and I would not introduce yet another property if it is not necessary. What is your opinion about it?

> Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-10796
>                 URL: https://issues.apache.org/jira/browse/YARN-10796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-10796-001.patch, YARN-10796-002.patch
>
>
> If we have a dynamic queue (AutoCreatedLeafQueue) with capacity = 0%, then it
> cannot properly scale even if its max-capacity and the parent's max-capacity
> would allow it.
> Example:
> {noformat}
> Cluster Capacity: 16 GB / 16 cpu (2 nodes, each with 8 GB / 8 cpu)
> Container allocation size: 1 GB / 1 vcore
> root.dynamic
>    Effective Capacity:     <memory: 8192, vCores: 8> ( 50.0%)
>    Effective Max Capacity: <memory:16384, vCores:16> (100.0%)
>    Template:
>       Capacity:          40%
>       Max Capacity:      100%
>       User Limit Factor: 4
> {noformat}
> leaf-queue-template.capacity = 40%
> leaf-queue-template.maximum-capacity = 100%
> leaf-queue-template.maximum-am-resource-percent = 50%
> leaf-queue-template.minimum-user-limit-percent = 100%
> leaf-queue-template.user-limit-factor = 4
> "root.dynamic" has a maximum capacity of 100% and a capacity of 50%.
> Let's assume there are running containers in these dynamic queues (MR sleep
> jobs):
> root.dynamic.user1 = 1 AM + 3 containers (capacity = 40%)
> root.dynamic.user2 = 1 AM + 3 containers (capacity = 40%)
> root.dynamic.user3 = 1 AM + 15 containers (capacity = 0%)
> This scenario will result in an underutilized cluster. There will be
> approximately 18% unused capacity. On the other hand, it is still possible to
> submit a new application to root.dynamic.user1 or root.dynamic.user2 and
> reach 100% utilization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
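
For readers who want to reproduce the quoted example, the settings for "root.dynamic" might be written in capacity-scheduler.xml roughly as follows. This is only a sketch and is not taken from the issue: the full "yarn.scheduler.capacity.root.dynamic." key prefix, the "auto-create-child-queue.enabled" switch, and writing the 50% AM limit as the fraction 0.5 are assumptions based on the usual Capacity Scheduler conventions. Sibling queues under root are omitted because the issue does not describe them.

```xml
<!-- Sketch only: a fragment approximating the quoted example, not a complete file. -->
<configuration>
  <!-- Parent queue: 50% capacity, allowed to grow to 100% (values from the issue). -->
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.maximum-capacity</name>
    <value>100</value>
  </property>
  <!-- Assumption: leaf queues are auto-created under this parent. -->
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.auto-create-child-queue.enabled</name>
    <value>true</value>
  </property>
  <!-- Template applied to each auto-created leaf queue (values from the issue). -->
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.leaf-queue-template.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.leaf-queue-template.maximum-capacity</name>
    <value>100</value>
  </property>
  <!-- Assumption: 50% is expressed as a fraction for this property. -->
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.leaf-queue-template.maximum-am-resource-percent</name>
    <value>0.5</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.leaf-queue-template.minimum-user-limit-percent</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dynamic.leaf-queue-template.user-limit-factor</name>
    <value>4</value>
  </property>
</configuration>
```

With a 40% template, the first two leaf queues consume 80% of the parent's capacity, so a third leaf queue such as root.dynamic.user3 ends up with the 0% capacity described in the issue.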