[ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646964#comment-14646964 ]
Wangda Tan commented on YARN-3945:
----------------------------------

Thanks for summarizing [~Naganarasimha].

I think we *might* need to reconsider the user-limit / user-limit-factor configuration. I can also see that it is hard to understand:
- User-limit is neither a lower bound nor an upper bound.
- User-limit is not a fairness mechanism to balance resources between users; instead, it can lead to bad imbalance. For example, if we set user-limit = 50 and there are 10 users running, we cannot control how much resource each user gets.
- It's really hard to understand. I work on CapacityScheduler almost every day, but I sometimes forget and need to look at the code to see how it is computed. :-(

Basically, user-limit is computed by:
{{user-limit = min(queue-capacity * user-limit-factor, current-capacity * max(user-limit / 100, 1 / #active-user))}}

But this formula is not that meaningful: #active-user changes every minute, so the result is not predictable.

Instead, we may need to consider some notion like fair sharing: user-limit-factor becomes the max-resource-limit of each user, and user-limit-percentage becomes something like guaranteed-concurrent-#user. When #user > guaranteed-concurrent-#user, the remaining users can only get idle shares. With this approach, and considering that we have user-limit-preemption within a queue (YARN-2113), we can get a predictable user-limit.

Thoughts? [~nroberts], [~jlowe].
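
To make the formula above concrete, here is a minimal sketch in Java. It is not the CapacityScheduler implementation: the class name, method name, and parameters are hypothetical, and resources are reduced to a single scalar (real YARN resources are multi-dimensional). It only transcribes the formula as stated in this comment.

```java
/** Illustrative only: mirrors the formula in the comment, not actual CapacityScheduler code. */
final class UserLimitSketch {
    private UserLimitSketch() {}

    /**
     * @param queueCapacity    configured capacity of the queue (scalar, assumed unit)
     * @param userLimitFactor  the user-limit-factor setting
     * @param currentCapacity  capacity currently available to the queue
     * @param userLimitPercent the user-limit setting, as a percentage (0-100)
     * @param activeUsers      number of users with active applications
     */
    static double userLimit(double queueCapacity, double userLimitFactor,
                            double currentCapacity, int userLimitPercent,
                            int activeUsers) {
        // Assumption: treat "no active users" as one user to avoid dividing by zero.
        int users = Math.max(1, activeUsers);
        // max(user-limit / 100, 1 / #active-user)
        double share = Math.max(userLimitPercent / 100.0, 1.0 / users);
        // min(queue-capacity * user-limit-factor, current-capacity * share)
        return Math.min(queueCapacity * userLimitFactor, currentCapacity * share);
    }
}
```

With a queue capacity and current capacity of 100, user-limit-factor = 1, and user-limit = 50, the sketch gives every one of 10 active users a limit of 50, so two users could consume the whole queue. That is the imbalance described in the second bullet above.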

> maxApplicationsPerUser is wrongly calculated
> --------------------------------------------
>
>                 Key: YARN-3945
>                 URL: https://issues.apache.org/jira/browse/YARN-3945
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3945.20150728-1.patch, YARN-3945.20150729-1.patch
>
>
> maxApplicationsPerUser is currently calculated based on the formula
> {{maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * userLimitFactor)}}
> but the description of userlimit is
> {quote}
> Each queue enforces a limit on the percentage of resources allocated to a
> user at any given time, if there is demand for resources. The user limit can
> vary between a minimum and maximum value.{color:red} The former (the
> minimum value) is set to this property value {color} and the latter (the
> maximum value) depends on the number of users who have submitted
> applications. For e.g., suppose the value of this property is 25. If two
> users have submitted applications to a queue, no single user can use more
> than 50% of the queue resources. If a third user submits an application, no
> single user can use more than 33% of the queue resources. With 4 or more
> users, no user can use more than 25% of the queue's resources. A value of 100
> implies no user limits are imposed. The default is 100. Value is specified as
> an integer.
> {quote}
> A configuration related to the minimum limit should not be used in a formula
> to calculate the max applications for a user.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)