[ 
https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646964#comment-14646964
 ] 

Wangda Tan commented on YARN-3945:
----------------------------------

Thanks for summarizing, [~Naganarasimha]. I think we *might* need to reconsider 
the user-limit / user-limit-factor configuration. I can also see that it is 
hard to understand:
- User-limit is neither a lower bound nor an upper bound.
- User-limit is not a fairness mechanism for balancing resources between users; 
instead, it can lead to severe imbalance. For example, if we set user-limit = 
50 and there are 10 users running, we cannot control how much resource each 
user gets.
- It is really hard to understand. I work on CapacityScheduler almost every 
day, but I still sometimes forget and need to look at the code to see how it 
is computed. :-(
Basically, user-limit is computed as:
{{user-limit = min(queue-capacity * user-limit-factor, current-capacity * 
max(user-limit / 100, 1 / #active-user))}}. But this formula is not that 
meaningful: since #active-user changes every minute, the result is not 
predictable.
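
To make the formula above concrete, here is a minimal Java sketch. It is not the CapacityScheduler code; the class name, method name, parameter names, and the resource figures are all hypothetical, and resources are reduced to a single number for illustration.

```java
public final class UserLimitSketch {
    private UserLimitSketch() {}

    /**
     * Illustrative only: evaluates
     *   min(queueCapacity * userLimitFactor,
     *       currentCapacity * max(userLimitPercent / 100, 1 / activeUsers))
     * with resources modelled as a plain number (an assumption of this sketch).
     */
    public static double userLimit(double queueCapacity, double currentCapacity,
                                   double userLimitPercent, double userLimitFactor,
                                   int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        // Per-user share: the configured percentage or an equal split, whichever is larger.
        double share = Math.max(userLimitPercent / 100.0, 1.0 / activeUsers);
        // Cap the result by the queue capacity scaled by user-limit-factor.
        return Math.min(queueCapacity * userLimitFactor, currentCapacity * share);
    }

    public static void main(String[] args) {
        // Hypothetical queue of 100 units, user-limit = 50, user-limit-factor = 1.
        for (int users : new int[] {1, 2, 10}) {
            System.out.println(users + " active user(s): "
                + userLimit(100, 100, 50, 1, users));
        }
    }
}
```

Under these assumed numbers, the computed limit is 100 for one active user and 50 for both two and ten active users. With ten users the per-user limits add up to far more than the queue holds, which is one way to read the imbalance concern above.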

Instead, we may need to consider a notion like fair sharing: 
user-limit-factor becomes the max-resource-limit of each user, and 
user-limit-percentage becomes something like guaranteed-concurrent-#user. When 
#user > guaranteed-concurrent-#user, the remaining users can only get idle 
shares.

With this approach, and considering we have user-limit-preemption within a 
queue (YARN-2113), we can get a predictable user-limit.

Thoughts? [~nroberts], [~jlowe].

> maxApplicationsPerUser is wrongly calculated
> --------------------------------------------
>
>                 Key: YARN-3945
>                 URL: https://issues.apache.org/jira/browse/YARN-3945
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3945.20150728-1.patch, YARN-3945.20150729-1.patch
>
>
> maxApplicationsPerUser is currently calculated based on the formula
> {{maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * 
> userLimitFactor)}}, but the description of userlimit is 
> {quote}
> Each queue enforces a limit on the percentage of resources allocated to a 
> user at any given time, if there is demand for resources. The user limit can 
> vary between a minimum and maximum value.{color:red} The former (the 
> minimum value) is set to this property value {color} and the latter (the 
> maximum value) depends on the number of users who have submitted 
> applications. For e.g., suppose the value of this property is 25. If two 
> users have submitted applications to a queue, no single user can use more 
> than 50% of the queue resources. If a third user submits an application, no 
> single user can use more than 33% of the queue resources. With 4 or more 
> users, no user can use more than 25% of the queue's resources. A value of 100 
> implies no user limits are imposed. The default is 100. Value is specified as 
> an integer.
> {quote}
> A configuration related to the minimum limit should not be used in a formula 
> to calculate the maximum number of applications for a user.
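
As a worked illustration of the formula quoted in the issue description, the sketch below evaluates it for a few inputs. The class and method names are hypothetical, and the input values (10000 applications, user limit 25 or 100, factor 1 or 2) are chosen only for the arithmetic; this is not the scheduler's actual code path.

```java
public final class MaxAppsPerUserSketch {
    private MaxAppsPerUserSketch() {}

    /** Mirrors the expression from the issue description; names are illustrative. */
    public static int maxApplicationsPerUser(int maxApplications, float userLimit,
                                             float userLimitFactor) {
        return (int) (maxApplications * (userLimit / 100.0f) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 10000 applications allowed in the queue.
        System.out.println(maxApplicationsPerUser(10000, 25f, 1f));
        System.out.println(maxApplicationsPerUser(10000, 25f, 2f));
        System.out.println(maxApplicationsPerUser(10000, 100f, 1f));
    }
}
```

With these assumed inputs, a user limit of 25 and a factor of 1 yields 2500 applications per user regardless of how many users are active. That is the behaviour the report questions, since the documentation describes the property as a minimum.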



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
