[ 
https://issues.apache.org/jira/browse/YARN-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366823#comment-17366823
 ] 

Eric Payne commented on YARN-10821:
-----------------------------------

[~gandras], can you please provide a step-by-step use case to reproduce the 
problem you are encountering?

There were a lot of factors involved in the design of the user limit 
calculations in {{UsersManager#computeUserLimit}}, and I am reluctant to change 
them because they affect resource allocation as well as preemption. Some of the 
background for the user-limit calculations can be found in YARN-5889.

It may be appropriate to modify 
{{LeafQueue#getTotalPendingResourcesConsideringUserLimit}}, but I need to have 
a better understanding of the use case. I am a little confused about what you 
are seeing:
bq. What we have also observed, is that by turning off 
minimum-user-limit-percent (setting it to 100), there was no issue. But when we 
set MULP to, say, 50 percent, the queue was only granted half of its 
effective capacity, even though the other queue was using 600% of its effective 
capacity and preemption should have kicked in.
If I understand the use case, this should not have happened. The user limit is 
calculated as follows:

> User limit is not calculated as per definition for preemption
> -------------------------------------------------------------
>
>                 Key: YARN-10821
>                 URL: https://issues.apache.org/jira/browse/YARN-10821
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Andras Gyori
>            Assignee: Andras Gyori
>            Priority: Major
>         Attachments: YARN-10821.001.patch
>
>
> Minimum user limit percent (MULP) is a soft limit by definition. Preemption 
> uses pending resources to determine the resources needed by a queue, which is 
> calculated in LeafQueue#getTotalPendingResourcesConsideringUserLimit. This 
> method involves headroom calculated by UsersManager#computeUserLimit. 
> However, the pending resources for preemption are limited in an unexpected 
> fashion.
>  * In LeafQueue#getUserAMResourceLimitPerPartition an effective userLimit is 
> calculated first:
> {code:java}
>  float effectiveUserLimit = Math.max(usersManager.getUserLimit() / 100.0f,
>  1.0f / Math.max(getAbstractUsersManager().getNumActiveUsers(), 1));
> {code}
>  * In UsersManager#computeUserLimit the userLimit is calculated as is 
> (currentCapacity * userLimit)
> {code:java}
>  Resource userLimitResource = Resources.max(resourceCalculator,
>  partitionResource,
>  Resources.divideAndCeil(resourceCalculator, resourceUsed,
>  usersSummedByWeight),
>  Resources.divideAndCeil(resourceCalculator,
>  Resources.multiplyAndRoundDown(currentCapacity, getUserLimit()),
>  100));
> {code}
> The fewer users occupying the queue, the more pronounced this effect will be 
> in preemption.
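
To make the two expressions quoted in the description easier to compare, here is a minimal scalar sketch in Java. It is not the CapacityScheduler code: the class {{UserLimitSketch}}, its method names, and the use of plain {{long}} values in place of {{Resource}} objects are assumptions made for illustration, and the sample numbers in {{main}} are hypothetical.

```java
public class UserLimitSketch {

    // Mirrors the first quoted snippet: the larger of MULP (as a fraction)
    // and an equal split among the active users.
    static float effectiveUserLimit(float userLimitPercent, int numActiveUsers) {
        return Math.max(userLimitPercent / 100.0f,
                1.0f / Math.max(numActiveUsers, 1));
    }

    // Scalar stand-in (assumption) for the second quoted snippet:
    // max( ceil(resourceUsed / usersSummedByWeight),
    //      ceil(floor(currentCapacity * userLimitPercent) / 100) )
    static long userLimitResource(long resourceUsed, long currentCapacity,
                                  float userLimitPercent, float usersSummedByWeight) {
        long equalShare = (long) Math.ceil(resourceUsed / (double) usersSummedByWeight);
        long scaled = (long) Math.floor(currentCapacity * (double) userLimitPercent);
        long mulpShare = (scaled + 99) / 100; // integer ceiling division by 100
        return Math.max(equalShare, mulpShare);
    }

    public static void main(String[] args) {
        // Hypothetical values: MULP = 50, one active user.
        System.out.println(effectiveUserLimit(50f, 1));
        // Hypothetical values: 10 units used, capacity 100, MULP = 50, one user.
        System.out.println(userLimitResource(10, 100, 50f, 1f));
    }
}
```

With a single active user and MULP set to 50, the first expression evaluates to 1.0 (the whole queue), while the second evaluates to 50 when only 10 of 100 units are in use. Whether that difference explains the observed preemption behavior depends on the actual inputs, which is why a step-by-step reproduction is useful.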



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
