[
https://issues.apache.org/jira/browse/YARN-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366823#comment-17366823
]
Eric Payne commented on YARN-10821:
-----------------------------------
[~gandras], can you please provide a step-by-step use case to reproduce the
problem you are encountering?
There were a lot of factors involved in the design of the user limit
calculations in {{UsersManager#computeUserLimit}}, and I am reluctant to change
them because they affect resource allocation as well as preemption. Some of the
background for the user-limit calculations can be found in YARN-5889.
It may be appropriate to modify
{{LeafQueue#getTotalPendingResourcesConsideringUserLimit}}, but I need to have
a better understanding of the use case. I am a little confused about what you
are seeing:
bq. What we have also observed is that by turning off
minimum-user-limit-percent (setting it to 100), there was no issue. But when we
set MULP to, say, 50 percent, the queue was only granted half of its
effective capacity, even though the other queue was using 600% of its effective
capacity and preemption should have kicked in.
If I understand the use case, this should not have happened. The user limit is
calculated as follows:
> User limit is not calculated as per definition for preemption
> -------------------------------------------------------------
>
> Key: YARN-10821
> URL: https://issues.apache.org/jira/browse/YARN-10821
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Andras Gyori
> Assignee: Andras Gyori
> Priority: Major
> Attachments: YARN-10821.001.patch
>
>
> Minimum user limit percent (MULP) is a soft limit by definition. Preemption
> uses pending resources to determine the resources needed by a queue, which is
> calculated in LeafQueue#getTotalPendingResourcesConsideringUserLimit. This
> method involves headroom calculated by UsersManager#computeUserLimit.
> However, the pending resources for preemption are limited in an unexpected
> fashion.
> * In LeafQueue#getUserAMResourceLimitPerPartition an effective userLimit is
> calculated first:
> {code:java}
> float effectiveUserLimit = Math.max(usersManager.getUserLimit() / 100.0f,
>     1.0f / Math.max(getAbstractUsersManager().getNumActiveUsers(), 1));
> {code}
> * In UsersManager#computeUserLimit the configured userLimit is used as is
> (currentCapacity * userLimit / 100):
> {code:java}
> Resource userLimitResource = Resources.max(resourceCalculator,
>     partitionResource,
>     Resources.divideAndCeil(resourceCalculator, resourceUsed,
>         usersSummedByWeight),
>     Resources.divideAndCeil(resourceCalculator,
>         Resources.multiplyAndRoundDown(currentCapacity, getUserLimit()),
>         100));
> {code}
> The fewer users occupying the queue, the more pronounced this effect will be
> in preemption.
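
As a rough illustration of the two snippets quoted above, here is a minimal
single-resource sketch in Java. It is not the Hadoop implementation: the class
UserLimitSketch, its method signatures, and the sample numbers are
hypothetical, plain numbers stand in for Resource objects, partitionResource is
assumed to serve only as comparison context for Resources.max, and any later
adjustments made by the real methods are left out.

```java
// Hypothetical, simplified model of the two quoted calculations.
// Resources are reduced to a single dimension (e.g. memory in MB).
final class UserLimitSketch {

    // Mirrors the first quoted snippet:
    // max(MULP / 100, 1 / max(activeUsers, 1)).
    public static float effectiveUserLimit(float userLimitPercent,
                                           int activeUsers) {
        return Math.max(userLimitPercent / 100.0f,
            1.0f / Math.max(activeUsers, 1));
    }

    // Mirrors the second quoted snippet in one dimension:
    // max(ceil(used / usersSummedByWeight),
    //     ceil(floor(currentCapacity * MULP) / 100)).
    // Assumption: partitionResource is not an operand of the max.
    public static long userLimitResource(long currentCapacity,
                                         long resourceUsed,
                                         double usersSummedByWeight,
                                         double userLimitPercent) {
        long equalShare = (long) Math.ceil(resourceUsed / usersSummedByWeight);
        long mulpShare = (long) Math.ceil(
            Math.floor(currentCapacity * userLimitPercent) / 100.0);
        return Math.max(equalShare, mulpShare);
    }

    public static void main(String[] args) {
        // Sample values (assumed): 100 GB queue, MULP = 50,
        // one active user currently holding 10 GB.
        long capacityMb = 102_400L;
        long usedMb = 10_240L;

        // First formula: a single user gets the full fraction.
        System.out.println(effectiveUserLimit(50f, 1));   // 1.0

        // Second formula: the same user is held to half of the capacity.
        System.out.println(
            userLimitResource(capacityMb, usedMb, 1.0, 50.0));   // 51200
    }
}
```

Under these assumptions, with MULP at 50 and a single active user, the first
formula yields the whole queue while the second yields half of it, which
appears to be the mismatch the description points at.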