[
https://issues.apache.org/jira/browse/YARN-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363404#comment-17363404
]
Andras Gyori edited comment on YARN-10821 at 6/15/21, 7:14 AM:
---------------------------------------------------------------
Thanks [~epayne] for the detailed answer and the time taken to dive deep into
this issue. I agree with what you said; I only used
getUserAMResourceLimitPerPartition as an example of how I think the user limit
should be calculated. As you have already mentioned, the headroom calculation
also uses the userLimit, which is why I am not sure whether this change will
wreak havoc in the calculation (though the unit tests already show that there
is an issue here).
The problem we have observed is inside
LeafQueue#getTotalPendingResourcesConsideringUserLimit, which aggregates the
pending resources of all apps in the queue. As far as I can tell,
TempQueuePerPartition, which is used by preemption, relies on pending resources
to determine how much resource a queue should get from preemption. However,
because the headroom is calculated in this fashion, when only one active user
was present in the queue, the pending resource was only effective capacity *
user limit (instead of capacity * 1 / active users).
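To make the discrepancy concrete, here is a minimal, self-contained sketch
contrasting the two formulas (an illustration only, with made-up names and
numbers, not the actual YARN code):
{code:java}
public class UserLimitSketch {
  public static void main(String[] args) {
    float mulp = 0.25f;                // minimum-user-limit-percent = 25%
    int activeUsers = 1;               // a single active user in the queue
    long queueCapacityMb = 100 * 1024; // 100 GB effective queue capacity

    // In the style of LeafQueue#getUserAMResourceLimitPerPartition:
    // the limit is raised to 1/activeUsers when few users are active.
    float amStyleLimit = Math.max(mulp, 1.0f / Math.max(activeUsers, 1));

    // In the style of UsersManager#computeUserLimit, which feeds the
    // headroom that caps the pending resources considered by preemption:
    float headroomStyleLimit = mulp;

    System.out.printf("AM-limit style user limit: %.0f MB%n",
        queueCapacityMb * amStyleLimit);       // 102400 MB (full capacity)
    System.out.printf("headroom style user limit: %.0f MB%n",
        queueCapacityMb * headroomStyleLimit); // 25600 MB (25% of capacity)
  }
}
{code}
With one active user the first formula yields the full queue capacity, while
the second caps the considered pending resources at 25% of it, which matches
the behavior described above.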
> User limit is not calculated as per definition for preemption
> -------------------------------------------------------------
>
> Key: YARN-10821
> URL: https://issues.apache.org/jira/browse/YARN-10821
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Andras Gyori
> Assignee: Andras Gyori
> Priority: Major
> Attachments: YARN-10821.001.patch
>
>
> Minimum user limit percent (MULP) is a soft limit by definition. Preemption
> uses pending resources to determine the resources needed by a queue, which
> are calculated in LeafQueue#getTotalPendingResourcesConsideringUserLimit.
> This method involves the headroom calculated by UsersManager#computeUserLimit.
> However, the pending resources for preemption are limited in an unexpected
> fashion.
> * In LeafQueue#getUserAMResourceLimitPerPartition an effective userLimit is
> calculated first:
> {code:java}
> float effectiveUserLimit = Math.max(usersManager.getUserLimit() / 100.0f,
>     1.0f / Math.max(getAbstractUsersManager().getNumActiveUsers(), 1));
> {code}
> * In UsersManager#computeUserLimit, the userLimit is applied as is
> (currentCapacity * userLimit):
> {code:java}
> Resource userLimitResource = Resources.max(resourceCalculator,
>     partitionResource,
>     Resources.divideAndCeil(resourceCalculator, resourceUsed,
>         usersSummedByWeight),
>     Resources.divideAndCeil(resourceCalculator,
>         Resources.multiplyAndRoundDown(currentCapacity, getUserLimit()),
>         100));
> {code}
> The fewer users occupy the queue, the more pronounced this effect becomes
> in preemption.
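As a worked example of the second formula (illustrative numbers only, not
taken from the report): consider a queue with an effective capacity of 100 GB,
MULP = 25%, and a single active user currently consuming 10 GB.
UsersManager#computeUserLimit yields max(10 GB / 1, 100 GB * 25 / 100) = 25 GB,
so the headroom is 25 GB - 10 GB = 15 GB, and preemption considers at most
15 GB of the user's pending resources. The effectiveUserLimit formula from the
first snippet would instead give max(0.25, 1 / 1) = 1.0, i.e. the full 100 GB.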