[
https://issues.apache.org/jira/browse/YARN-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227528#comment-16227528
]
Eric Payne commented on YARN-7424:
----------------------------------
After instrumenting intra-queue preemption in a large, multi-tenant queue with
a minimum user limit percent (MULP) of 1%, we discovered that enabling both
inter-queue and intra-queue preemption causes an order of magnitude more lost
work than enabling inter-queue preemption alone. Even after reducing
{{intra-queue-preemption.max-allowable-limit}} from 20% (default) to 3%, the
lost work is still several times more than with inter-queue preemption alone.
| | *MemSeconds Lost* |
| *Only inter-queue preemption enabled* | {{LostCrossQueueMemSec}} |
| *Both inter- and intra-queue preemption enabled with 20% max-allowable-limit* | {{12.7824 * LostCrossQueueMemSec}} |
| *Both inter- and intra-queue preemption enabled with 3% max-allowable-limit* | {{7.9893 * LostCrossQueueMemSec}} |

| | *VcoreSeconds Lost* |
| *Only inter-queue preemption enabled* | {{LostCrossQueueVSec}} |
| *Both inter- and intra-queue preemption enabled with 20% max-allowable-limit* | {{26.1885 * LostCrossQueueVSec}} |
| *Both inter- and intra-queue preemption enabled with 3% max-allowable-limit* | {{19.2676 * LostCrossQueueVSec}} |
It is expected that turning on intra-queue preemption would increase the number
of preemptions. However, an order of magnitude more seems excessive. Also,
reducing {{intra-queue-preemption.max-allowable-limit}} didn't have nearly the
effect I expected.
I think there is an underlying design philosophy that should be addressed.
The current intra-queue preemption design balances the user limit among all of
the users. This calculation is based on the total queue capacity and the number
of users in the queue. In a very large queue with a large number of active
users, the number of users in the queue is constantly changing. Also, if the
node overcommit feature is enabled, the total size of the queue will change as
well when the cluster becomes very busy. The result is that preemption must
constantly happen in order to balance all of the users.
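
To illustrate why the balancing target keeps moving, here is a minimal sketch
of the user limit under a deliberately simplified model: each active user is
entitled to an equal share of the queue, but never less than the MULP
guarantee. The function names and the formula are assumptions for illustration
only; the actual Capacity Scheduler computation takes more inputs (for example,
the user limit factor) and is not reproduced here.

```python
def user_limit(queue_capacity: float, active_users: int, mulp_percent: float) -> float:
    """Simplified per-user limit (illustrative model, not the scheduler's code).

    Each active user gets an equal share of the queue, floored at the
    MULP guarantee (mulp_percent of the queue capacity).
    """
    if active_users < 1:
        raise ValueError("active_users must be at least 1")
    equal_share = queue_capacity / active_users
    mulp_guarantee = queue_capacity * mulp_percent / 100.0
    return max(equal_share, mulp_guarantee)


def limits_over_time(queue_capacity, user_counts, mulp_percent):
    """Per-user limit for each observed number of active users."""
    return [user_limit(queue_capacity, n, mulp_percent) for n in user_counts]


if __name__ == "__main__":
    # Hypothetical queue of 1000 resource units with MULP = 1%.
    for n in (10, 20, 50, 100, 200):
        print(n, user_limit(1000.0, n, 1.0))
```

In this model, with MULP at 1% the equal share stays above the MULP guarantee
until 100 users are active, so every change in the number of active users (or
in the queue size) moves the target that preemption is trying to reach.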
For this reason, we need a configuration property that stops preempting on
behalf of a user once the user is above the MULP, which is a stable value. As a
variation, we may want to have a "live zone" of MULP plus some configurable
value.
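
A rough sketch of the rule proposed above, under stated assumptions: the
function name {{should_preempt_for}} and the {{margin_percent}} parameter are
hypothetical, not existing YARN properties or APIs, and usage is measured in
the same units as queue capacity.

```python
def should_preempt_for(user_usage: float,
                       queue_capacity: float,
                       mulp_percent: float,
                       margin_percent: float = 0.0) -> bool:
    """Return True if preemption may still act on behalf of this user.

    Hypothetical rule: preempt for a user only while that user's usage is
    below MULP plus an optional configurable margin, both expressed as a
    percentage of queue capacity. margin_percent is an assumed knob.
    """
    threshold = queue_capacity * (mulp_percent + margin_percent) / 100.0
    return user_usage < threshold


if __name__ == "__main__":
    # Hypothetical queue of 1000 units, MULP = 1% (threshold = 10 units).
    print(should_preempt_for(5.0, 1000.0, 1.0))    # below MULP
    print(should_preempt_for(10.0, 1000.0, 1.0))   # has reached MULP
    print(should_preempt_for(12.0, 1000.0, 1.0, margin_percent=0.5))
```

Because the threshold depends only on the configured MULP (plus the margin) and
not on the number of active users, it does not shift as users come and go.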
> Capacity Scheduler Intra-queue preemption: add property to only preempt up to
> configured MULP
> ---------------------------------------------------------------------------------------------
>
> Key: YARN-7424
> URL: https://issues.apache.org/jira/browse/YARN-7424
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler, scheduler preemption
> Affects Versions: 3.0.0-beta1, 2.8.2
> Reporter: Eric Payne
>
> If the queue's configured minimum user limit percent (MULP) is something
> small like 1%, all users will max out well over their MULP until 100 users
> have apps in the queue. Since the intra-queue preemption monitor tries to
> balance the resource among the users, most of the time in this use case it
> will be preempting containers on behalf of users that are already over their
> MULP guarantee.
> This JIRA proposes that a property should be provided so that a queue can be
> configured to only preempt on behalf of a user until that user has reached
> its MULP.