[jira] [Commented] (YARN-7424) Capacity Scheduler Intra-queue preemption: add property to only preempt up to configured MULP

Eric Payne (JIRA) Tue, 31 Oct 2017 13:52:33 -0700

    [ https://issues.apache.org/jira/browse/YARN-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227528#comment-16227528 ]


Eric Payne commented on YARN-7424:
----------------------------------

In a large, multi-tenant queue with a MULP of 1%, instrumenting intra-queue 
preemption showed that enabling both inter-queue and intra-queue preemption 
causes an order of magnitude more lost work than enabling inter-queue 
preemption alone. Even after reducing 
{{intra-queue-preemption.max-allowable-limit}} from 20% (default) to 3%, the 
lost work is still several times higher than with inter-queue preemption alone.

| *Configuration* | *MemSeconds Lost* |
| *Only inter-queue preemption enabled* | {{LostCrossQueueMemSec}} |
| *Both inter- and intra-queue preemption enabled with 20% max-allowable-limit* | {{12.7824 * LostCrossQueueMemSec}} |
| *Both inter- and intra-queue preemption enabled with 3% max-allowable-limit* | {{7.9893 * LostCrossQueueMemSec}} |

| *Configuration* | *VcoreSeconds Lost* |
| *Only inter-queue preemption enabled* | {{LostCrossQueueVSec}} |
| *Both inter- and intra-queue preemption enabled with 20% max-allowable-limit* | {{26.1885 * LostCrossQueueVSec}} |
| *Both inter- and intra-queue preemption enabled with 3% max-allowable-limit* | {{19.2676 * LostCrossQueueVSec}} |

It is expected that turning on intra-queue preemption would increase the number 
of preemptions. However, an order of magnitude more seems excessive. Also, 
reducing {{intra-queue-preemption.max-allowable-limit}} didn't have nearly the 
effect I expected.

I think there is an underlying design philosophy that should be addressed.

The current intra-queue preemption design balances the user limit among all of 
the users. This calculation is based on the total queue capacity and the number 
of users in the queue. In a very large queue with a large number of active 
users, the number of users in the queue is constantly changing. Also, if the 
node overcommit feature is enabled, the total size of the queue will change as 
well when the cluster becomes very busy. The result is that preemption must 
constantly happen in order to balance all of the users.
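
To make the moving target concrete, here is a minimal sketch of a simplified 
per-user limit. It is not the Capacity Scheduler's actual computation: the 
function name {{user_limit}}, its parameters, and the "equal share with a MULP 
floor" model are assumptions made for illustration only.

```python
def user_limit(queue_capacity, active_users, mulp_percent):
    """Simplified per-user target (illustrative assumption, not YARN code).

    Each active user is entitled to an equal share of the queue, but never
    less than the MULP floor.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    equal_share = queue_capacity / active_users
    mulp_floor = queue_capacity * mulp_percent / 100.0
    return max(equal_share, mulp_floor)


if __name__ == "__main__":
    # The target every user is balanced toward shifts whenever the number
    # of active users (or the queue size) changes.
    for users in (10, 11, 50, 100, 200):
        print(users, user_limit(1000, users, 1))
```

Under this model with a 1% MULP, the equal share stays above the MULP floor 
until 100 users are active, so every arrival or departure below that count 
changes the target for all users, while the MULP floor itself only moves when 
the queue size does.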

For this reason, we need a configuration property that stops preempting on 
behalf of a user once the user is above the MULP, which is a stable value. As a 
variation, we may want to have a "live zone" of MULP plus some configurable 
value.
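
The proposed check could be sketched as follows. This is only an illustration 
of the idea: {{should_preempt_for}} and {{margin_percent}} are hypothetical 
names, not existing scheduler code or an existing property.

```python
def should_preempt_for(user_used, queue_capacity, mulp_percent,
                       margin_percent=0.0):
    """Return True if preemption may still run on behalf of this user.

    Hypothetical sketch: preempt for a user only while the user's usage is
    below the MULP guarantee plus an optional configurable margin.
    """
    threshold = queue_capacity * (mulp_percent + margin_percent) / 100.0
    return user_used < threshold


if __name__ == "__main__":
    # Queue of 1000 units with a 1% MULP: the guarantee is 10 units.
    print(should_preempt_for(5, 1000, 1))    # below the guarantee
    print(should_preempt_for(10, 1000, 1))   # guarantee reached
    print(should_preempt_for(12, 1000, 1, margin_percent=0.5))
```

Because the threshold depends only on the queue size and configured 
percentages, it does not move as users come and go.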


> Capacity Scheduler Intra-queue preemption: add property to only preempt up to 
> configured MULP
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-7424
>                 URL: https://issues.apache.org/jira/browse/YARN-7424
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler, scheduler preemption
>    Affects Versions: 3.0.0-beta1, 2.8.2
>            Reporter: Eric Payne
>
> If the queue's configured minimum user limit percent (MULP) is something 
> small like 1%, all users will max out well over their MULP until 100 users 
> have apps in the queue. Since the intra-queue preemption monitor tries to 
> balance the resource among the users, most of the time in this use case it 
> will be preempting containers on behalf of users that are already over their 
> MULP guarantee.
> This JIRA proposes that a property should be provided so that a queue can be 
> configured to only preempt on behalf of a user until that user has reached 
> its MULP.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
