[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382847#comment-16382847
 ] 

Eric Payne commented on YARN-4781:
----------------------------------

A lot has happened since this JIRA was opened, but I think there is still value 
in pursuing the original intent. That is, intra-queue preemption should 
consider FairOrderingPolicy.
{quote}Currently, if a job in queue A is using 100% of the cluster resources, 
and a new job arrives in queue A, it sometimes cannot even get an application 
master!
{quote}
{quote}one big query is taking all resources of a queue, let's say Q1. And when 
I am launching another query in Q1, almost always it is hanging in ACCEPTED
{quote}
[~milesc] and [~anuptiwari], I think this use case is covered by YARN-2009 and 
related JIRAs. I think this JIRA covers a slightly different use case.

FairOrderingPolicy tries to evenly assign containers across users and across 
apps within a user (as long as the user is below the user limit). Currently, 
the FairOrderingPolicy does not honor application priority AFAICT.

We have seen the following use case in a large and extremely busy queue where 
FairOrderingPolicy is set: one user takes up a lot of the queue, and then 
later users fight for the remaining resources. The youngest users / apps get 
constantly preempted, while the larger, older user is not preempted.

For example,
 QueueA: minimum-user-limit-percent = 25
 QueueA: resources = 1000
| |Used|Pending|Preempted|
|User1 / App1|400|0|0|
|User2 / App2|300|0|0|
|User3 / App3|300|0|0|
|User4 / App4|0|100|0|
 - Intra-queue preemption preempts 50 from App2 and 50 from App3.

| |Used|Pending|Preempted|
|User1 / App1|400|0|0|
|User2 / App2|250|0|50|
|User3 / App3|250|0|50|
|User4 / App4|100|0|0|
 - App4 finishes and resources are given back to App2 and App3.

| |Used|Pending|Preempted|
|User1 / App1|400|0|0|
|User2 / App2|300|0|50|
|User3 / App3|300|0|50|
 - Then, User4 submits App5, and the process repeats.

| |Used|Pending|Preempted|
|User1 / App1|400|0|0|
|User2 / App2|250|0|100|
|User3 / App3|250|0|100|
|User4 / App5|100|0|0|
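
The arithmetic in the tables above can be replayed with a small sketch. This is only an illustration, not YARN code: `preempt_evenly` and `simulate` are hypothetical names, and the victim list is hard-coded to App2 and App3 to match the behavior described here rather than derived from the scheduler's actual selection logic.

```python
def preempt_evenly(used, victims, demand):
    """Take `demand` units from `victims` in equal parts (hypothetical helper)."""
    share, remainder = divmod(demand, len(victims))
    taken = {}
    for i, app in enumerate(victims):
        amount = share + (1 if i < remainder else 0)
        used[app] -= amount
        taken[app] = amount
    return taken


def simulate():
    """Replay the QueueA example: App4 arrives and finishes, then App5 arrives."""
    used = {"App1": 400, "App2": 300, "App3": 300}
    preempted = {app: 0 for app in used}
    # Assumption: victims are fixed to the apps that were observed to be
    # preempted in the example; App1 is never touched.
    victims = ["App2", "App3"]

    for new_app in ("App4", "App5"):
        # The new app asks for 100, which is preempted from the victims.
        for app, amount in preempt_evenly(used, victims, 100).items():
            preempted[app] += amount
        used[new_app] = 100
        if new_app == "App4":
            # App4 finishes; its resources go back to the victims.
            freed = used.pop("App4")
            for app in victims:
                used[app] += freed // len(victims)
    return used, preempted


if __name__ == "__main__":
    final_used, total_preempted = simulate()
    print(final_used)
    print(total_preempted)
```

The final state matches the last table: App2 and App3 have each lost 100 in total across the two rounds, while App1 has lost nothing.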

Then, while all 4 users have running apps, User5 comes along and can't get any 
resources. They see that User1 is using 60% more resources than User2 or 
User3, and wonder why they can't get any. (Yes, I recognize the reason in this 
case is that minimum-user-limit-percent (MULP) = 25%, but I'm trying to keep 
the use case simple.)
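
To make the imbalance concrete, here is a rough sketch that compares each user's usage in the last table against a naive even share of the queue. The "even share" (queue resources divided by the number of active users) is an assumption made for illustration; it ignores user limits, priorities, and everything else the real scheduler considers, and `excess_over_even_share` is a hypothetical helper, not a YARN API.

```python
def excess_over_even_share(used):
    """Return how far each user is above (+) or below (-) an even split."""
    even_share = sum(used.values()) / len(used)
    return {user: amount - even_share for user, amount in used.items()}


# Usage per user taken from the last table in the example.
used = {"User1": 400, "User2": 250, "User3": 250, "User4": 100}
excess = excess_over_even_share(used)

# Users above the even share, largest excess first. Under this naive view
# they would be the first candidates for preemption.
candidates = sorted(
    (user for user in excess if excess[user] > 0),
    key=excess.get,
    reverse=True,
)

if __name__ == "__main__":
    print(excess)
    print(candidates)
```

Under this simplified view only User1 sits above the even share, which is the opposite of what the example shows being preempted.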

This is somewhat simplified because in our case, we have up to 50 active users, 
and since the queue is large, the difference between the largest user and the 
others is even more apparent.

> Support intra-queue preemption for fairness ordering policy.
> ------------------------------------------------------------
>
>                 Key: YARN-4781
>                 URL: https://issues.apache.org/jira/browse/YARN-4781
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Major
>
> We introduced the fairness queue policy in YARN-3319, which lets large 
> applications make progress without starving small applications. However, if a 
> large application takes the queue’s resources, and the containers of the 
> large app have a long lifespan, small applications could still wait a long 
> time for resources, and SLAs cannot be guaranteed.
> Instead of waiting for applications to release resources on their own, we 
> need to preempt resources of queues with the fairness policy enabled.



