[
https://issues.apache.org/jira/browse/YARN-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prabhu Joseph resolved YARN-4730.
---------------------------------
Resolution: Duplicate
YARN-2026
> YARN preemption based on instantaneous fair share
> -------------------------------------------------
>
> Key: YARN-4730
> URL: https://issues.apache.org/jira/browse/YARN-4730
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Reporter: Prabhu Joseph
>
> On a large cluster with a total cluster resource of <10 TB, 3000 cores>, the
> Fair Scheduler has 230 queues and about 60,000 jobs run per day. [All 230
> queues are critical, so minResources is the same for all of them.] In this
> case, when a Spark job runs on queue A, occupies the entire cluster and does
> not release any resources, and another job is then submitted to queue B,
> preemption reclaims only the fair share, which is <10 TB, 3000 cores> / 230
> = <45 GB, 13 cores>. That is a very small fair share for a queue shared by
> many applications.
> Preemption should instead reclaim the instantaneous fair share, that is,
> <10 TB, 3000 cores> / 2 (active queues) = <5 TB, 1500 cores>, so that the
> first job does not hog the entire cluster and subsequent jobs run fine.
> This issue arises only when the number of queues is very high. With fewer
> queues, preempting up to the fair share would suffice, because the fair
> share is large. With a very large number of queues, preemption should try
> to reclaim the instantaneous fair share.
> Note: Configuring an optimal maxResources for 230 queues is difficult, and
> constraining the queues with maxResources will leave cluster resources idle
> most of the time.
> There are thousands of Spark jobs, so asking each user to restrict the
> number of executors is also difficult.
> Preempting up to the instantaneous fair share would help overcome the above
> issues.
>
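The arithmetic in the description can be checked with a small sketch. It is not Fair Scheduler code: the `Resource` class and both function names are hypothetical, all queues are assumed to have equal weight, and 1 TB is taken as 1024 GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical <memory, vcores> pair; not a YARN class."""
    memory_gb: float
    vcores: float

    def split(self, n: int) -> "Resource":
        """Divide the resource evenly among n equally weighted queues."""
        if n <= 0:
            raise ValueError("need at least one queue")
        return Resource(self.memory_gb / n, self.vcores / n)


def share_over_all_queues(cluster: Resource, total_queues: int) -> Resource:
    # Fair share computed over every configured queue, active or not.
    return cluster.split(total_queues)


def share_over_active_queues(cluster: Resource, active_queues: int) -> Resource:
    # Instantaneous fair share: only queues with running or pending apps count.
    return cluster.split(active_queues)


# Figures from the issue: <10 TB, 3000 cores>, 230 queues, 2 of them active.
cluster = Resource(memory_gb=10 * 1024, vcores=3000)
steady = share_over_all_queues(cluster, 230)
instant = share_over_active_queues(cluster, 2)

print(f"all queues:    {steady.memory_gb:.1f} GB, {steady.vcores:.1f} cores")
print(f"active queues: {instant.memory_gb:.1f} GB, {instant.vcores:.1f} cores")
```

Under these assumptions the per-queue share over all 230 queues rounds to the <45 GB, 13 cores> quoted in the description, while the share over the two active queues is half the cluster.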
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)