Jason Lowe commented on YARN-2004:

For your first scenario, it can happen today without priority.  MR jobs ask for 
resources in waves -- first all the maps, then over time it ramps up reducers.  
Multiple jobs in the same queue from the same user can collide in different 
phases.  That's the whole point of the headroom calculation and reporting -- to 
allow AMs to realize this scenario is happening and react to it.  In this case 
what will happen is j1 will see its headroom is zero and start killing reducers 
to make room for the failed map task.  After killing the reducers there will be 
some free resources in the cluster (if they weren't stolen by another, 
underserved queue).  Then the question goes to who will get those resources.  
If we're using the default priority, j1 will get first crack at them due to 
FIFO priority.  If j2 or j3 were made higher priority, then j1 will see that its 
headroom is _still_ zero after killing some reducers and will probably kill 
some more to try to make room.  Rinse, repeat until j1 is out of reducers to 
shoot or gets the resources it needs to run the failed map.
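
To make the kill loop concrete, here is a rough sketch of it.  This is not the 
actual MR AM code: the class, the method and its parameters are all made up 
for illustration, and it assumes equal-sized containers and that no other 
queue takes the freed resources.

```java
// Hypothetical model of the headroom-driven kill loop; not real MR AM code.
final class HeadroomSketch {
    /**
     * @param runningReducers      reducers j1 currently has running
     * @param higherPriorityDemand containers wanted by apps that are ahead
     *                             of j1 in the queue (assumed input)
     * @return reducers j1 kills before its failed map gets a container,
     *         or -1 if j1 runs out of reducers first
     */
    static int killsNeeded(int runningReducers, int higherPriorityDemand) {
        int killed = 0;
        int pendingAhead = higherPriorityDemand;
        while (killed < runningReducers) {
            killed++; // headroom is zero, so j1 kills one reducer
            if (pendingAhead > 0) {
                pendingAhead--; // freed container goes to an app ahead of j1
            } else {
                return killed; // j1 is first in line, the map gets it
            }
        }
        return -1; // out of reducers to shoot
    }
}
```

With default FIFO priority nothing is ahead of j1, so a single kill is enough.  
If j2 or j3 is ahead, j1 keeps killing until their demand is met or it has no 
reducers left.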

For the second scenario, the 5th user will _still_ be the first one to get any 
spare resources in the queue because he has the highest priority app.  Note 
that the user limit calculation does not involve comparing a user's current 
limit with other users' usage.  It's just a computation of what's available in 
the queue and what you're allowed based on the configured user limit and user 
limit factor.  So what will happen is the 5th user will continue to consume any 
free resources in the queue until either the app is satiated or the 5th user 
hits the 25% cap.  If there are no free resources then the 5th user's app will 
starve (without preemption) just like the rest until resources show up.  Again, 
higher priority just means you're first in line to get resources when they are 
freed up, and it doesn't change anything else.
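
As a simplified sketch of that behavior (the names below are hypothetical, and 
the real CapacityScheduler computation has more inputs), the amount handed to 
the highest priority app depends only on what is free in the queue, what the 
app still wants, and how far its user is from the cap.  The cap is assumed to 
be already derived from the configured user limit and user limit factor.

```java
// Hypothetical sketch; not the actual CapacityScheduler user limit code.
final class UserLimitSketch {
    /**
     * @param freeInQueue      resources currently free in the queue
     * @param appPendingDemand what the highest priority app still asks for
     * @param userUsage        what that app's user already holds
     * @param userCap          the user's cap (assumed precomputed from the
     *                         configured user limit and user limit factor)
     * @return resources granted to the highest priority app right now
     */
    static int grant(int freeInQueue, int appPendingDemand,
                     int userUsage, int userCap) {
        // Other users' usage is deliberately not an input here.
        int roomUnderCap = Math.max(0, userCap - userUsage);
        return Math.min(freeInQueue, Math.min(appPendingDemand, roomUnderCap));
    }
}
```

For a queue of 100 units and a 25% cap, a call such as grant(40, 50, 0, 25) 
stops at the cap, while grant(0, 50, 0, 25) grants nothing because there are 
no free resources and no preemption.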

We can discuss adding preemption into the mix to force higher priority apps to 
get their requested resources faster in a full queue.  However, I think the 
first step is to get priority scheduling working for resources that are free in 
the queue in the non-preemption case, as that's still very useful in practice.

> Priority scheduling support in Capacity scheduler
> -------------------------------------------------
>                 Key: YARN-2004
>                 URL: https://issues.apache.org/jira/browse/YARN-2004
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-2004.patch
> Based on the priority of the application, Capacity Scheduler should be able 
> to give preference to application while doing scheduling.
> Comparator<FiCaSchedulerApp> applicationComparator can be changed as below.   
> 1.    Check for Application priority. If priority is available, then return 
> the highest priority job.
> 2.    Otherwise continue with existing logic such as App ID comparison and 
> then TimeStamp comparison.
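
The ordering described in the issue could be sketched roughly as below.  This 
is only an illustration, not the attached patch: App is a hypothetical stand-in 
for FiCaSchedulerApp, its fields are assumptions, and it assumes that a larger 
integer means a higher priority.

```java
import java.util.Comparator;

// Hypothetical sketch of a priority-first application ordering.
final class AppOrderingSketch {
    /** Stand-in for FiCaSchedulerApp; all fields are assumed. */
    static final class App {
        final int priority;   // assumption: larger value = higher priority
        final long appId;
        final long startTime;

        App(int priority, long appId, long startTime) {
            this.priority = priority;
            this.appId = appId;
            this.startTime = startTime;
        }
    }

    /** Highest priority first, then ascending app ID, then ascending time. */
    static final Comparator<App> APPLICATION_COMPARATOR =
        Comparator.comparingInt((App a) -> a.priority).reversed()
            .thenComparingLong(a -> a.appId)
            .thenComparingLong(a -> a.startTime);
}
```

When every app has the same default priority, the first comparison is a tie 
and the ordering falls back to the app ID and timestamp comparison.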
