[ 
https://issues.apache.org/jira/browse/YARN-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029768#comment-14029768
 ] 

Andrey Stepachev commented on YARN-2151:
----------------------------------------

Actually there is not much code for the preemption itself; most of it deals with Min
Share.
So this patch can be applied (after review, of course) and should not contradict or
interfere
with future changes in the container preemption logic.

> FairScheduler option for global preemption within hierarchical queues
> ---------------------------------------------------------------------
>
>                 Key: YARN-2151
>                 URL: https://issues.apache.org/jira/browse/YARN-2151
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Andrey Stepachev
>         Attachments: YARN-2151.patch
>
>
> FairScheduler has hierarchical queues, but fair share calculation and 
> preemption still work within a limited range and are effectively still 
> non-hierarchical.
> This patch addresses that incompleteness in two aspects:
> 1. Currently, MinShare is not propagated to the parent queues, which means
> the fair share calculation ignores all Min Shares declared in deeper queues. 
> Let's take an example
> (implemented as the test case TestFairScheduler#testMinShareInHierarchicalQueues):
> {code}
> <?xml version="1.0"?>
> <allocations>
> <queue name="queue1">
>   <maxResources>10240mb, 10vcores</maxResources>
>   <queue name="big"/>
>   <queue name="sub1">
>     <schedulingPolicy>fair</schedulingPolicy>
>     <queue name="sub11">
>       <minResources>6192mb, 6vcores</minResources>
>     </queue>
>   </queue>
>   <queue name="sub2">
>   </queue>
> </queue>
> </allocations>
> {code}
> Then bigApp is started in queue1.big with 10x1GB containers.
> That effectively consumes all of the maximum allowed resources for queue1.
> Subsequent requests from app1 (queue1.sub1.sub11) and 
> app2 (queue1.sub2) (5x1GB each) will wait for free resources. 
> Note that sub11 has a min share requirement of 6x1GB.
> Without this patch, the fair share is calculated with no knowledge 
> of the min share requirements, and app1 and app2 get an equal 
> number of containers.
> With the patch, resources are split according to min share (in the test
> it is 5 containers for app1 and 1 for app2).
> That behaviour is controlled by the same parameter as ‘globalPreemption’,
> but that can be changed easily.
> The implementation is a bit awkward, but it seems the min share
> recalculation method could be exposed as a public or protected API, and the
> FSQueue constructor could call it before the minShare getter is used
> (a rough sketch of the idea follows below). For now, though, the current
> implementation with nulls should work too.
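> A minimal, self-contained sketch of that idea (not the patch itself; the
> QueueSketch class and its fields are hypothetical stand-ins for FSQueue):
> {code}
> import java.util.ArrayList;
> import java.util.List;
>
> class QueueSketch {
>   final String name;
>   final long declaredMinShareMb;  // from the allocations file, 0 if absent
>   final List<QueueSketch> children = new ArrayList<>();
>
>   QueueSketch(String name, long declaredMinShareMb) {
>     this.name = name;
>     this.declaredMinShareMb = declaredMinShareMb;
>   }
>
>   // Bottom-up recalculation: a parent's effective min share is at least
>   // the sum of its children's effective min shares. This is the method
>   // that could be called from the FSQueue constructor before the
>   // minShare getter is used.
>   long recomputeMinShareMb() {
>     long childSumMb = 0;
>     for (QueueSketch child : children) {
>       childSumMb += child.recomputeMinShareMb();
>     }
>     return Math.max(declaredMinShareMb, childSumMb);
>   }
> }
> {code}
> For the allocations file above, queue1.sub1 would aggregate the 6192mb
> declared in sub11, so queue1's fair share calculation sees that demand.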
> 2. Preemption doesn't work between queues at different levels of the
> queue hierarchy. Moreover, it is not possible to override various 
> parameters for child queues. 
> This patch adds a parameter ‘globalPreemption’, which enables the global 
> preemption algorithm modifications.
> In a nutshell, the patch adds a function shouldAttemptPreemption(queue),
> which can calculate usage for nested queues; if a queue whose usage exceeds 
> the specified threshold is found, preemption can be triggered (see the
> sketch below).
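> A hedged sketch of that check (self-contained; UsageNode and the threshold
> parameter are hypothetical stand-ins, not the patch's actual types):
> {code}
> import java.util.ArrayList;
> import java.util.List;
>
> class UsageNode {
>   long usedMb;       // resources currently consumed by the queue
>   long maxShareMb;   // max share from the allocations file
>   final List<UsageNode> children = new ArrayList<>();
> }
>
> class PreemptionCheckSketch {
>   // Returns true if this queue, or any queue nested under it, uses more
>   // than the given fraction (e.g. 0.8) of its max share.
>   static boolean shouldAttemptPreemption(UsageNode queue, double threshold) {
>     if (queue.usedMb > threshold * queue.maxShareMb) {
>       return true;
>     }
>     for (UsageNode child : queue.children) {
>       if (shouldAttemptPreemption(child, threshold)) {
>         return true;
>       }
>     }
>     return false;
>   }
> }
> {code}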
> The aggregated minShare does the rest of the work, and preemption works
> as expected within a hierarchy of queues with different MinShare/MaxShare
> specifications at different levels.
> The test case TestFairScheduler#testGlobalPreemption shows how it works.
> One big app gets resources above its fair share, while app1 has a declared
> min share. On submission, the code detects that starvation and preempts
> enough containers to make room for app1.
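> The starvation test in that scenario boils down to a simple deficit
> calculation; a hypothetical sketch (illustrative numbers, not the actual
> test values):
> {code}
> class StarvationSketch {
>   // A queue is starved when its usage is below its (aggregated) min
>   // share; the deficit is how much must be reclaimed elsewhere.
>   static long deficitMb(long minShareMb, long usedMb) {
>     return Math.max(0, minShareMb - usedMb);
>   }
>
>   public static void main(String[] args) {
>     // app1's queue declares a min share but runs nothing yet, so the
>     // whole min share is a deficit and that many MB worth of bigApp's
>     // containers get preempted.
>     System.out.println("Preempt at least " + deficitMb(5120, 0) + " MB");
>   }
> }
> {code}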



--
This message was sent by Atlassian JIRA
(v6.2#6252)
