Allen Wittenauer updated YARN-2151:
    Labels: BB2015-05-TBR  (was: )

> FairScheduler option for global preemption within hierarchical queues
> ---------------------------------------------------------------------
>                 Key: YARN-2151
>                 URL: https://issues.apache.org/jira/browse/YARN-2151
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Andrey Stepachev
>              Labels: BB2015-05-TBR
>         Attachments: YARN-2151.patch
> FairScheduler has hierarchical queues, but fair share calculation and
> preemption still work only within a limited range and are effectively
> nonhierarchical.
> This patch addresses this gap in two respects:
> 1. Currently MinShare is not propagated to the parent queue, so the
> fair share calculation ignores all MinShares in deeper queues.
> Let's take an example
> (implemented as test case TestFairScheduler#testMinShareInHierarchicalQueues):
> {code}
> <?xml version="1.0"?>
> <allocations>
> <queue name="queue1">
>   <maxResources>10240mb, 10vcores</maxResources>
>   <queue name="big"/>
>   <queue name="sub1">
>     <schedulingPolicy>fair</schedulingPolicy>
>     <queue name="sub11">
>       <minResources>6192mb, 6vcores</minResources>
>     </queue>
>   </queue>
>   <queue name="sub2">
>   </queue>
> </queue>
> </allocations>
> {code}
> Then bigApp is started in queue1.big with 10x1GB containers.
> That effectively consumes all the maximum resources allowed for queue1.
> Subsequent requests for app1 (queue1.sub1.sub11) and
> app2 (queue1.sub2) (5x1GB each) will wait for free resources.
> Note that sub11 has a min share requirement of 6x1GB.
> Without this patch, fair share is calculated with no knowledge
> of the min share requirements, and app1 and app2 get an equal
> number of containers.
> With the patch, resources are split according to min share (in the test
> it is 5 for app1 and 1 for app2).
> This behaviour is controlled by the same parameter as global preemption
> (‘globalPreemption’), but that can be changed easily.
> The implementation is a bit awkward, but it seems that the method for min
> share recalculation could be exposed as a public or protected API, and the
> constructor in FSQueue could call it before using the minShare getter.
> For now, though, the current implementation with nulls should work too.
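
The min share propagation described in item 1 can be illustrated with a minimal Java sketch. This is not the patch's code: QueueNode, aggregatedMinShareMb, and the rule "take the larger of a queue's own min share and the sum over its children" are assumptions made for illustration, and only memory is modelled.

```java
import java.util.ArrayList;
import java.util.List;

public class MinShareSketch {
    /** Hypothetical stand-in for a scheduler queue; not the real FSQueue. */
    static final class QueueNode {
        final String name;
        final long ownMinShareMb; // minResources set on this queue, 0 if unset
        final List<QueueNode> children = new ArrayList<>();

        QueueNode(String name, long ownMinShareMb) {
            this.name = name;
            this.ownMinShareMb = ownMinShareMb;
        }

        QueueNode add(QueueNode child) {
            children.add(child);
            return this;
        }

        /** Assumed rule: max of the queue's own value and the sum over children. */
        long aggregatedMinShareMb() {
            long fromChildren = 0;
            for (QueueNode child : children) {
                fromChildren += child.aggregatedMinShareMb();
            }
            return Math.max(ownMinShareMb, fromChildren);
        }
    }

    public static void main(String[] args) {
        // Mirrors the allocation file in the example (memory only).
        QueueNode sub11 = new QueueNode("sub11", 6192);
        QueueNode sub1 = new QueueNode("sub1", 0).add(sub11);
        QueueNode sub2 = new QueueNode("sub2", 0);
        QueueNode big = new QueueNode("big", 0);
        QueueNode queue1 = new QueueNode("queue1", 0).add(big).add(sub1).add(sub2);

        for (QueueNode q : List.of(queue1, big, sub1, sub2)) {
            System.out.println(q.name + " -> " + q.aggregatedMinShareMb() + "mb");
        }
    }
}
```

With the allocation file from the example, sub1 and queue1 both report 6192mb while big and sub2 report 0mb, so a fair share calculation done at the level of queue1's children can see sub11's requirement.
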
> 2. Preemption doesn’t work between queues on different levels of the
> queue hierarchy. Moreover, it is not possible to override various
> parameters for child queues.
> This patch adds the parameter ‘globalPreemption’, which enables the
> global modifications to the preemption algorithm.
> In a nutshell, the patch adds the function shouldAttemptPreemption(queue),
> which calculates usage for nested queues; if a queue with usage above
> the specified threshold is found, preemption can be triggered.
> The aggregated minShare does the rest of the work, and preemption works
> as expected within a hierarchy of queues with different MinShare/MaxShare
> specifications on different levels.
> Test case TestFairScheduler#testGlobalPreemption shows how it works.
> One big app gets resources above its fair share, and app1 has a declared
> min share. On submission, the code detects that starvation and preempts
> enough containers to make room for app1.
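
The recursive check described in item 2 might look roughly like the sketch below. Only the name shouldAttemptPreemption comes from the description; the QueueNode type, its fields, the extra threshold argument, and the choice of a per-queue capacity as the usage denominator are all assumptions, and the real patch may compute usage differently.

```java
import java.util.ArrayList;
import java.util.List;

public class PreemptionCheckSketch {
    /** Hypothetical queue model; the real FSQueue API differs. */
    static final class QueueNode {
        final String name;
        final long capacityMb; // assumed limit used as denominator; 0 = none set
        final long ownUsedMb;  // memory held by apps running directly in this queue
        final List<QueueNode> children = new ArrayList<>();

        QueueNode(String name, long capacityMb, long ownUsedMb) {
            this.name = name;
            this.capacityMb = capacityMb;
            this.ownUsedMb = ownUsedMb;
        }

        QueueNode add(QueueNode child) {
            children.add(child);
            return this;
        }

        /** Usage of this queue plus all nested queues. */
        long usedMb() {
            long total = ownUsedMb;
            for (QueueNode child : children) {
                total += child.usedMb();
            }
            return total;
        }
    }

    /** True if this queue or any nested queue is used above the threshold. */
    static boolean shouldAttemptPreemption(QueueNode queue, double threshold) {
        if (queue.capacityMb > 0
                && (double) queue.usedMb() / queue.capacityMb > threshold) {
            return true;
        }
        for (QueueNode child : queue.children) {
            if (shouldAttemptPreemption(child, threshold)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // bigApp holds 10x1GB in queue1.big; queue1 is capped at 10240mb.
        QueueNode queue1 = new QueueNode("queue1", 10240, 0)
                .add(new QueueNode("big", 0, 10240))
                .add(new QueueNode("sub1", 0, 0))
                .add(new QueueNode("sub2", 0, 0));
        // 0.8 is an illustrative threshold, not a value taken from the patch.
        System.out.println(shouldAttemptPreemption(queue1, 0.8));
    }
}
```

In this model queue1 is fully used once bigApp holds its ten containers, so the check returns true and preemption would be attempted; with only 4x1GB in use it would return false.
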

This message was sent by Atlassian JIRA
