[ https://issues.apache.org/jira/browse/YARN-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated YARN-2151:
-----------------------------------
    Labels: BB2015-05-TBR  (was: )

> FairScheduler option for global preemption within hierarchical queues
> ---------------------------------------------------------------------
>
>                 Key: YARN-2151
>                 URL: https://issues.apache.org/jira/browse/YARN-2151
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Andrey Stepachev
>              Labels: BB2015-05-TBR
>         Attachments: YARN-2151.patch
>
>
> FairScheduler has hierarchical queues, but fair share calculation and
> preemption still work within a limited range and are effectively still
> nonhierarchical.
> This patch addresses this incompleteness in two respects:
> 1. Currently MinShare is not propagated to the parent queue, so the
> fair share calculation ignores all MinShares in deeper queues.
> Let's take an example
> (implemented as test case TestFairScheduler#testMinShareInHierarchicalQueues):
> {code}
> <?xml version="1.0"?>
> <allocations>
>   <queue name="queue1">
>     <maxResources>10240mb, 10vcores</maxResources>
>     <queue name="big"/>
>     <queue name="sub1">
>       <schedulingPolicy>fair</schedulingPolicy>
>       <queue name="sub11">
>         <minResources>6192mb, 6vcores</minResources>
>       </queue>
>     </queue>
>     <queue name="sub2">
>     </queue>
>   </queue>
> </allocations>
> {code}
> Then bigApp is started within queue1.big with 10x1GB containers.
> That effectively consumes the maximum resources allowed for queue1.
> Subsequent requests for app1 (queue1.sub1.sub11) and
> app2 (queue1.sub2) (5x1GB each) will wait for free resources.
> Note that sub11 has a min share requirement of 6x1GB.
> Without this patch, the fair share is calculated with no knowledge
> of min share requirements, and app1 and app2 get an equal
> number of containers.
> With the patch, resources are split according to min share (in the test
> it is 5 for app1 and 1 for app2).
> This behaviour is controlled by the same parameter as global preemption
> (‘globalPreemption’), but that can be changed easily.
> The implementation is a bit awkward, but it seems that the method for min
> share recalculation could be exposed as a public or protected API, and the
> constructor in FSQueue could call it before using the minShare getter. For
> now, the current implementation with nulls should work too.
> 2. Preemption doesn’t work between queues on different levels of the
> queue hierarchy. Moreover, it is not possible to override various
> parameters for child queues.
> This patch adds the parameter ‘globalPreemption’, which enables the global
> preemption algorithm modifications.
> In a nutshell, the patch adds the function shouldAttemptPreemption(queue),
> which can calculate usage for nested queues; if a queue with usage above
> the specified threshold is found, preemption can be triggered.
> Aggregated minShare does the rest of the work, and preemption works
> as expected within a hierarchy of queues with different MinShare/MaxShare
> specifications on different levels.
> Test case TestFairScheduler#testGlobalPreemption shows how it works.
> One big app gets resources above its fair share and app1 has a declared
> min share. On submission, the code detects that starvation and preempts
> enough containers to make room for app1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
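
The two ideas in the description above (min share aggregated up the queue tree, and a recursive usage check over nested queues) can be illustrated with a minimal, self-contained sketch. It is not taken from YARN-2151.patch. The classes SketchQueue and GlobalPreemptionSketch, the aggregation rule (the larger of a queue's own min share and the sum of its children's), the comparison of usage against the queue's own maxResources, and the 0.8 threshold are all assumptions made for illustration. Only the name shouldAttemptPreemption and the queue layout come from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scheduler queue; NOT the real FSQueue class.
class SketchQueue {
    final String name;
    final long declaredMinMb; // memory part of <minResources>, 0 if unset
    final long maxMb;         // memory part of <maxResources>, 0 if unset
    long ownUsedMb;           // memory used by apps running directly in this queue
    final List<SketchQueue> children = new ArrayList<>();

    SketchQueue(String name, long declaredMinMb, long maxMb) {
        this.name = name;
        this.declaredMinMb = declaredMinMb;
        this.maxMb = maxMb;
    }

    SketchQueue add(SketchQueue child) {
        children.add(child);
        return this;
    }

    // Assumed rule: a queue's effective min share is the larger of its own
    // declared value and the sum of its children's effective min shares.
    long aggregatedMinMb() {
        long fromChildren = 0;
        for (SketchQueue child : children) {
            fromChildren += child.aggregatedMinMb();
        }
        return Math.max(declaredMinMb, fromChildren);
    }

    // Usage of a queue includes the usage of every nested queue.
    long usedMb() {
        long total = ownUsedMb;
        for (SketchQueue child : children) {
            total += child.usedMb();
        }
        return total;
    }
}

class GlobalPreemptionSketch {
    // Returns true if this queue or any nested queue uses more than
    // `threshold` of its own cap (assumed semantics of the threshold).
    static boolean shouldAttemptPreemption(SketchQueue queue, double threshold) {
        if (queue.maxMb > 0 && (double) queue.usedMb() / queue.maxMb > threshold) {
            return true;
        }
        for (SketchQueue child : queue.children) {
            if (shouldAttemptPreemption(child, threshold)) {
                return true;
            }
        }
        return false;
    }

    // Mirrors the allocation file from the description (memory only),
    // with bigApp already holding 10x1GB containers in queue1.big.
    static SketchQueue buildExample() {
        SketchQueue big = new SketchQueue("big", 0, 0);
        big.ownUsedMb = 10 * 1024;
        SketchQueue sub11 = new SketchQueue("sub11", 6192, 0);
        SketchQueue sub1 = new SketchQueue("sub1", 0, 0).add(sub11);
        SketchQueue sub2 = new SketchQueue("sub2", 0, 0);
        return new SketchQueue("queue1", 0, 10240).add(big).add(sub1).add(sub2);
    }

    public static void main(String[] args) {
        SketchQueue queue1 = buildExample();
        System.out.println("aggregated min share of queue1 (MB): " + queue1.aggregatedMinMb());
        System.out.println("should attempt preemption: " + shouldAttemptPreemption(queue1, 0.8));
    }
}
```

In this sketch sub1 declares no min share of its own, yet it reports the 6192 MB configured on sub11, which is the propagation the description says is missing today. How the real patch combines declared and inherited values, and what it measures usage against, may differ.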