Kannan Rajah created YARN-2989:
----------------------------------

             Summary: Better Load Balancing in Fair Scheduler
                 Key: YARN-2989
                 URL: https://issues.apache.org/jira/browse/YARN-2989
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: fairscheduler
    Affects Versions: 2.5.0
            Reporter: Kannan Rajah


While porting the Fair Scheduler from MR1, we seem to have changed the logic
behind task distribution across nodes (MAPREDUCE-3451).

In MR1, a load factor was computed as runnableMaps/totalMapSlots and used to
determine how many tasks to give a node so that the overall cluster load stayed
evenly distributed. Multiple tasks could be assigned in one heartbeat. In YARN,
we have the option to assign multiple tasks to a node, but it is disabled by
default (YARN-302). Even when it is enabled, the number of tasks to assign is
statically configured, so it does not ensure that load is evenly distributed.
Why not bring back the load-factor-based check? Is there a reason it was
dropped? This is even more relevant with label-based scheduling.
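
As a rough illustration of the load-factor idea described above, here is a
minimal sketch. It is not the MR1 code: the class name, method names, and
parameters are hypothetical, and it assumes the per-node cap is simply the
node's slot count scaled by the cluster-wide load factor.

```java
// Hypothetical sketch; names are illustrative and not taken from MR1 or YARN.
final class LoadFactorSketch {
    private LoadFactorSketch() {}

    /** Fraction of cluster slots the runnable tasks would fill, capped at 1.0. */
    static double loadFactor(int runnableTasks, int totalSlots) {
        if (totalSlots <= 0) {
            return 0.0; // no capacity in the cluster, so nothing can be assigned
        }
        return Math.min(1.0, (double) runnableTasks / totalSlots);
    }

    /**
     * Number of additional tasks to hand to a node on one heartbeat so that
     * the node ends up at roughly the same load as the cluster as a whole.
     */
    static int tasksToAssign(int runnableTasks, int totalSlots,
                             int nodeSlots, int nodeRunning) {
        // Assumed rule: per-node cap = node slots scaled by the load factor.
        int cap = (int) Math.ceil(nodeSlots * loadFactor(runnableTasks, totalSlots));
        return Math.max(0, cap - nodeRunning);
    }
}
```

Under these assumptions, with 50 runnable tasks on a 100-slot cluster, a
10-slot node already running 3 tasks would be given 2 more in one heartbeat,
rather than a statically configured number.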

If there are no objections, I would like to implement it for both normal and
label-based scheduling scenarios.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
