Konstantinos Karanasos created YARN-6344:
--------------------------------------------

             Summary: Rethinking OFF_SWITCH locality in CapacityScheduler
                 Key: YARN-6344
                 URL: https://issues.apache.org/jira/browse/YARN-6344
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
            Reporter: Konstantinos Karanasos


When relaxing locality from node to rack, the {{node-locality-delay}} parameter 
is used: once the scheduling opportunities for a scheduler key exceed the value 
of this parameter, we relax locality and try to assign the container to a node 
in the corresponding rack.
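
As a minimal sketch of the node-to-rack rule described above (not the actual 
CapacityScheduler code), the check can be modeled as a simple threshold 
comparison. The function name and the argument {{node_locality_threshold}}, 
which stands in for the parameter mentioned above, are hypothetical and chosen 
only for illustration.

```python
def can_relax_to_rack(scheduling_opportunities: int,
                      node_locality_threshold: int) -> bool:
    """Return True once locality may be relaxed from node to rack.

    Assumption: relaxation happens only when the number of scheduling
    opportunities is strictly greater than the configured threshold,
    as the description states ("more than the value of this parameter").
    """
    return scheduling_opportunities > node_locality_threshold


if __name__ == "__main__":
    # With a threshold of 40, the 40th opportunity still waits for a
    # node-local match and the 41st allows a rack-local assignment.
    print(can_relax_to_rack(40, 40))  # False
    print(can_relax_to_rack(41, 40))  # True
```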

On the other hand, when relaxing locality to off-switch (i.e., assigning the 
container anywhere in the cluster), we use a {{localityWaitFactor}}, which is 
computed as the number of outstanding requests for a specific scheduler key 
divided by the size of the cluster. 
For applications that request containers in big batches (e.g., traditional MR 
jobs) on relatively small clusters, the {{localityWaitFactor}} has little 
effect on when locality is relaxed.
However, for applications that request containers in small batches, this factor 
takes a very small value, which leads to assigning off-switch containers too 
soon. The problem is even more pronounced in big clusters.
For example, if an application requests only one container per request, 
locality will be relaxed after a single missed scheduling opportunity.
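
The effect described above can be illustrated with a small model. This is a 
sketch under stated assumptions, not the scheduler's implementation: it assumes 
the wait factor is the outstanding request count divided by the cluster size 
(capped at 1), and that off-switch assignment is allowed once the missed 
opportunities exceed the outstanding request count scaled by that factor 
(capped at the cluster size). All function names are hypothetical.

```python
def locality_wait_factor(outstanding_requests: int, cluster_nodes: int) -> float:
    """Assumed model: outstanding requests / cluster size, capped at 1.0."""
    if cluster_nodes <= 0:
        raise ValueError("cluster_nodes must be positive")
    return min(outstanding_requests / cluster_nodes, 1.0)


def can_assign_off_switch(missed_opportunities: int,
                          outstanding_requests: int,
                          cluster_nodes: int) -> bool:
    """Assumed rule: relax once missed opportunities exceed the scaled delay."""
    factor = locality_wait_factor(outstanding_requests, cluster_nodes)
    # Assumption: the delay never exceeds the number of nodes in the cluster.
    threshold = min(cluster_nodes, outstanding_requests * factor)
    return missed_opportunities > threshold


def first_off_switch_opportunity(outstanding_requests: int,
                                 cluster_nodes: int) -> int:
    """Smallest number of missed opportunities at which off-switch is allowed."""
    missed = 0
    while not can_assign_off_switch(missed, outstanding_requests, cluster_nodes):
        missed += 1
    return missed


if __name__ == "__main__":
    # Big batch on a small cluster: the delay is capped at the cluster size.
    print(first_off_switch_opportunity(1000, 100))  # 101
    # One container per request: relaxed after a single missed opportunity.
    print(first_off_switch_opportunity(1, 100))     # 1
    # Small batch on a big cluster: also relaxed almost immediately.
    print(first_off_switch_opportunity(5, 4000))    # 1
```

Under these assumptions, any batch much smaller than the cluster yields a 
threshold below 1, so the very first missed opportunity triggers an off-switch 
assignment.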

The purpose of this JIRA is to rethink the way we are relaxing locality for 
off-switch assignments.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

