Chang Li created YARN-4059:
------------------------------

             Summary: Preemption should delay assignments back to the preempted queue
                 Key: YARN-4059
                 URL: https://issues.apache.org/jira/browse/YARN-4059
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Chang Li
            Assignee: Chang Li




When containers are preempted from a queue, it can take a while for the other 
queues to fully consume the freed resources, because of delays such as waiting 
for better locality. Those delays can cause the resources to be assigned back 
to the preempted queue, and then the preemption cycle repeats.

We should consider adding a delay, either based on node heartbeat counts or 
time, to avoid granting containers to a queue that was recently preempted. The 
delay should be sufficient to cover the cycles of the preemption monitor, so we 
won't try to assign containers in-between preemption events for a queue.
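
As a rough illustration of the time-based variant, the sketch below tracks when 
each queue was last preempted and refuses assignments until a penalty period has 
elapsed. It is not scheduler code: the class name PreemptionPenaltyTracker, its 
methods, and the injectable clock are all hypothetical names invented for this 
example.

```python
import time


class PreemptionPenaltyTracker:
    """Hypothetical helper: remembers when each queue was last preempted."""

    def __init__(self, penalty_seconds, clock=time.monotonic):
        self.penalty_seconds = penalty_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_preempted = {}    # queue name -> timestamp of last preemption

    def record_preemption(self, queue):
        """Call whenever containers are preempted from the given queue."""
        self._last_preempted[queue] = self._clock()

    def can_assign(self, queue):
        """True if the queue was never preempted or its penalty has expired."""
        last = self._last_preempted.get(queue)
        if last is None:
            return True
        return self._clock() - last >= self.penalty_seconds
```

A heartbeat-count variant would look the same, with the clock replaced by a 
counter that is incremented on each node heartbeat.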

The worst-case scenario for assigning freed resources to other queues is when 
all the other queues want no locality. No locality means only one container is 
assigned per heartbeat, so we need to wait for the entire cluster to heartbeat 
in as many times as the number of containers that could run on a single node.

So the "penalty time" for a queue should be the maximum of the preemption 
monitor cycle time and the time it takes to allocate the whole cluster at one 
container per heartbeat. My guess is that this will be somewhere around 2 
minutes.
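
The arithmetic behind the penalty time can be sketched as follows. This assumes 
every node heartbeats once per interval and receives exactly one container per 
heartbeat, as in the worst case described above. The function names and the 
input values are made up for illustration and are not measured or taken from 
any configuration.

```python
def worst_case_allocation_seconds(heartbeat_interval_s, containers_per_node):
    """Time to fill every node when each heartbeat assigns one container.

    All nodes heartbeat within one interval, so filling a node that can hold
    `containers_per_node` containers takes that many intervals.
    """
    return heartbeat_interval_s * containers_per_node


def penalty_seconds(monitor_cycle_s, heartbeat_interval_s, containers_per_node):
    """Penalty time: the larger of the monitor cycle and the fill time."""
    fill_time = worst_case_allocation_seconds(
        heartbeat_interval_s, containers_per_node
    )
    return max(monitor_cycle_s, fill_time)


# Illustrative inputs only (assumed, not measured):
# 3 s monitor cycle, 1 s heartbeat interval, 40 containers per node.
example = penalty_seconds(3.0, 1.0, 40)
```

With these assumed inputs the fill time (40 s) dominates the monitor cycle, so 
the penalty comes out to 40 s; larger nodes or slower heartbeats push it higher.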




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)