Ashwin Shankar created YARN-2214:
Summary: preemptContainerPreCheck() in FSParentQueue delays
convergence towards fairness
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ashwin Shankar
preemptContainerPreCheck() in FSParentQueue rejects preemption requests if the
parent queue is below its fair share. This can delay convergence towards
fairness when the starved leaf queue and the queue above its fair share both
sit under a non-root parent queue (i.e. their least common ancestor is a
parent queue other than root).
Here is an example:
root.parent has fair share = 80%, usage = 80%
root.parent.child1 has fair share = 40%, usage = 80%
root.parent.child2 has fair share = 40%, usage = 0%
Now a job is submitted to child2 with a demand of 40%.
Preemption kicks in and tries to reclaim the full 40% from child1.
When it preempts the first container from child1, the usage of root.parent
drops below 80%, which is less than root.parent's fair share, so preemption
stops. Only one container gets preempted in this round, although the need is
much larger. child2 would eventually get to half its fair share, but only
after multiple rounds of preemption.
The solution is to remove preemptContainerPreCheck() from FSParentQueue and
keep it only in FSLeafQueue (where it already exists).
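The example and the proposed fix can be sketched as a toy simulation. This is
not the FairScheduler code: the class PreemptionRounds, the method
roundsToReclaim and the container size of 1% are hypothetical, the parent
check is modeled as "usage >= fair share" so that the first container in each
round is preempted as described above, and freed capacity is assumed to be
handed to child2 before the next round starts.

```java
class PreemptionRounds {
    // All quantities are percentages of cluster capacity (toy model).
    static final int PARENT_FAIR_SHARE = 80;
    static final int CHILD1_FAIR_SHARE = 40;

    /**
     * Counts the preemption rounds needed to move toReclaim percent from
     * child1 to child2, preempting containerSize percent per container.
     * parentPreCheck toggles the (modeled) check on root.parent.
     */
    static int roundsToReclaim(int toReclaim, int containerSize,
                               boolean parentPreCheck) {
        int child1Usage = 80;
        int child2Usage = 0;
        int rounds = 0;
        while (toReclaim > 0) {
            int freed = 0;
            // One round: keep preempting from child1 while the checks pass.
            while (freed < toReclaim) {
                int parentUsage = child1Usage + child2Usage;
                // Leaf check: child1 must still be above its fair share.
                boolean leafOk = child1Usage > CHILD1_FAIR_SHARE;
                // Assumed form of the parent check; the real comparison
                // in Hadoop may differ.
                boolean parentOk = !parentPreCheck
                        || parentUsage >= PARENT_FAIR_SHARE;
                if (!leafOk || !parentOk) {
                    break;
                }
                child1Usage -= containerSize;
                freed += containerSize;
            }
            if (freed == 0) {
                break; // no progress possible
            }
            // Assumption: freed capacity goes to child2 before the next round.
            child2Usage += freed;
            toReclaim -= freed;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        System.out.println("with parent pre-check:    "
                + roundsToReclaim(40, 1, true));
        System.out.println("without parent pre-check: "
                + roundsToReclaim(40, 1, false));
    }
}
```

Under these assumptions the parent check allows one container per round, so
the number of rounds grows with the number of containers to reclaim, while
the leaf-only check lets a single round reclaim everything child1 holds above
its fair share.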