[
https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383513#comment-14383513
]
zhihai xu commented on YARN-3405:
---------------------------------
It looks like the code will still check queue-1-1 (leaf queue) even if
queue-1 (parent queue) is not over its fair share.
This is the code for FSParentQueue#preemptContainer. In this case
candidateQueue will become queue-1, because candidateQueue is null at the
beginning, so the first child is selected without any fair-share check.
{code}
public RMContainer preemptContainer() {
  RMContainer toBePreempted = null;

  // Find the childQueue which is most over fair share
  FSQueue candidateQueue = null;
  Comparator<Schedulable> comparator = policy.getComparator();
  for (FSQueue queue : childQueues) {
    if (candidateQueue == null ||
        comparator.compare(queue, candidateQueue) > 0) {
      candidateQueue = queue;
    }
  }

  // Let the selected queue choose which of its container to preempt
  if (candidateQueue != null) {
    toBePreempted = candidateQueue.preemptContainer();
  }
  return toBePreempted;
}
{code}
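To make the point about the null candidate concrete, here is a minimal standalone sketch of the same selection loop. It is not Hadoop code: the Queue class, its usage/fairShare numbers, and the byOverage comparator are hypothetical stand-ins for FSQueue and the policy comparator. It only shows that a sole child is picked even when it is not over its fair share.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ParentSelectionSketch {
    // Hypothetical stand-in for FSQueue: a name plus usage and fair share.
    static final class Queue {
        final String name;
        final int usage;
        final int fairShare;

        Queue(String name, int usage, int fairShare) {
            this.name = name;
            this.usage = usage;
            this.fairShare = fairShare;
        }

        boolean overFairShare() {
            return usage > fairShare;
        }
    }

    // Same shape as the loop in FSParentQueue#preemptContainer:
    // the first child always becomes the candidate because candidate starts as null.
    static Queue pickCandidate(List<Queue> children, Comparator<Queue> comparator) {
        Queue candidate = null;
        for (Queue q : children) {
            if (candidate == null || comparator.compare(q, candidate) > 0) {
                candidate = q;
            }
        }
        return candidate;
    }

    public static void main(String[] args) {
        // Assumed ordering: "more over fair share" compares greater.
        Comparator<Queue> byOverage = Comparator.comparingInt(q -> q.usage - q.fairShare);
        Queue queue1 = new Queue("queue-1", 100, 100); // at fair share, not over it
        Queue picked = pickCandidate(Arrays.asList(queue1), byOverage);
        System.out.println(picked.name + " overFairShare=" + picked.overFairShare());
        // prints: queue-1 overFairShare=false
    }
}
```

With a single child, the comparator is never consulted, so nothing in this loop can reject queue-1.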
Only a leaf queue is skipped when it is not over its fair share.
The following is the code for FSLeafQueue#preemptContainer:
{code}
public RMContainer preemptContainer() {
  RMContainer toBePreempted = null;

  // If this queue is not over its fair share, reject
  if (!preemptContainerPreCheck()) {
    return toBePreempted;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Queue " + getName() + " is going to preempt a container " +
        "from its applications.");
  }

  // Choose the app that is most over fair share
  Comparator<Schedulable> comparator = policy.getComparator();
  FSAppAttempt candidateSched = null;
  readLock.lock();
  try {
    for (FSAppAttempt sched : runnableApps) {
      if (candidateSched == null ||
          comparator.compare(sched, candidateSched) > 0) {
        candidateSched = sched;
      }
    }
  } finally {
    readLock.unlock();
  }

  // Preempt from the selected app
  if (candidateSched != null) {
    toBePreempted = candidateSched.preemptContainer();
  }
  return toBePreempted;
}
{code}
preemptContainerPreCheck is called only in the leaf queue. So in this case,
since leaf queue queue-1-1 is over its fair share, containers will be
preempted from it.
Am I missing the code that prevents queue-1 (parent queue) from being
recursively preempted?
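As a rough check of this reading, the following toy model walks the hierarchy from the issue description with the pre-check applied at the leaf only. Everything in it is an assumption made for illustration: the Node/Leaf/Parent classes, the resource numbers, the "usage minus fair share" comparator, and a parent's fair share being the sum of its children's. It is not the FairScheduler implementation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PreemptionRecursionSketch {
    // Hypothetical simplified queue node; not the Hadoop FSQueue API.
    static abstract class Node {
        final String name;
        Node(String name) { this.name = name; }
        abstract int usage();
        abstract int fairShare();
        // Returns a label for the preempted container, or null if nothing is preempted.
        abstract String preemptContainer();
    }

    // Assumed ordering: the queue that is further over its fair share compares greater.
    static final Comparator<Node> OVERAGE =
        Comparator.comparingInt(n -> n.usage() - n.fairShare());

    static final class Leaf extends Node {
        final int used;
        final int share;
        Leaf(String name, int used, int share) { super(name); this.used = used; this.share = share; }
        int usage() { return used; }
        int fairShare() { return share; }
        String preemptContainer() {
            // Pre-check happens here, in the leaf only.
            if (used <= share) {
                return null;
            }
            return "container-from-" + name;
        }
    }

    static final class Parent extends Node {
        final List<Node> children;
        Parent(String name, Node... children) { super(name); this.children = Arrays.asList(children); }
        int usage() { return children.stream().mapToInt(Node::usage).sum(); }
        int fairShare() { return children.stream().mapToInt(Node::fairShare).sum(); }
        String preemptContainer() {
            // No fair-share check on the parent itself, only child selection.
            Node candidate = null;
            for (Node child : children) {
                if (candidate == null || OVERAGE.compare(child, candidate) > 0) {
                    candidate = child;
                }
            }
            return candidate == null ? null : candidate.preemptContainer();
        }
    }

    // root -> queue-1 -> {queue-1-1 (holds everything), queue-1-2 (holds nothing)}
    static Node buildExample() {
        Node q11 = new Leaf("queue-1-1", 100, 50);
        Node q12 = new Leaf("queue-1-2", 0, 50);
        return new Parent("root", new Parent("queue-1", q11, q12));
    }

    public static void main(String[] args) {
        Node root = buildExample();
        System.out.println("root over fair share: " + (root.usage() > root.fairShare()));
        System.out.println("preempted: " + root.preemptContainer());
    }
}
```

In this model queue-1 sits exactly at its fair share, yet the call from root still reaches queue-1-1 and returns a container from it, because only the leaf applies a pre-check.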
> FairScheduler's preemption cannot happen between sibling in some case
> ---------------------------------------------------------------------
>
> Key: YARN-3405
> URL: https://issues.apache.org/jira/browse/YARN-3405
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.0
> Reporter: Peng Zhang
> Priority: Critical
>
> The queue hierarchy is as follows:
> {noformat}
>            root
>              |
>           queue-1
>          /       \
> queue-1-1         queue-1-2
> {noformat}
> 1. queue-1-1 is active and has been assigned all resources.
> 2. queue-1-2 then becomes active and causes a new preemption request.
> 3. But preemption now starts from root, finds that queue-1 is not over its
> fair share, and so does not recurse into queue-1-1.
> 4. As a result, queue-1-2 has to wait for queue-1-1 to release resources
> on its own.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)