[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383513#comment-14383513 ]
zhihai xu commented on YARN-3405:
---------------------------------

It looks like the code will still check queue-1-1 (leaf queue) even if queue-1 (parent queue) is not over its fair share.

This is the code for FSParentQueue#preemptContainer. In this case, when it runs on root, candidateQueue becomes queue-1: candidateQueue is null at the beginning of the loop, so the first child is always selected, and queue-1 is root's only child.
{code}
  public RMContainer preemptContainer() {
    RMContainer toBePreempted = null;

    // Find the childQueue which is most over fair share
    FSQueue candidateQueue = null;
    Comparator<Schedulable> comparator = policy.getComparator();
    for (FSQueue queue : childQueues) {
      if (candidateQueue == null ||
          comparator.compare(queue, candidateQueue) > 0) {
        candidateQueue = queue;
      }
    }

    // Let the selected queue choose which of its container to preempt
    if (candidateQueue != null) {
      toBePreempted = candidateQueue.preemptContainer();
    }
    return toBePreempted;
  }
{code}

Only a leaf queue is skipped when it is not over its fair share. The following is the code for FSLeafQueue#preemptContainer:
{code}
  public RMContainer preemptContainer() {
    RMContainer toBePreempted = null;

    // If this queue is not over its fair share, reject
    if (!preemptContainerPreCheck()) {
      return toBePreempted;
    }

    if (LOG.isDebugEnabled()) {
      LOG.debug("Queue " + getName() + " is going to preempt a container " +
          "from its applications.");
    }

    // Choose the app that is most over fair share
    Comparator<Schedulable> comparator = policy.getComparator();
    FSAppAttempt candidateSched = null;
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        if (candidateSched == null ||
            comparator.compare(sched, candidateSched) > 0) {
          candidateSched = sched;
        }
      }
    } finally {
      readLock.unlock();
    }

    // Preempt from the selected app
    if (candidateSched != null) {
      toBePreempted = candidateSched.preemptContainer();
    }
    return toBePreempted;
  }
{code}

preemptContainerPreCheck is only called in the leaf queue. So in this case, because leaf queue queue-1-1 is over its fair share, it will be preempted.
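To make the argument concrete, here is a minimal self-contained sketch of the same control flow. It is not the FairScheduler code: the classes Queue, Leaf and Parent, the integer usage and fair-share values, and the overShare comparison are all hypothetical stand-ins for FSQueue, Resource and policy.getComparator(). It only models the two properties discussed above: a parent picks a child without any pre-check, and only a leaf rejects when it is not over its fair share.

```java
import java.util.ArrayList;
import java.util.List;

public class PreemptionSketch {
    /** Hypothetical stand-in for FSQueue; resources are plain ints. */
    static abstract class Queue {
        final String name;
        Queue(String name) { this.name = name; }
        abstract int usage();
        abstract int fairShare();
        /** Returns the name of the leaf that would give up a container, or null. */
        abstract String preemptContainer();
        /** Assumed ordering: larger value means "more over fair share". */
        int overShare() { return usage() - fairShare(); }
    }

    static class Leaf extends Queue {
        private final int used;
        private final int share;
        Leaf(String name, int used, int share) {
            super(name);
            this.used = used;
            this.share = share;
        }
        int usage() { return used; }
        int fairShare() { return share; }
        String preemptContainer() {
            // Mirrors the pre-check: reject if not over fair share
            if (used <= share) {
                return null;
            }
            return name;
        }
    }

    static class Parent extends Queue {
        final List<Queue> children = new ArrayList<>();
        private final int share;
        Parent(String name, int share) {
            super(name);
            this.share = share;
        }
        int usage() {
            int sum = 0;
            for (Queue q : children) {
                sum += q.usage();
            }
            return sum;
        }
        int fairShare() { return share; }
        String preemptContainer() {
            // No pre-check here: the first child is always a candidate
            Queue candidate = null;
            for (Queue q : children) {
                if (candidate == null || q.overShare() > candidate.overShare()) {
                    candidate = q;
                }
            }
            return candidate == null ? null : candidate.preemptContainer();
        }
    }

    /** Builds the hierarchy from the issue description and preempts from root. */
    static String run() {
        Parent root = new Parent("root", 100);
        Parent q1 = new Parent("queue-1", 100);      // at, not over, its fair share
        q1.children.add(new Leaf("queue-1-1", 100, 50)); // holds all resources
        q1.children.add(new Leaf("queue-1-2", 0, 50));   // newly active, starved
        root.children.add(q1);
        return root.preemptContainer();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "queue-1-1"
    }
}
```

With these assumed numbers, queue-1 is exactly at its fair share, yet the recursion still reaches queue-1-1, which is the behavior I read from the real code.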
Did I miss the code which prevents queue-1 (parent queue) from being recursively preempted?

> FairScheduler's preemption cannot happen between sibling in some case
> ---------------------------------------------------------------------
>
>                 Key: YARN-3405
>                 URL: https://issues.apache.org/jira/browse/YARN-3405
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.0
>            Reporter: Peng Zhang
>            Priority: Critical
>
> Queue hierarchy described as below:
> {noformat}
>           root
>            |
>         queue-1
>         /     \
> queue-1-1    queue-1-2
> {noformat}
> 1. queue-1-1 is active and has been assigned all resources.
> 2. queue-1-2 becomes active, and it causes a new preemption request.
> 3. But preemption now starts from root, which finds that queue-1 is not
> over its fair share, so there is no recursive preemption into queue-1-1.
> 4. Finally, queue-1-2 ends up waiting for queue-1-1 to release resources
> by itself.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)