[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15099088#comment-15099088 ]
Eric Payne commented on YARN-4108:
----------------------------------

[~leftnoteasy], great job! This approach looks like it has the potential to vastly improve preemption. I just have a few comments and questions.

- In the lazy preemption case, PCPP will send an event to the scheduler to mark a container killable. Can PCPP check whether the container has already been marked before sending, so that there is less event traffic in the RM?
- Currently, if both queueA and queueB are over their guaranteed capacity, preemption will still occur if queueA is more over capacity than queueB. I think it is probably important to preserve this behavior (YARN-2592).
-- I don't see anyplace where {{ResourceLimits#isAllowPreemption}} is called. But if it is, will the following code in {{LeafQueue}} change preemption behavior?
{noformat}
  private void setPreemptionAllowed(ResourceLimits limits, String nodePartition) {
    // Set preemption-allowed:
    // For leaf queue, only under-utilized queue is allowed to preempt resources from other queues
    float usedCapacity = queueCapacities.getAbsoluteUsedCapacity(nodePartition);
    float guaranteedCapacity = queueCapacities.getAbsoluteCapacity(nodePartition);
    limits.setIsAllowPreemption(usedCapacity < guaranteedCapacity);
  }
{noformat}
-- Also, in {{ParentQueue#canAssign}}, does the following code have the same effect?
{noformat}
    if (this.getQueueCapacities().getUsedCapacity(node.getPartition()) < 1.0f) {
{noformat}
- In {{AbstractCSQueue#canAssignToThisQueue}}:
-- I'm just trying to understand how things will be affected when headroom for a parent queue is (limit - used) + killable. Doesn't that say that a parent queue has more headroom than it's already actually using? Is it relying on this behavior so that the {{assignment}} code will determine that it has more headroom when there are killable containers, and then rely on the leaf queue to kill those containers?
-- NPE if {{getChildQueues()}} returns null:
{noformat}
    if (null != getChildQueues() || !getChildQueues().isEmpty()) {
{noformat}
- {{CSAssignment#toKillContainers}}: I would call them {{containersToKill}}.

{quote}
4. I would like to have some freedom in selecting containers (marking) for preemption. A simple sorting based on submission time or priority seems limited approach. Could we have some interface here so that we can plugin user specific comparison cases. submission time priority demand based etc may be
{quote}
- To [~sunilg]'s point: Currently PCPP doesn't take into consideration things like locality or container size. If a queue is over its capacity by 8GB, and there is one 8GB container plus eight 1GB containers, PCPP may decide to kill the one 8GB container or it may decide to kill the eight 1GB containers, depending on properties like 'time since submission' and 'ignore-partition-exclusivity'. So, with the current lazy preemption proposal, if the underserved queue needs an 8GB container and the eight 1GB containers are marked as killable, at least now those containers don't get killed. It's a step in the right direction, but the underserved queue still has to wait. The same kind of thing applies to locality and other properties.

It would be interesting to know what your thoughts are on making further modifications to PCPP to make more informed choices about which containers to kill. There may not be a "right" choice in PCPP, though, since the requirements of the underserved queue may change by the time the scheduler gets around to allocating resources.
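On the {{getChildQueues()}} check mentioned above, here is a minimal sketch of why the {{||}} form throws and what a null-safe form could look like. {{QueueSketch}} and both method names are hypothetical stand-ins, not classes from the patch, and the sketch assumes the intent of the check is "this queue has at least one child".

```java
import java.util.List;

// Hypothetical stand-in for a queue; not the real AbstractCSQueue.
class QueueSketch {
  // May be null, e.g. for a queue that has no children (assumption).
  private final List<String> childQueues;

  QueueSketch(List<String> childQueues) {
    this.childQueues = childQueues;
  }

  List<String> getChildQueues() {
    return childQueues;
  }

  // Same shape as the condition quoted above. When getChildQueues() is null,
  // the left operand is false, so || goes on to evaluate the right operand,
  // which dereferences null and throws NullPointerException. When the list
  // is non-null but empty, the left operand is already true, so this also
  // reports true for an empty list.
  boolean hasChildrenAsWritten() {
    return null != getChildQueues() || !getChildQueues().isEmpty();
  }

  // Null-safe variant: && short-circuits on null, and an empty list
  // is reported as "no children".
  boolean hasChildrenNullSafe() {
    List<String> children = getChildQueues();
    return children != null && !children.isEmpty();
  }
}
```

If the intended meaning is different (for example "null or empty"), the operands would need to change accordingly; the point is only that the null test has to guard the dereference.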
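To make the headroom question above concrete, here is a small arithmetic sketch of headroom computed as (limit - used) + killable. The class, the method names, and the GB figures are all hypothetical, and the admission check is only an assumption about how such a headroom value might be used; it is not code from {{AbstractCSQueue}}.

```java
// Hypothetical sketch; names and numbers are illustrative, not from the patch.
final class HeadroomSketch {
  private HeadroomSketch() {}

  // Headroom as described in the comment: (limit - used) + killable, in GB.
  static long headroomGb(long limitGb, long usedGb, long killableGb) {
    return (limitGb - usedGb) + killableGb;
  }

  // Assumed admission check: a request passes if it fits in the headroom.
  static boolean canAssign(long limitGb, long usedGb, long killableGb, long requestGb) {
    return requestGb <= headroomGb(limitGb, usedGb, killableGb);
  }

  public static void main(String[] args) {
    // Parent queue exactly at its limit, nothing killable: no headroom.
    System.out.println(headroomGb(100, 100, 0));   // prints 0
    // Same queue with 8GB of killable containers: headroom becomes 8GB,
    // even though those 8GB are still counted in "used".
    System.out.println(headroomGb(100, 100, 8));   // prints 8
    System.out.println(canAssign(100, 100, 8, 8)); // prints true
    System.out.println(canAssign(100, 100, 0, 8)); // prints false
  }
}
```

Under these assumptions the parent queue admits an 8GB request while it is still at its limit, which only works out if something further down (presumably the leaf queue) actually kills the killable containers before the allocation is made.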
> CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4108
>                 URL: https://issues.apache.org/jira/browse/YARN-4108
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-4108-design-doc-V3.pdf, YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch
>
>
> This is a sibling JIRA for YARN-2154. We should make sure container preemption is more effective.
> *Requirements:*
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), cross application preemption (such as priority-based (YARN-1963) / fairness-based (YARN-3319)).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)