[ https://issues.apache.org/jira/browse/YARN-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662132#comment-14662132 ]
Wangda Tan commented on YARN-3937:
----------------------------------

[~sunilg],

Thanks for working on this. I just took a quick look at the patch.

The check in {{analyzeForPreemptionCancellation}} does not look correct to me. In a big cluster, it is likely that some containers will be selected for preemption in every execution of the preemption check. Instead, you could take a look at the following code and see whether you can leverage it.

{code}
// Keep the preempted list clean
for (Iterator<RMContainer> i = preempted.keySet().iterator(); i.hasNext();) {
  RMContainer id = i.next();
  // garbage collect containers that are irrelevant for preemption
  if (preempted.get(id) + 2 * maxWaitTime < clock.getTime()) {
    i.remove();
  }
}
{code}

And a couple of minor suggestions:
- CANCEL_PREEMPTION_FOR_CONTAINER -> CANCEL_CONTAINER_PREEMPTION
- containersToPreempt can be a concurrent map, to avoid the synchronized lock in removeContianerPreemption.

Thoughts?

> Introducing REMOVE_CONTAINER_FROM_PREEMPTION event to notify Scheduler and AM when a container is no longer to be preempted
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3937
>                 URL: https://issues.apache.org/jira/browse/YARN-3937
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3937.patch, 0002-YARN-3937.patch
>
>
> As discussed in YARN-3784, there are scenarios like few other applications
> released containers or same application has revoked its resource requests. In
> these cases, we may not have to preempt a container which would have been
> marked for preemption earlier.
> Introduce a new event to remove such containers if present in the
> to-be-preempted list of scheduler or inform AM about such a scenario.
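
To make the concurrent-map suggestion from the comment above concrete, here is a minimal, self-contained Java sketch. It is not taken from the patch: the class {{PreemptionTracker}}, its method names, and the use of plain String container ids and millisecond timestamps are all hypothetical stand-ins. It only illustrates how a {{ConcurrentHashMap}} might allow cancelling a pending preemption without a synchronized block, and how the same "older than 2 * maxWaitTime" clean-up could be expressed on such a map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the scheduler-side bookkeeping; not YARN code.
class PreemptionTracker {
    // Assumed shape: container id -> time (ms) at which it was marked for preemption.
    private final Map<String, Long> containersToPreempt = new ConcurrentHashMap<>();
    private final long maxWaitTime;

    PreemptionTracker(long maxWaitTimeMs) {
        this.maxWaitTime = maxWaitTimeMs;
    }

    // Record the first time a container was marked; later marks keep the original time.
    void markForPreemption(String containerId, long now) {
        containersToPreempt.putIfAbsent(containerId, now);
    }

    // Cancel a pending preemption. ConcurrentHashMap.remove is atomic,
    // so no synchronized block is needed around this call.
    boolean cancelPreemption(String containerId) {
        return containersToPreempt.remove(containerId) != null;
    }

    // Drop entries older than 2 * maxWaitTime, mirroring the clean-up loop quoted above.
    // Returns how many entries this call removed.
    int expireStale(long now) {
        int removed = 0;
        for (Map.Entry<String, Long> e : containersToPreempt.entrySet()) {
            long markedAt = e.getValue();
            // remove(key, value) only succeeds if the entry is still unchanged.
            if (markedAt + 2 * maxWaitTime < now
                    && containersToPreempt.remove(e.getKey(), markedAt)) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return containersToPreempt.size();
    }
}
```

In this sketch, iterating the entry set while other threads add or remove entries is safe because {{ConcurrentHashMap}} iterators are weakly consistent and never throw {{ConcurrentModificationException}}. Whether that is enough for the real scheduler depends on what other state removeContianerPreemption touches, which this example does not model.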