[ https://issues.apache.org/jira/browse/YARN-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662132#comment-14662132 ]
Wangda Tan commented on YARN-3937:
----------------------------------
[~sunilg],
Thanks for working on this. I just took a quick look at the patch.
The check in {{analyzeForPreemptionCancellation}} does not seem correct to me.
In a big cluster, it is likely that some containers will be selected for
preemption in every run of the preemption check.
Instead, please take a look at the following code and see whether you can
leverage it.
{code}
// Keep the preempted list clean
for (Iterator<RMContainer> i = preempted.keySet().iterator(); i.hasNext();) {
  RMContainer id = i.next();
  // garbage collect containers that are irrelevant for preemption
  if (preempted.get(id) + 2 * maxWaitTime < clock.getTime()) {
    i.remove();
  }
}
{code}
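To make the expiry rule in that loop easier to reason about, here is a minimal, self-contained sketch of it in isolation. It is only an illustration, not the policy's actual code: the class name {{PreemptedListSketch}}, the {{mark}} and {{cleanup}} methods, and the use of String ids in place of {{RMContainer}} are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for the policy's bookkeeping (not the real class).
// String ids replace RMContainer so the example is self-contained.
class PreemptedListSketch {
    // container id -> time (ms) at which it was first marked for preemption
    final Map<String, Long> preempted = new HashMap<String, Long>();
    final long maxWaitTime;

    PreemptedListSketch(long maxWaitTime) {
        this.maxWaitTime = maxWaitTime;
    }

    // Record the first time a container was marked; later marks keep that time.
    void mark(String id, long now) {
        if (!preempted.containsKey(id)) {
            preempted.put(id, now);
        }
    }

    // Same rule as the quoted loop: drop entries whose mark time is more than
    // 2 * maxWaitTime in the past. Returns the number of entries removed.
    int cleanup(long now) {
        int removed = 0;
        for (Iterator<Map.Entry<String, Long>> i =
                 preempted.entrySet().iterator(); i.hasNext();) {
            Map.Entry<String, Long> e = i.next();
            if (e.getValue() + 2 * maxWaitTime < now) {
                i.remove(); // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }
}
```

With {{maxWaitTime = 100}}, a container marked at time 0 survives a cleanup at time 200 (the comparison is strict) and is dropped by a cleanup at time 201.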
Two minor suggestions:
- CANCEL_PREEMPTION_FOR_CONTAINER -> CANCEL_CONTAINER_PREEMPTION
- containersToPreempt can be a concurrent map, which avoids the synchronized
lock in removeContianerPreemption.
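A rough sketch of what the second suggestion could look like, assuming container ids can be represented as Strings for the purpose of the example. The class and method names ({{PreemptionSetSketch}}, {{markForPreemption}}, {{cancelPreemption}}, {{isMarked}}) are hypothetical and are not taken from the patch.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; names do not come from the patch.
class PreemptionSetSketch {
    // Set view backed by a ConcurrentHashMap, so individual add/remove/contains
    // calls are thread-safe without a synchronized block around them.
    private final Set<String> containersToPreempt =
        Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

    void markForPreemption(String containerId) {
        containersToPreempt.add(containerId);
    }

    // Returns true only if the container was actually pending preemption,
    // so a caller can decide whether anyone needs to be notified.
    boolean cancelPreemption(String containerId) {
        return containersToPreempt.remove(containerId);
    }

    boolean isMarked(String containerId) {
        return containersToPreempt.contains(containerId);
    }
}
```

Note that this only makes single operations atomic. If the real code needs several operations on the collection to happen as one step, some form of locking would still be required.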
Thoughts?
> Introducing REMOVE_CONTAINER_FROM_PREEMPTION event to notify Scheduler and AM
> when a container is no longer to be preempted
> ---------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-3937
> URL: https://issues.apache.org/jira/browse/YARN-3937
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Affects Versions: 2.7.1
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: 0001-YARN-3937.patch, 0002-YARN-3937.patch
>
>
> As discussed in YARN-3784, there are scenarios in which other applications
> release containers or the same application revokes its resource requests. In
> these cases, we may not have to preempt a container that was marked for
> preemption earlier.
> Introduce a new event to remove such containers if present in the
> to-be-preempted list of the scheduler, or to inform the AM about such a scenario.