[
https://issues.apache.org/jira/browse/YARN-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193249#comment-16193249
]
Daniel Templeton commented on YARN-7290:
----------------------------------------
Looks generally good. I'll need to take a more careful look, though.
> canContainerBePreempted can return true when it shouldn't
> ---------------------------------------------------------
>
> Key: YARN-7290
> URL: https://issues.apache.org/jira/browse/YARN-7290
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 3.0.0-beta1
> Reporter: Steven Rand
> Assignee: Steven Rand
> Attachments: YARN-7290.001.patch, YARN-7290.002.patch,
> YARN-7290-failing-test.patch
>
>
> In FSAppAttempt#canContainerBePreempted, we make sure that preempting the
> given container would not put the app below its fair share:
> {code}
> // Check if the app's allocation will be over its fairshare even
> // after preempting this container
> Resource usageAfterPreemption = Resources.clone(getResourceUsage());
> // Subtract resources of containers already queued for preemption
> synchronized (preemptionVariablesLock) {
>   Resources.subtractFrom(usageAfterPreemption, resourcesToBePreempted);
> }
> // Subtract this container's allocation to compute usage after preemption
> Resources.subtractFrom(
>     usageAfterPreemption, container.getAllocatedResource());
> return !isUsageBelowShare(usageAfterPreemption, getFairShare());
> {code}
> However, this only considers one container in isolation, and fails to
> consider containers for the same app that we already added to
> {{preemptableContainers}} in
> FSPreemptionThread#identifyContainersToPreemptOnNode. Therefore we can have a
> case where we preempt multiple containers from the same app, none of which by
> itself puts the app below fair share, but which cumulatively do so.
> I've attached a patch with a test to show this behavior. The flow is:
> 1. Initially greedyApp runs in {{root.preemptable.child-1}} and is allocated
> all the resources (8g and 8vcores)
> 2. Then starvingApp runs in {{root.preemptable.child-2}} and requests 2
> containers, each of which is 3g and 3vcores in size. At this point both
> greedyApp and starvingApp have a fair share of 4g (with DRF not in use).
> 3. For the first container requested by starvingApp, we (correctly) preempt 3
> containers from greedyApp, each of which is 1g and 1vcore.
> 4. For the second container requested by starvingApp, we again (this time
> incorrectly) preempt 3 containers from greedyApp. This puts greedyApp below
> its fair share, but happens anyway because all six times that we evaluate
> {{return !isUsageBelowShare(usageAfterPreemption, getFairShare());}}, the
> value of {{usageAfterPreemption}} is 7g and 7vcores (confirmed using
> debugger).
> So in addition to accounting for {{resourcesToBePreempted}}, we also need to
> account for containers that we're already planning on preempting in
> FSPreemptionThread#identifyContainersToPreemptOnNode.
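
To make the cumulative effect described above concrete, here is a minimal sketch of the arithmetic. It is not the FairScheduler code and not the attached patch: the class PreemptionSketch and the method countSelected are hypothetical, the model tracks memory only (MB, no vcores, no DRF), and it assumes "below fair share" means strictly less than the share.

```java
// Hypothetical, simplified model of the check discussed in this issue.
// Memory only (MB); "below share" is assumed to mean strictly less than.
class PreemptionSketch {

  // Counts how many candidate containers pass the fair-share check.
  // cumulative == false: each container is checked in isolation, as in the
  //   snippet quoted in the issue description.
  // cumulative == true: containers already selected in this pass are
  //   subtracted from the usage before checking the next one.
  static int countSelected(long[] containersMb, long usageMb,
      long fairShareMb, boolean cumulative) {
    long selectedMb = 0;
    int count = 0;
    for (long containerMb : containersMb) {
      long alreadySelectedMb = cumulative ? selectedMb : 0;
      long usageAfterPreemption = usageMb - alreadySelectedMb - containerMb;
      // Preemptable only if the app is not left below its fair share.
      if (usageAfterPreemption >= fairShareMb) {
        selectedMb += containerMb;
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // greedyApp: 8g in use, 4g fair share, six 1g candidate containers.
    long[] candidates = {1024, 1024, 1024, 1024, 1024, 1024};
    int isolated = countSelected(candidates, 8192, 4096, false);
    int cumulative = countSelected(candidates, 8192, 4096, true);
    // Prints: isolated=6 cumulative=4
    System.out.println("isolated=" + isolated + " cumulative=" + cumulative);
  }
}
```

In this model the isolated check sees 7g of usage for every candidate and accepts all six, which would leave greedyApp at 2g. With running accounting, selection stops after four containers, leaving greedyApp exactly at its 4g share.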