[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968656#comment-16968656 ]

Eric Payne commented on YARN-8292:
----------------------------------
I backported this to branch-2 and attached YARN-8292.branch-2.009.patch. In my
manual tests on a 4-node pseudo cluster, it allows preemption to proceed when
the dominant resource is above the queue capacity and one or more non-dominant
resources are below it. However, I have not put the JIRA into patch-submitted
state because the two unit tests added to test preemption with 3 resources are
failing. I dug into it a little and see that in 2.10, when resources are
allocated to the mock queue, the extended resource is not added to the queue's
current configuration or usage.

[~leftnoteasy] / [~sunilg] / [~jhung], are you aware of any missing extended
resource configuration that should be backported for the 2.10 RM / CS mocks?

Here is one of the test failures:
{noformat}
[ERROR] TestProportionalCapacityPreemptionPolicyInterQueueWithDRF.test3ResourceTypesInterQueuePreemption:117
Wanted but not invoked:
eventHandler.handle(
    <Is preemption request for>
);
-> at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyInterQueueWithDRF.test3ResourceTypesInterQueuePreemption(TestProportionalCapacityPreemptionPolicyInterQueueWithDRF.java:117)
Actually, there were zero interactions with this mock.
{noformat}
> Fix the dominant resource preemption cannot happen when some of the resource
> vector becomes negative
> ----------------------------------------------------------------------------------------------------
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch,
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch,
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch,
> YARN-8292.009.patch, YARN-8292.branch-2.009.patch
>
>
> This is an example of the problem:
>
> {code}
> // guaranteed, max, used, pending
> "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> Queues a and b each have 3 containers running, and each container is 2:2:1.
> Queue c uses no resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.
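
The arithmetic behind the quoted example can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (PreemptionVectorSketch, subtract, dominantIndex, anyNegative) are hypothetical and do not reproduce YARN's Resources or ResourceCalculator classes or the patch itself. It shows that, for queue a, used minus guaranteed has mixed signs, which is the "some of the resource vector becomes negative" situation in the issue title.

```java
import java.util.Arrays;

public class PreemptionVectorSketch {
    // Element-wise a - b over two resource vectors of equal length.
    static long[] subtract(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] - b[i];
        }
        return out;
    }

    // Index of the resource with the largest share of the cluster total.
    // Shares are compared by cross-multiplication to avoid floating point.
    static int dominantIndex(long[] v, long[] cluster) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] * cluster[best] > v[best] * cluster[i]) {
                best = i;
            }
        }
        return best;
    }

    // True if any component of the vector is below zero.
    static boolean anyNegative(long[] v) {
        for (long x : v) {
            if (x < 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] cluster = {30, 18, 6};    // total cluster resource from the example
        long[] guaranteed = {10, 6, 2};  // queue a guaranteed
        long[] used = {6, 6, 3};         // queue a used: 3 containers of 2:2:1

        long[] over = subtract(used, guaranteed);
        int dominant = dominantIndex(used, cluster);

        System.out.println("used - guaranteed = " + Arrays.toString(over));
        System.out.println("dominant resource index = " + dominant);
        System.out.println("has negative component = " + anyNegative(over));
    }
}
```

For queue a this gives a difference of [-4, 0, 1]: the third resource, which is the dominant one (3 of 6 in the cluster), is above the guarantee, while the first is below it. That is the same shape as the case described in the comment above, where the dominant resource is above the queue capacity and a non-dominant resource is below it.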