[
https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084279#comment-14084279
]
Wangda Tan commented on YARN-2069:
----------------------------------
Hi [~mayank_bansal],
Thanks for your patience.
I've just read through your new patch.
After #1/#2, if more resource still needs to be preempted, the AM container
will be preempted. Is that correct? Please let me know if I misread your approach.
*I think we should discuss the scope of this JIRA first; I'm a little confused
after thinking about it.*
According to the description of this JIRA, we need to make sure of the
following (assuming {{target-user-limit}} has already been calculated):
*REQ #1:* When considering preempting a container from user-x, if
{{used-resource - marked-preempted-resource}} of user-x is already <=
{{target-user-limit}}, we need to make sure that no other user in the queue has
{{used-resource - marked-preempted-resource}} > {{target-user-limit}}.
*REQ #2:* When we have to preempt an AM container, we need to make sure REQ #1
holds too.
And as commented by [~vinodkv]:
https://issues.apache.org/jira/browse/YARN-2069?focusedCommentId=14064047&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14064047
*REQ #3:* Users' resources after preemption should be as balanced as possible
around {{target-user-limit}}.
Do you agree with these requirements? I think we should add them to the JIRA
description once we have agreed on them.
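To make REQ #1 concrete, here is a minimal sketch of the check, not code from the patch. The class and method names ({{UserLimitCheckSketch}}, {{mayPreemptFrom}}) and the use of plain integer percentages are assumptions for illustration only; the real scheduler works on {{Resource}} objects.

```java
public class UserLimitCheckSketch {

    /** Net usage of a user: used resource minus resource already marked for preemption. */
    static int net(int used, int marked) {
        return used - marked;
    }

    /**
     * Hypothetical REQ #1 check. Returns true if a container may be preempted from user x:
     * either x is still above targetUserLimit, or x is already at or below it
     * and no other user in the queue is still above it.
     */
    static boolean mayPreemptFrom(int x, int[] used, int[] marked, int targetUserLimit) {
        if (net(used[x], marked[x]) > targetUserLimit) {
            return true; // x itself is still over the limit
        }
        for (int i = 0; i < used.length; i++) {
            if (i != x && net(used[i], marked[i]) > targetUserLimit) {
                return false; // someone else should be preempted first
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] used = {15, 9, 12, 12, 12};
        int[] marked = {0, 3, 0, 0, 0};
        // User 1 is at 6% net, below the 8% target, while others are still above it.
        System.out.println(mayPreemptFrom(1, used, marked, 8));
        System.out.println(mayPreemptFrom(0, used, marked, 8));
    }
}
```

Under these assumptions, REQ #2 would apply the same check before an AM container is selected.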
*My understanding is that your new patch consists of two phases:*
1. {{distributePreemptionforUsers}} will preempt containers to enforce
{{target-user-limit}} for each user.
2. If more resource still needs to be preempted, it will call
{{distributePreemptionforUsers}} again so that {{resToObtain}} is spread across
users, each taking {{resToObtain}} divided by {{#active-user}} in the queue.
I think phase 1 can enforce REQ #1, but phase 2 cannot enforce REQ #3. Also,
REQ #2 is not satisfied by the patch.
Let me give you an example of why REQ #3 is not satisfied, similar to Vinod's
example:
{code}
The queue has guaranteed resource = 30%; it now uses 60% and we want to shrink
it down to 40%.
All containers are the same size, which is 3% of the cluster.
There are now 5 apps in the queue, with user-limit configured to 20%, so the
expected resources are {8%, 8%, 8%, 8%, 8%}.
Before preemption:
{15%, 9%, 12%, 12%, 12%}
A possible result after preemption with your current approach:
{15%, 6%, 6%, 6%, 6%} (total is 39%)
{code}
Sometimes we cannot bring every user's resource exactly to
{{target-user-limit}}, because {{target-user-limit}} may not be a multiple of
the container size. But we can do better, as in the following example:
{code}
After preemption:
{9%, 9%, 9%, 6%, 6%} (total is 39%)
{code}
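One possible way to reach the more balanced distribution above is to always take the next container from the user with the largest remaining usage. The following is only a sketch under simplifying assumptions, not the patch's algorithm: the class and method names are hypothetical, all values are integer percentages of the cluster, every container has the same size, and AM containers and {{target-user-limit}} are ignored.

```java
import java.util.Arrays;

public class BalancedPreemptionSketch {

    /**
     * Hypothetical greedy helper: repeatedly takes one container from the user
     * with the largest remaining usage until at least resToObtain is collected.
     * Ties are broken in favor of the lowest index. Returns the remaining usage per user.
     */
    static int[] preempt(int[] used, int containerSize, int resToObtain) {
        int[] remaining = Arrays.copyOf(used, used.length);
        if (remaining.length == 0 || containerSize <= 0) {
            return remaining;
        }
        int obtained = 0;
        while (obtained < resToObtain) {
            int max = 0;
            for (int i = 1; i < remaining.length; i++) {
                if (remaining[i] > remaining[max]) {
                    max = i;
                }
            }
            if (remaining[max] < containerSize) {
                break; // no user has a whole container left to give up
            }
            remaining[max] -= containerSize;
            obtained += containerSize;
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Example from this comment: queue at 60%, shrink to 40%, so resToObtain = 20.
        int[] after = preempt(new int[] {15, 9, 12, 12, 12}, 3, 20);
        System.out.println(Arrays.toString(after));
    }
}
```

For the example input this yields [6, 6, 9, 9, 9], which totals 39% and is the same distribution as {9%, 9%, 9%, 6%, 6%} up to which users end at 6%. Because each step re-evaluates who is currently the largest user, no per-user rounding error accumulates across users.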
The imbalance is caused by the accumulated bias I mentioned in my comment:
https://issues.apache.org/jira/browse/YARN-2069?focusedCommentId=14074249&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14074249
Thanks,
Wangda
> CS queue level preemption should respect user-limits
> ----------------------------------------------------
>
> Key: YARN-2069
> URL: https://issues.apache.org/jira/browse/YARN-2069
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Mayank Bansal
> Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch,
> YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch,
> YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch,
> YARN-2069-trunk-9.patch
>
>
> This is different from (even if related to, and likely to share code with)
> YARN-2113.
> YARN-2113 focuses on making sure that even if a queue has its guaranteed
> capacity, its individual users are treated in line with their limits
> irrespective of when they join in.
> This JIRA is about respecting user-limits while preempting containers to
> balance queue capacities.