[jira] [Commented] (YARN-2069) CS queue level preemption should respect user-limits

Wangda Tan (JIRA) Sun, 03 Aug 2014 21:08:02 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084279#comment-14084279
 ]


Wangda Tan commented on YARN-2069:
----------------------------------

Hi [~mayank_bansal],
Thanks for your patience.

I've just read through your new patch. 

After #1/#2, if more resource still needs to be preempted, AM containers will 
be preempted. Is that correct? Please let me know if I misread your approach.

*I think we should discuss the scope of this JIRA first. I'm a little confused 
after thinking about it.*

According to the description of this JIRA, we need to make sure of the 
following (assume we have already calculated {{target-user-limit}}):
*REQ #1:* When considering preempting a container from user-x, if 
{{used-resource - marked-preempted-resource}} of user-x is already <= 
{{target-user-limit}}, we need to make sure that no other user in the queue has 
{{used-resource - marked-preempted-resource}} > {{target-user-limit}}.
*REQ #2:* When we have to preempt an AM container, we need to make sure REQ #1 
holds too.
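
REQ #1 can be read as a simple predicate. The following is a minimal sketch in Python (for brevity, not the scheduler's Java); the function name, parameter names, and the dictionary-based bookkeeping are hypothetical and not taken from the patch. Resources are plain numbers, e.g. percent of the cluster.

```python
def may_preempt_from(user, used, marked, target_user_limit):
    """Return True if REQ #1 allows preempting a container from `user`.

    used:   hypothetical map of user -> used-resource
    marked: hypothetical map of user -> marked-preempted-resource
    """
    # Resource each user would still hold after the already-marked preemptions.
    remaining = {u: used[u] - marked.get(u, 0) for u in used}

    # A user still above the target can always be preempted.
    if remaining[user] > target_user_limit:
        return True

    # Otherwise, preempting from this user is only acceptable when no other
    # user in the queue is still above the target.
    return all(r <= target_user_limit
               for u, r in remaining.items() if u != user)
```

For example, with {{used = {"a": 15, "b": 6}}} and {{target-user-limit = 8}}, the check rejects preempting from user b while user a is still at 15, and accepts it once enough of user a's containers have been marked.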

And as commented by [~vinodkv]: 
https://issues.apache.org/jira/browse/YARN-2069?focusedCommentId=14064047&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14064047.
*REQ #3:* Each user's resource usage after preemption should be as balanced as 
possible around {{target-user-limit}}.

Do you agree with these requirements? I think we should add them to the JIRA 
description once we have agreed on them.

*My understanding is that your new patch consists of two phases:*
1. {{distributePreemptionforUsers}} does preemption to enforce 
{{target-user-limit}} for each user.
2. If more resource still needs to be preempted, 
{{distributePreemptionforUsers}} is called again so that {{resToObtain}} is 
distributed evenly: each user gives up {{resToObtain}} divided by 
{{#active-user}} in the queue.

I think phase-1 can enforce REQ#1, but phase-2 cannot enforce REQ#3. Also, 
REQ#2 is not satisfied by the patch.

Let me give you an example of why REQ#3 is not satisfied, similar to Vinod's 
example:
{code}
The queue has guaranteed resource = 30%, it currently uses 60%, and we want to 
shrink it down to 40%.
Container sizes are equal: each is 3% of the cluster.
There are 5 apps in the queue, with user-limit configured to 20%. So the 
expected resources are
{8%, 8%, 8%, 8%, 8%}.

Before preemption:
{15%, 9%, 12%, 12%, 12%}

A possible result after preemption with your current approach:
{15%, 6%, 6%, 6%, 6%} (total is 39%)
{code}

Sometimes we cannot get every user's resource exactly equal to 
{{target-user-limit}}, because {{target-user-limit}} may not be a multiple of 
the container size. But we can do better, as in the following example:
{code}
After preemption:
{9%, 9%, 9%, 6%, 6%} (total is 39%)
{code}
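
One way to arrive at a distribution like the second one is to preempt a single container at a time, always from the user who currently holds the most. The Python sketch below only illustrates that idea on the numbers from the example above; it is not the patch's algorithm, and the function names are hypothetical. All values are integer percentages of the cluster.

```python
def preempt_balanced(used, container, target_total):
    """Greedy sketch: take one container at a time from the user with the
    highest usage until the queue total is at or below target_total."""
    used = list(used)
    while sum(used) > target_total:
        # Pick the user currently holding the most resource.
        i = max(range(len(used)), key=lambda k: used[k])
        if used[i] < container:
            break  # nobody has a whole container left to give up
        used[i] -= container
    return used


def max_deviation(used, target_user_limit):
    """Largest distance of any user's usage from the target user limit."""
    return max(abs(u - target_user_limit) for u in used)


before = [15, 9, 12, 12, 12]
after = preempt_balanced(before, container=3, target_total=40)

print(sorted(after, reverse=True))          # [9, 9, 9, 6, 6]
print(max_deviation(after, 8))              # 2
print(max_deviation([15, 6, 6, 6, 6], 8))   # 7
```

Both results total 39%, but the greedy result keeps every user within 2% of the expected 8%, while the first result leaves one user 7% above it.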

The imbalance is caused by the accumulated bias I mentioned in my comment: 
https://issues.apache.org/jira/browse/YARN-2069?focusedCommentId=14074249&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14074249


Thanks,
Wangda

> CS queue level preemption should respect user-limits
> ----------------------------------------------------
>
>                 Key: YARN-2069
>                 URL: https://issues.apache.org/jira/browse/YARN-2069
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Mayank Bansal
>         Attachments: YARN-2069-trunk-1.patch, YARN-2069-trunk-2.patch, 
> YARN-2069-trunk-3.patch, YARN-2069-trunk-4.patch, YARN-2069-trunk-5.patch, 
> YARN-2069-trunk-6.patch, YARN-2069-trunk-7.patch, YARN-2069-trunk-8.patch, 
> YARN-2069-trunk-9.patch
>
>
> This is different from (even if related to, and likely share code with) 
> YARN-2113.
> YARN-2113 focuses on making sure that even if queue has its guaranteed 
> capacity, its individual users are treated in-line with their limits 
> irrespective of when they join in.
> This JIRA is about respecting user-limits while preempting containers to 
> balance queue capacities.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
