[ 
https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971707#comment-15971707
 ] 

Eric Payne edited comment on YARN-2113 at 4/17/17 9:58 PM:
-----------------------------------------------------------

[~leftnoteasy], it looks good in general, but I did discover a corner case that 
causes unnecessary preemption: a container is killed, the freed resource is 
given back to the application it was taken from, and the container is killed 
again ("flapping").

The following code returns true if {{user.used > user.userLimit}}:
{code:title=TempUserPerPartition#isUserLimitReached}
  public boolean isUserLimitReached(ResourceCalculator rc,
      Resource clusterResource) {
    if (Resources.greaterThan(rc, clusterResource, getUsedDeductAM(),
        userLimit)) {
      return true;
    }
    return false;
  }
{code}
The Capacity Scheduler algorithm allows a user to be assigned one container 
more than the user limit. So, if
   - {{user1 / app1}} is 1 container above its user limit
   - {{user2 / app2}} is below its user limit

the above {{isUserLimitReached}} method will return true for {{user1}}, 1 
container will be preempted from {{app1}}, {{user1}} will fall back to its user 
limit, and the Capacity Scheduler will give the container back to {{app1}}. 
That puts {{user1}} 1 container above its user limit again, so the cycle 
repeats.
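
To make the cycle concrete, here is a minimal toy model of it. It is only a sketch under stated assumptions, not the actual preemption or scheduler code: plain container counts stand in for {{Resource}} objects, the class and method names ({{FlappingSketch}}, {{schedulerWouldAssign}}, {{simulate}}) are hypothetical, {{user1}} is assumed to always have pending demand, and the freed container is assumed to go back to {{app1}} as described above.

```java
// Toy model of the flapping scenario; container counts stand in for Resource objects.
// All names here are hypothetical and are not part of the Hadoop code base.
class FlappingSketch {

  // Mirrors the strict check quoted above: over the limit only if used > userLimit.
  static boolean isUserLimitReached(int used, int userLimit) {
    return used > userLimit;
  }

  // Assumed scheduler rule: a user at or below the limit may receive one more
  // container, so usage can end up one container above the limit.
  static boolean schedulerWouldAssign(int used, int userLimit) {
    return used <= userLimit;
  }

  // Runs alternating preemption and scheduling passes and returns the number of
  // containers preempted from user1 (assumed to always have pending demand).
  static int simulate(int userLimit, int initialUsed, int rounds) {
    int used = initialUsed;
    int preempted = 0;
    for (int i = 0; i < rounds; i++) {
      if (isUserLimitReached(used, userLimit)) {
        used--;       // preemption kills one container
        preempted++;
      }
      if (schedulerWouldAssign(used, userLimit)) {
        used++;       // scheduler hands the container straight back
      }
    }
    return preempted;
  }

  public static void main(String[] args) {
    // user limit of 5 containers, user1 starts 1 container above it
    System.out.println("preempted=" + simulate(5, 6, 3));
  }
}
```

With a user limit of 5 and {{user1}} starting at 6 containers, every round in this model preempts one container and then reassigns it, so the number of preemptions grows with the number of rounds while the usage never settles.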

I can reproduce this regularly. Do you think this is enough of a corner case 
that we can address it in a separate JIRA?


> Add cross-user preemption within CapacityScheduler's leaf-queue
> ---------------------------------------------------------------
>
>                 Key: YARN-2113
>                 URL: https://issues.apache.org/jira/browse/YARN-2113
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Sunil G
>         Attachments: 
> TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt,
>  YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, 
> YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, 
> YARN-2113.0007.patch, YARN-2113.v0.patch
>
>
> Preemption today only works across queues and moves around resources across 
> queues per demand and usage. We should also have user-level preemption within 
> a queue, to balance capacity across users in a predictable manner.



