Ashwin Shankar commented on YARN-3453:

hey Arun,
Thanks for working on this!
A couple more comments, in addition to Karthik's:
1. Why are we not using componentwiseMin here?
{code}
Resource target = Resources.min(calc, clusterResource,
    sched.getMinShare(), sched.getDemand());
{code}
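For reference, a minimal standalone sketch of the difference (using a hypothetical two-field stand-in for Resource, not the actual YARN classes): componentwiseMin caps each resource type independently, while Resources.min picks one operand in its entirety based on the calculator's comparison.

```java
// Hypothetical stand-in for YARN's Resource (memory in GB, vcores),
// only to illustrate component-wise min vs. a single-winner min.
final class Res {
    final long memory;
    final int vcores;
    Res(long memory, int vcores) { this.memory = memory; this.vcores = vcores; }

    // Component-wise min: smaller value per resource type,
    // analogous to Resources.componentwiseMin(lhs, rhs).
    static Res componentwiseMin(Res a, Res b) {
        return new Res(Math.min(a.memory, b.memory), Math.min(a.vcores, b.vcores));
    }

    // "Winner-takes-all" min, as Resources.min(calc, cluster, a, b) does:
    // the calculator picks one of the two operands whole. With
    // DefaultResourceCalculator the comparison looks at memory only.
    static Res minByMemory(Res a, Res b) {
        return a.memory <= b.memory ? a : b;
    }
}

public class MinDemo {
    public static void main(String[] args) {
        Res minShare = new Res(10, 50);  // low memory, high cores
        Res demand   = new Res(40, 5);   // high memory, low cores

        Res cw = Res.componentwiseMin(minShare, demand);
        Res mm = Res.minByMemory(minShare, demand);

        // componentwiseMin caps BOTH dimensions: (10, 5).
        System.out.println(cw.memory + "," + cw.vcores);  // 10,5
        // Resources.min returns one operand whole: (10, 50) here, so the
        // target can carry 50 cores even though demand is only 5 cores.
        System.out.println(mm.memory + "," + mm.vcores);  // 10,50
    }
}
```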

2. FairScheduler.preemptResources() uses DefaultResourceCalculator and hence looks only at memory.
This could lead to a problem in the following scenario:
Preemption round 0: toPreempt = (100G, 10 cores)
...<we preempt 10 containers of (4G, 1 core)>
Preemption round 10: toPreempt = (60G, 0 cores)

In round 10 we've satisfied all the cores, but the current implementation, since it's based on DefaultResourceCalculator, would continue to preempt to satisfy the remaining 60G as well even in DRF mode, which means we just preempted more cores than we had to.
Making this calculator DRF wouldn't solve the problem either, since we would then be preempting less than what is necessary.
The root of this problem is YARN-2154. I'll leave it up to you to decide what you want to do about this in this jira.
{code}
while (Resources.greaterThan(RESOURCE_CALCULATOR, clusterResource,
    toPreempt, Resources.none()))
{code}
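To make the loop issue concrete, here's a small standalone sketch (hypothetical stand-ins, not the actual YARN classes; one container of (4G, 1 core) per iteration rather than ten) of how a memory-only "toPreempt greater than none" check keeps the loop alive after cores are already satisfied:

```java
public class PreemptLoopDemo {
    // Memory-only comparison, like greaterThan with DefaultResourceCalculator:
    // (60G, 0 cores) still counts as "greater than none".
    static boolean greaterThanNoneMemoryOnly(long memGb, int vcores) {
        return memGb > 0;
    }

    // A check that stops once ANY resource type is fully satisfied, which is
    // roughly what the comment above argues should happen in DRF mode.
    static boolean greaterThanNoneAllComponents(long memGb, int vcores) {
        return memGb > 0 && vcores > 0;
    }

    public static void main(String[] args) {
        long memGb = 100;   // toPreempt memory at the start
        int vcores = 10;    // toPreempt cores at the start
        int rounds = 0;
        // Each iteration preempts one container of (4G, 1 core).
        while (greaterThanNoneMemoryOnly(memGb, vcores)) {
            memGb -= 4;
            vcores = Math.max(0, vcores - 1);
            rounds++;
        }
        // The memory-only check keeps preempting until all 100G is gone
        // (25 iterations), even though cores were satisfied after 10.
        System.out.println(rounds);  // 25
    }
}
```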

3. Unit tests need to be added.

bq. Looking at the remaining uses of DefaultResourceCalculator in FairScheduler, we could benefit from updating all of them to DominantResourceCalculator? Ashwin Shankar - do you concur?

Overall I see it would be beneficial. But I'm not so sure the callers of FairScheduler.getResourceCalculator would be okay with always getting DominantResourceCalculator. I see it's mostly called by the Fair Reservation System feature.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> ------------------------------------------------------------------------------------------------------------
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch
> There are two places in the preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode. This basically results in more resources getting 
> preempted than needed, and those extra preempted containers aren’t even getting 
> to the “starved” queue, since the scheduling logic is based on DRF's calculator.
> Following are the two places:
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode: during a 
> preemption round, if preempting a few containers satisfies the needs of a 
> resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.

This message was sent by Atlassian JIRA