[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621883#comment-14621883
 ] 

Peng Zhang commented on YARN-3453:
----------------------------------

I understood your thought. My suggestion is based on our practice: I found it 
confusing to use different policies in the queue configuration. For example, if the 
parent uses fair and the child uses drf, the child queue may end up with no resources 
on the CPU dimension, so jobs will hang there. So we use only drf in one cluster, and 
changed the code to support setting the calculator class at scheduler scope.
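To make the mismatch concrete, here is a minimal standalone sketch (simplified numbers and illustrative method names, not the actual YARN calculator classes) of how a memory-only starvation check and a dominant-share check can disagree for the same queue:

```java
// Hypothetical, simplified model of the two views (NOT the real
// DefaultResourceCalculator / DominantResourceCalculator classes):
// the memory-only view compares memory alone, while the DRF view
// compares the dominant share across both dimensions.
public class CalcSketch {
    // assumed cluster capacity: 100 GB / 100 cores
    static final double CLUSTER_MEM = 100, CLUSTER_CORES = 100;

    // memory-only view: starved whenever memory usage is below the share
    static boolean starvedByMemoryOnly(double usedMem, double shareMem) {
        return usedMem < shareMem;
    }

    // DRF view: starved only if the dominant usage is below the dominant share
    static boolean starvedByDominant(double usedMem, double usedCores,
                                     double shareMem, double shareCores) {
        double usedDominant  = Math.max(usedMem / CLUSTER_MEM, usedCores / CLUSTER_CORES);
        double shareDominant = Math.max(shareMem / CLUSTER_MEM, shareCores / CLUSTER_CORES);
        return usedDominant < shareDominant;
    }

    public static void main(String[] args) {
        // Queue uses 10 GB / 80 cores; its share is 40 GB / 40 cores.
        // Memory-only says "starved" (10 < 40) and triggers preemption,
        // but the dominant usage (0.8, cpu) already exceeds the share (0.4).
        System.out.println(starvedByMemoryOnly(10, 40));
        System.out.println(starvedByDominant(10, 80, 40, 40));
    }
}
```

This is exactly the shape of the bug in the issue: the memory-only check marks the queue starved and preempts, but under DRF the queue is already at or above its dominant share, so the preempted containers go to waste.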

After reviewing the above comments, I am reminded that the (0 GB, non-zero cores) 
case, like (non-zero GB, 0 cores), will also cause more resources to be preempted 
than necessary.

I mentioned before:
bq. To decrease this kind of waste, I want to find what ratio of the demand can be 
fulfilled by resourceUpperBound, and use this ratio * resourceUpperBound as 
targetResource.
Actually, the current implementation ignores the resource boundary of each 
requested container, so even with the above logic, there will still be some waste.
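The ratio idea could be sketched roughly like this (illustrative names and array shapes, not actual YARN code): take the largest fraction of the demand that fits under the upper bound in every dimension, and preempt only that scaled target instead of the full deficit:

```java
// Hedged sketch of the ratio idea from the comment above (illustrative,
// not YARN's implementation): resources are {memGB, cores} arrays.
public class RatioSketch {
    // Find the largest ratio r in [0, 1] such that r * demand fits under
    // upperBound in every dimension, and return r * demand as the target.
    static double[] targetResource(double[] demand, double[] upperBound) {
        double ratio = 1.0;
        for (int i = 0; i < demand.length; i++) {
            if (demand[i] > 0) {
                ratio = Math.min(ratio, upperBound[i] / demand[i]);
            }
        }
        double[] target = new double[demand.length];
        for (int i = 0; i < demand.length; i++) {
            target[i] = ratio * demand[i];
        }
        return target;
    }

    public static void main(String[] args) {
        // demand: 80 GB / 20 cores, but the queue may only grow by 40 GB / 20 cores;
        // ratio = min(40/80, 20/20) = 0.5, so the target is 40 GB / 10 cores
        double[] t = targetResource(new double[]{80, 20}, new double[]{40, 20});
        System.out.println(t[0] + " " + t[1]);
    }
}
```

As the comment notes, even this scaling ignores per-container boundaries: containers are preempted whole, so the freed resources can still overshoot the scaled target.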

As for YARN-2154, if we want to preempt only containers that can satisfy incoming 
requests, IMHO we should do preemption for each incoming request instead of summing 
them up with {{resourceDeficit}}.
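A per-request variant might look roughly like this (purely illustrative shapes and names, not YARN's API): each request claims only a running container whose shape actually satisfies it, so nothing unusable is preempted:

```java
import java.util.*;

// Hedged sketch of preempting per incoming request instead of against one
// aggregated deficit (illustrative names, not YARN's actual classes).
public class PerRequestSketch {
    record Res(int mem, int cores) {}

    // true if a container of shape c can host a request of shape r
    static boolean fits(Res r, Res c) {
        return c.mem() >= r.mem() && c.cores() >= r.cores();
    }

    // For each request, preempt the first running container that fits it;
    // requests with no fitting container preempt nothing.
    static List<Res> preemptFor(List<Res> requests, List<Res> running) {
        List<Res> preempted = new ArrayList<>();
        List<Res> pool = new ArrayList<>(running);
        for (Res req : requests) {
            for (Iterator<Res> it = pool.iterator(); it.hasNext();) {
                Res c = it.next();
                if (fits(req, c)) {
                    preempted.add(c);
                    it.remove();
                    break;
                }
            }
        }
        return preempted;
    }

    public static void main(String[] args) {
        // a memory-heavy request and a cpu-heavy request each claim the one
        // container that can actually host them; nothing extra is preempted
        List<Res> reqs = List.of(new Res(4, 1), new Res(0, 2));
        List<Res> running = List.of(new Res(8, 1), new Res(2, 4));
        System.out.println(preemptFor(reqs, running).size());
    }
}
```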

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, 
> YARN-3453.4.patch, YARN-3453.5.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> This basically results in more resources getting preempted than needed, and 
> those extra preempted containers don't even reach the "starved" queue, 
> since the scheduling logic is based on DRF's calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode: during a 
> preemption round, if preempting a few containers satisfies the need for one 
> resource type, then we should exit that preemption round, since the 
> containers we just preempted should bring the dominant resource usage up to 
> min/fair share.
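The early-exit idea in the last quoted paragraph could be sketched as follows (illustrative, not YARN code): stop a preemption round as soon as any dimension of the deficit is covered, since that already restores the dominant share:

```java
import java.util.*;

// Hedged sketch of the quoted early-exit idea (illustrative names, not
// YARN's implementation): deficit and containers are {memGB, cores} pairs.
public class EarlyExitSketch {
    // Preempt containers in order, stopping as soon as either dimension
    // of the deficit is satisfied; returns how many were preempted.
    static int containersToPreempt(int[] deficit, List<int[]> candidates) {
        int mem = deficit[0], cores = deficit[1];
        int count = 0;
        for (int[] c : candidates) {
            mem -= c[0];
            cores -= c[1];
            count++;
            if (mem <= 0 || cores <= 0) {
                break; // one resource type satisfied: exit the round
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // deficit 8 GB / 2 cores; each container frees 4 GB / 4 cores.
        // The first preemption already covers the 2-core deficit, so we stop
        // at 1 container, where a memory-only loop would have taken 2.
        List<int[]> running = List.of(new int[]{4, 4}, new int[]{4, 4});
        System.out.println(containersToPreempt(new int[]{8, 2}, running));
    }
}
```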



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
