[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621883#comment-14621883 ]
Peng Zhang commented on YARN-3453:
----------------------------------

I understand your thought. My suggestion is based on our practice: I found it confusing to use different policies in the queue configuration. For example, if a parent queue uses fair while a child uses drf, the child queue may end up with no resources on the CPU dimension, so jobs will hang there. So we use only drf in one cluster, and changed the code to support setting the calculator class at scheduler scope.

After reviewing the comments above, I am reminded that the case (0 GB, non-zero cores), like (non-zero GB, 0 cores), will also cause more resources to be preempted than necessary. As I mentioned before:

bq. To decrease this kind of waste, I want to find what ratio of the demand can be fulfilled by resourceUpperBound, and use this ratio * resourceUpperBound as targetResource.

Actually, the current implementation ignores the resource boundary of each requested container, so even with the above logic it will still have some waste.

As for YARN-2154, if we want to preempt only containers that can satisfy incoming requests, IMHO we should do preemption for each incoming request instead of summing them up with {{resourceDeficit}}.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, YARN-3453.4.patch, YARN-3453.5.patch
>
>
> There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode.
> This basically results in more resources being preempted than needed, and those extra preempted containers don't even get to the "starved" queue, since the scheduling logic is based on DRF's calculator. The two places are:
> 1.
> {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn't be marked as "starved" if its dominant resource usage is >= fair/minshare.
> 2.
> {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode: during a preemption round, if preempting a few containers satisfies the need for one resource type, then we should exit that preemption round, since the containers we just preempted should bring the dominant resource usage up to min/fair share.

-- 
This message was sent by Atlassian JIRA (v6.3.4#6332)
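To make the thrashing scenario concrete, the following is a minimal standalone sketch (hypothetical class and method names, not Hadoop's actual code) contrasting a memory-only starvation check, which is roughly what a DefaultResourceCalculator-based {{isStarved}} amounts to, with a DRF-style check that compares the dominant resource's usage against the share:

```java
// Hypothetical sketch; Resource, isStarvedDefault, and isStarvedDrf are
// stand-ins, not YARN's real classes or signatures.
public class DrfStarvationSketch {
    // Minimal stand-in for a YARN-like resource vector (memory MB, vcores).
    record Resource(long memoryMb, int vcores) {}

    // Default-calculator style check: considers memory only.
    static boolean isStarvedDefault(Resource usage, Resource share) {
        return usage.memoryMb() < share.memoryMb();
    }

    // DRF-style check: a queue is starved only if its *dominant* resource
    // usage (the larger cluster-relative share) is below the given share.
    static boolean isStarvedDrf(Resource usage, Resource share, Resource cluster) {
        double memShare = (double) usage.memoryMb() / cluster.memoryMb();
        double cpuShare = (double) usage.vcores() / cluster.vcores();
        if (memShare >= cpuShare) {
            return usage.memoryMb() < share.memoryMb();
        }
        return usage.vcores() < share.vcores();
    }

    public static void main(String[] args) {
        Resource cluster   = new Resource(100_000, 100);
        Resource fairShare = new Resource(50_000, 50);
        // CPU-dominant queue: low memory usage, but already at its vcore share.
        Resource usage     = new Resource(10_000, 50);

        // Memory-only check flags the queue as starved and triggers
        // preemption, even though its dominant resource is at fair share.
        System.out.println(isStarvedDefault(usage, fairShare)); // true
        // DRF check sees the dominant (CPU) usage at share: not starved.
        System.out.println(isStarvedDrf(usage, fairShare, cluster)); // false
    }
}
```

In this example the memory-only check would keep preempting on behalf of a queue whose dominant resource is already satisfied, which is exactly the wasted preemption described above.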