[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-3453:
------------------------------
    Attachment: YARN-3453.3.patch

Uploading an updated patch:

* Added unit tests
* Cleaned up code based on review comments

[~kasha],
bq. Nit: In each of the policies, my preference would be to not make the 
calculator and comparator members static unless required. We have had cases 
where our tests would invoke multiple instances of the class leading to issues. 
Not that I foresee multiple instantiations for these classes, but would like to 
avoid it if we can.
If it is OK with you, I feel we should in fact make them static. In my opinion 
the code reads better and is cleaner and more efficient, since only one 
instance is ever created. We are always free to override the 
getComparator/Calculator methods in tests (and in possible subclasses).
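
To make the trade-off concrete, here is a minimal sketch of the pattern I have in mind. The class names (PolicySketch, Policy, ReversedPolicy) and the Integer comparator are hypothetical stand-ins for illustration only, not the actual policy classes in the patch.

```java
import java.util.Comparator;

public class PolicySketch {
    // Hypothetical stand-in for a scheduling policy; not the real Hadoop class.
    static class Policy {
        // One shared, stateless comparator for every instance of the policy.
        private static final Comparator<Integer> COMPARATOR = Comparator.naturalOrder();

        // Callers always go through the getter, so a test or subclass can
        // substitute its own comparator without touching the static field.
        public Comparator<Integer> getComparator() {
            return COMPARATOR;
        }

        // Returns whichever argument the comparator orders first.
        public int pickFirst(int a, int b) {
            return getComparator().compare(a, b) <= 0 ? a : b;
        }
    }

    // Test-style subclass that overrides the getter.
    static class ReversedPolicy extends Policy {
        @Override
        public Comparator<Integer> getComparator() {
            return Comparator.reverseOrder();
        }
    }

    public static void main(String[] args) {
        System.out.println(new Policy().pickFirst(3, 7));         // prints 3
        System.out.println(new ReversedPolicy().pickFirst(3, 7)); // prints 7
    }
}
```

Because the shared comparator holds no state, multiple instances of the policy in a test should not interfere with each other through it.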

bq.  .. think we will have to fix YARN-2154 too.
On further thought, and after consulting [~kasha], I think we can 
decouple this from that JIRA, given its larger scope.



> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch
>
>
> There are two places in the preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> This results in more resources being preempted than needed, and 
> those extra preempted containers don’t even go to the “starved” queue, 
> since the scheduling logic is based on DRF's calculator.
> The two places are:
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if its dominant resource usage
> is >= its fair/min share.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode: during a 
> preemption round, if preempting a few containers satisfies the needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> the min/fair share.
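
The isStarved point in the description above can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the Res class, the helper names, and the cluster and queue numbers are hypothetical and are not taken from FSLeafQueue or the Hadoop resource calculators. The sketch contrasts a memory-only comparison with a comparison of dominant shares.

```java
public class DrfStarvationSketch {
    // Hypothetical two-dimensional resource; not Hadoop's Resource class.
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    // Largest fraction of the cluster that r occupies across resource types.
    static double dominantShare(Res r, Res cluster) {
        double mem = (double) r.memoryMb / cluster.memoryMb;
        double cpu = (double) r.vcores / cluster.vcores;
        return Math.max(mem, cpu);
    }

    // Memory-only check, in the spirit of a single-resource calculator.
    static boolean isStarvedMemoryOnly(Res usage, Res share) {
        return usage.memoryMb < share.memoryMb;
    }

    // Dominant-share check: starved only if the dominant usage is below
    // the dominant share the queue is entitled to.
    static boolean isStarvedDrf(Res usage, Res share, Res cluster) {
        return dominantShare(usage, cluster) < dominantShare(share, cluster);
    }

    public static void main(String[] args) {
        Res cluster = new Res(102400, 100); // assumed cluster: 100 GB, 100 vcores
        Res share = new Res(10240, 10);     // assumed fair share: 10 GB, 10 vcores
        Res usage = new Res(5120, 20);      // queue is CPU-heavy: 5 GB, 20 vcores

        System.out.println(isStarvedMemoryOnly(usage, share)); // prints true
        System.out.println(isStarvedDrf(usage, share, cluster)); // prints false
    }
}
```

With these assumed numbers, the memory-only check flags the queue as starved even though its dominant resource (CPU) is already at twice its share, which is the kind of mismatch that can trigger unnecessary preemption.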



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
