[ 
https://issues.apache.org/jira/browse/YARN-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291868#comment-16291868
 ] 

Konstantinos Karanasos commented on YARN-7457:
----------------------------------------------

Hi guys, I was just looking into this JIRA and I wanted to share some thoughts.

First, I agree that the way we specify delay should be decoupled from the 
scheduler implementation. So, there should be a pluggable policy that specifies 
whether the scheduler should, say, relax locality based on missed opportunities 
or based on time.
I also see the need for each application to specify how "urgent" its requests 
are. That is, one app might want to relax locality after 1 sec and another 
might want to relax after 1 min.

What I do not see are use cases where one application wants to wait for 5 
missed opportunities and another wants to wait for 10 secs.
I am not even sure what the semantics of such a placement would be for the 
scheduler.
More importantly, missed opportunities are an implementation detail of the way 
our schedulers work today, and something that I think should not be surfaced 
to applications. Missed opportunities do not mean anything to apps, and in 
fact, like you say, the capacity scheduler interprets them differently from the 
fair scheduler, which is yet another reason it should not be surfaced. On the 
other hand, time is much more tangible and we could surface that.

To sum up, I think each application should specify its urgency of getting a 
specific container, but there should be a common way to specify that urgency 
across all applications, rather than having some specify missed opportunities 
and others specify time. We can have a scheduler-wide pluggable policy 
determining if the "urgency" will be time or missed opportunities, and then 
have applications give a number, specifying how urgent their demand is. In a 
first cut, this number can be directly translated to seconds or missed 
opportunities by our schedulers.
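
To make the summary above a bit more concrete, here is a minimal Java sketch of 
what such a split could look like. It is only an illustration under my own 
assumptions: DelayPolicy, TimeBasedDelayPolicy, MissedOpportunityDelayPolicy and 
the "urgency" parameter are hypothetical names, not existing YARN classes or 
APIs, and I assume a smaller urgency number means "relax locality sooner".

```java
// Hypothetical sketch: none of these types exist in YARN today.

/** Scheduler-wide pluggable policy that interprets an app-supplied urgency number. */
interface DelayPolicy {
    /**
     * @param urgency             number given by the application (assumed: smaller = relax sooner)
     * @param waitedMillis        how long the request has been pending
     * @param missedOpportunities scheduling opportunities skipped so far for this request
     * @return true once the scheduler may relax locality for the request
     */
    boolean shouldRelaxLocality(int urgency, long waitedMillis, int missedOpportunities);
}

/** First-cut translation: the urgency number is read directly as seconds. */
final class TimeBasedDelayPolicy implements DelayPolicy {
    @Override
    public boolean shouldRelaxLocality(int urgency, long waitedMillis, int missedOpportunities) {
        return waitedMillis >= urgency * 1000L;
    }
}

/** First-cut translation: the urgency number is read directly as missed opportunities. */
final class MissedOpportunityDelayPolicy implements DelayPolicy {
    @Override
    public boolean shouldRelaxLocality(int urgency, long waitedMillis, int missedOpportunities) {
        return missedOpportunities >= urgency;
    }
}
```

The point of the sketch is that the application only ever supplies the number; 
whether it ends up meaning seconds or missed opportunities is decided once, by 
whichever policy the scheduler is configured with.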

Looping in [~subru] -- I just ran the above ideas by him. He also mentioned 
that we should not surface missed opportunities any further. Moreover, he took 
the notion of urgency I mentioned above a step further, saying that ultimately 
we want applications to specify the utility of their requests as a cost function.

Thoughts?

> Delay scheduling should be an individual policy instead of part of scheduler 
> implementation
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-7457
>                 URL: https://issues.apache.org/jira/browse/YARN-7457
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Tao Yang
>
> Currently, different schedulers have slightly different delay scheduling 
> implementations. Ideally we should make delay scheduling independent from 
> scheduler implementation. Benefits of doing this:
> 1) Applications can choose which delay scheduling policy to use: it could be 
> time-based, missed-opportunity-based, or any new delay scheduling policy 
> supported by the cluster. Today it is a global scheduler config.
> 2) Make scheduler implementations simpler and reusable.
> Running design doc: 
> https://docs.google.com/document/d/1rY-CJPLbGk3Xj_8sxre61y2YkHJFK8oqKOshro1ZY3A/edit#heading=h.xnzvh9nn283a



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
