Maysam Yabandeh commented on YARN-1969:

Thanks for the comment, [~kasha]. I think this is a good point, so let me
distinguish between the terms "deadline" and "endtime".

"deadline" would be the user-specified SLA and, as you correctly mentioned, in
many cases it is quite likely to be missed due to failures, limited resources,
etc. Still, the user can express the level of urgency through the desired
deadline; since they could also do that via priorities, the user-specified
deadline would be a complementary (and perhaps more expressive) way for users
to specify the priorities of their jobs.

"endtime", on the other hand, is the estimated end time of the job based on the 
current progress and assuming that the RM will give the rest of the required 
resources immediately. endtime is automatically computed by the AppMaster and 
there is no need for user involvement. When scheduling resources, the advantage 
of taking endtime into consideration is that the giant jobs that are close to 
be finished could be prioritized. We in general want to have such jobs finished 
sooner since (i) they would release the resources that they have occupied such 
as the disk space for the mappers' output, (ii) a large job is more susceptible 
to failures and the longer they are hanging around , the more is the likelihood 
of being affected by a loss of a mapper node.
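As a rough illustration of how the AppMaster could derive such an estimate
from current progress (the class, method, and formula below are assumptions
for this sketch, not existing YARN code), a simple linear extrapolation would
look like:

```java
// Hypothetical sketch: estimate a job's "endtime" by linearly extrapolating
// from the fraction of work already completed, assuming the RM grants all
// remaining resources immediately. Not an actual YARN class.
class EndtimeEstimator {
    /**
     * @param startTime job start, epoch millis
     * @param now       current time, epoch millis
     * @param progress  fraction of work completed, in (0, 1]
     * @return estimated finish time, epoch millis
     */
    static long estimateEndtime(long startTime, long now, float progress) {
        long elapsed = now - startTime;
        // Remaining work is assumed proportional to the remaining fraction.
        long remaining = (long) (elapsed * (1 - progress) / progress);
        return now + remaining;
    }
}
```

In practice the AppMaster already tracks per-task progress, so a
progress-weighted estimate (as TaskRuntimeEstimator does for speculation)
would be more accurate than this uniform extrapolation.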

The added subtasks follow the agenda of (i) estimating the end time, (ii)
sending it over to the RM, and (iii) letting the RM take it into
consideration. We can also extend the API to allow users to specify their
desired deadline. As for how the RM takes the specified deadline or estimated
endtime into consideration, I think once we have the "endtime" field available
in the RM, there will be many new opportunities to take advantage of it. One
way, as you mentioned, is to translate them into weights to be used by the
current fair scheduler. Any other scheduling algorithm, including EDF, can
also be plugged in and schedule based on a function of the endtime and other
variables, such as the size of the job, as discussed above.
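For instance, translating the endtime into a fair-scheduler weight could be as
simple as the sketch below (the formula and the scaling constant are
illustrative assumptions, not part of YARN):

```java
// Hypothetical sketch: map an estimated endtime to a fair-scheduler weight.
// Jobs whose estimated finish is nearer get a larger weight, so they are
// favored when new containers are offered.
class EndtimeWeight {
    static double weight(long nowMillis, long endtimeMillis) {
        // Clamp to avoid division by zero for overdue estimates.
        long remaining = Math.max(1L, endtimeMillis - nowMillis);
        // Inverse of remaining time: the sooner the estimated finish,
        // the higher the weight. The 1.0e6 scale is arbitrary.
        return 1.0e6 / remaining;
    }
}
```

A real weight function would likely also fold in the job's size, so that a
small job finishing soon does not outrank a giant job finishing slightly
later.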

> Fair Scheduler: Add policy for Earliest Deadline First
> ------------------------------------------------------
>                 Key: YARN-1969
>                 URL: https://issues.apache.org/jira/browse/YARN-1969
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
> What we are observing is that some big jobs with many allocated containers 
> are waiting for a few containers to finish. Under *fair-share scheduling* 
> however they have a low priority since there are other jobs (usually much 
> smaller newcomers) that are using resources well below their fair share; 
> hence newly released containers are not offered to the big, yet 
> close-to-being-finished, job. Nevertheless, everybody would benefit from an 
> "unfair" scheduling that offers the resource to the big job since the sooner 
> the big job finishes, the sooner it releases its "many" allocated resources 
> to be used by other jobs. In other words, what we require is a kind of 
> variation of *Earliest Deadline First scheduling*, that takes into account 
> the number of already-allocated resources and estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
> For example, if a job is using MEM GB of memory and is expected to finish in 
> TIME minutes, the priority in scheduling would be a function p of (MEM, 
> TIME). The expected time to finish can be estimated by the AppMaster using 
> TaskRuntimeEstimator#estimatedRuntime and be supplied to RM in the resource 
> request messages. To be less susceptible to the issue of apps gaming the 
> system, we can have this scheduling limited to *only within a queue*: i.e., 
> adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting 
> the queues use it by setting the "schedulingPolicy" field.
