[ 
https://issues.apache.org/jira/browse/YARN-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588827#comment-14588827
 ] 

Wangda Tan commented on YARN-3806:
----------------------------------

Hi [~wshao],
Thanks for sharing your thoughts on this. I took a quick look at the attached
design doc and have some comments (please correct me if I missed anything).

The JIRA wants to tackle the following issues:
# Pluggable preemption policy
# Be able to add other allocation policies
# Application level configuration 
# Decouple application / nodes from scheduler

#1/#2 should already be covered by YARN-3306. Its design doc doesn't include a
detailed ParentQueue policy, but we plan to extend it to ParentQueue, as
mentioned in YARN-3306. I also found that "preemptResource" and
"acquireResource" in your design are very close to what we have in CS. If you
have time, could you take a look at YARN-3318 (which is already committed)? The
ordering policy is part of the queue policy; is that what you were trying to do?
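
To make the "ordering policy is part of queue policy" idea concrete, here is a
rough sketch of a queue that delegates the ordering of its apps to a pluggable
policy. This is only an illustration under assumptions: AppOrderingPolicy,
FifoPolicy, LeastUsagePolicy, App and Queue are hypothetical names made up for
this example, not the classes added by YARN-3318 or proposed in YARN-3306.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these types are real YARN classes.
class OrderingPolicySketch {

    /** Minimal stand-in for an application running inside a queue. */
    static final class App {
        final String id;
        final long submitTime;   // lower value = submitted earlier
        final int usedMemoryMb;  // resources the app currently holds

        App(String id, long submitTime, int usedMemoryMb) {
            this.id = id;
            this.submitTime = submitTime;
            this.usedMemoryMb = usedMemoryMb;
        }
    }

    /** The pluggable piece: decides which app is offered resources first. */
    interface AppOrderingPolicy {
        Comparator<App> comparator();
    }

    /** Earliest-submitted app first. */
    static final class FifoPolicy implements AppOrderingPolicy {
        public Comparator<App> comparator() {
            return Comparator.comparingLong((App a) -> a.submitTime);
        }
    }

    /** App holding the least memory first (a crude fairness stand-in). */
    static final class LeastUsagePolicy implements AppOrderingPolicy {
        public Comparator<App> comparator() {
            return Comparator.comparingInt((App a) -> a.usedMemoryMb);
        }
    }

    /** A queue that knows nothing about ordering beyond its policy. */
    static final class Queue {
        private final List<App> apps = new ArrayList<>();
        private final AppOrderingPolicy policy;

        Queue(AppOrderingPolicy policy) {
            this.policy = policy;
        }

        void submit(App app) {
            apps.add(app);
        }

        /** App ids in the order the queue would offer them resources. */
        List<String> allocationOrder() {
            List<App> sorted = new ArrayList<>(apps);
            sorted.sort(policy.comparator());
            List<String> ids = new ArrayList<>();
            for (App a : sorted) {
                ids.add(a.id);
            }
            return ids;
        }
    }

    /** Same two apps, ordered by whichever policy is plugged in. */
    static List<String> demo(AppOrderingPolicy policy) {
        Queue q = new Queue(policy);
        q.submit(new App("app-1", 100L, 4096));
        q.submit(new App("app-2", 200L, 1024));
        return q.allocationOrder();
    }

    public static void main(String[] args) {
        System.out.println(demo(new FifoPolicy()));       // [app-1, app-2]
        System.out.println(demo(new LeastUsagePolicy())); // [app-2, app-1]
    }
}
```

Swapping the policy changes the allocation order without touching the queue
itself, which is the property a per-queue policy framework is after.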

#3: I'm not sure this is a valid use case. I can understand an admin setting a
"maximum-limit" for an application, but setting a minimum share per app doesn't
sound like a fair allocation.

#4: We now have a common abstraction of application
(SchedulerApplicationAttempt) and node (SchedulerNode) shared by the different
scheduler implementations. Are you suggesting eliminating scheduler-specific
implementations such as FiCaSchedulerApp/FiCaSchedulerNode? I think that might
be problematic; you can think of the app as a pluggable implementation too. For
instance, FS and CS have different app-level logic, such as how to do delayed
scheduling, limits, etc.
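
As a rough illustration of why the app itself can be seen as pluggable, the
sketch below keeps shared bookkeeping in a common base class and leaves the
delayed-scheduling decision to subclasses. All names here (BaseAppAttempt,
CountBasedApp, TimeBasedApp, canRelaxLocality) are hypothetical and exist only
for this example; the two rules are made up and are not meant to describe what
FS or CS actually do.

```java
// Hypothetical sketch: these are not the real YARN scheduler classes.
class AppAbstractionSketch {

    /** State and bookkeeping that every scheduler implementation could share. */
    static abstract class BaseAppAttempt {
        final String appId;
        private int missedOpportunities = 0;

        BaseAppAttempt(String appId) {
            this.appId = appId;
        }

        /** Called when a node was offered but was not a preferred location. */
        void recordMissedOpportunity() {
            missedOpportunities++;
        }

        int missedOpportunities() {
            return missedOpportunities;
        }

        /**
         * Scheduler-specific part: may this app now accept a container on a
         * non-preferred node, given how long it has waited?
         */
        abstract boolean canRelaxLocality(long waitedMs);
    }

    /** Made-up rule: relax locality after a fixed number of missed offers. */
    static final class CountBasedApp extends BaseAppAttempt {
        private final int threshold;

        CountBasedApp(String appId, int threshold) {
            super(appId);
            this.threshold = threshold;
        }

        @Override
        boolean canRelaxLocality(long waitedMs) {
            return missedOpportunities() >= threshold;
        }
    }

    /** Made-up rule: relax locality once the app has waited long enough. */
    static final class TimeBasedApp extends BaseAppAttempt {
        private final long maxWaitMs;

        TimeBasedApp(String appId, long maxWaitMs) {
            super(appId);
            this.maxWaitMs = maxWaitMs;
        }

        @Override
        boolean canRelaxLocality(long waitedMs) {
            return waitedMs >= maxWaitMs;
        }
    }

    public static void main(String[] args) {
        BaseAppAttempt countBased = new CountBasedApp("app-1", 2);
        countBased.recordMissedOpportunity();
        countBased.recordMissedOpportunity();
        System.out.println(countBased.canRelaxLocality(0L));

        BaseAppAttempt timeBased = new TimeBasedApp("app-2", 500L);
        System.out.println(timeBased.canRelaxLocality(100L));
    }
}
```

The scheduler core only sees BaseAppAttempt, while each implementation keeps
its own app-level logic, which is hard to preserve if the scheduler-specific
app classes are removed entirely.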

On details in the design doc:
- Each LeafQueue can run at most one app, which doesn't seem like a "queue".
This is very restrictive: for example, if there's a requirement to run 1k apps
at the same time, do you need to configure 1k LeafQueues? And how do you choose
where to submit an application?
- YARN-2986 is trying to create a unified view of scheduler configuration. Is
there any overlap between YARN-2986 and the configuration model mentioned in
your doc?

Thoughts?

> Proposal of Generic Scheduling Framework for YARN
> -------------------------------------------------
>
>                 Key: YARN-3806
>                 URL: https://issues.apache.org/jira/browse/YARN-3806
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>            Reporter: Wei Shao
>         Attachments: ProposalOfGenericSchedulingFrameworkForYARN-V1.0.pdf, 
> ProposalOfGenericSchedulingFrameworkForYARN-V1.1.pdf
>
>
> Currently, a typical YARN cluster runs many different kinds of applications: 
> production applications, ad hoc user applications, long running services and 
> so on. Different YARN scheduling policies may be suitable for different 
> applications. For example, capacity scheduling can manage production
> applications well, since applications get a guaranteed resource share, while
> fair scheduling can manage ad hoc user applications well, since it can enforce
> fairness among users. However, the current YARN scheduling framework doesn’t
> have a mechanism for multiple scheduling policies to work hierarchically in
> one cluster.
> YARN-3306 talked about many issues of today’s YARN scheduling framework, and 
> proposed a per-queue policy driven framework. In detail, it supported 
> different scheduling policies for leaf queues. However, support of different 
> scheduling policies for upper level queues is not seriously considered yet. 
> A generic scheduling framework is proposed here to address these limitations. 
> It supports different policies for any queue consistently. The proposal tries 
> to solve many other issues in current YARN scheduling framework as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
