[ https://issues.apache.org/jira/browse/YARN-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588827#comment-14588827 ]
Wangda Tan commented on YARN-3806:
----------------------------------

Hi [~wshao],

Thanks for sharing your thoughts on this. I took a quick look at the attached design doc and have some comments (please correct me if I missed anything).

The JIRA wants to tackle the following issues:
# Pluggable preemption policy
# Ability to add other allocation policies
# Application-level configuration
# Decoupling applications / nodes from the scheduler

#1/#2 should already be covered by YARN-3306. Its design doc doesn't include a detailed ParentQueue policy, but we plan to extend it to ParentQueue, as mentioned in YARN-3306. I also found that "preemptResource" and "acquireResource" in your design are very close to what we have in CS. If you have time, could you take a look at YARN-3318 (which is already committed)? Ordering policy is part of queue policy; is that what you were trying to do?

#3: I'm not sure this is a valid use case. I can understand an admin setting a "maximum-limit" for an application, but setting a minimum-share per app doesn't sound like fair allocation.

#4: We now have a common abstraction of application (SchedulerApplicationAttempt) and node (SchedulerNode) for the different scheduler implementations. Are you suggesting eliminating scheduler-specific implementations such as FiCaSchedulerApp/FiCaSchedulerNode? I think that might be problematic; you can think of the app as a pluggable implementation too. For instance, FS and CS have different app-level logic, such as how to do delayed scheduling, limits, etc.

Details on the design doc:
- Each LeafQueue can run at most one app, which doesn't seem like a "queue". This is very restrictive: for example, if 1k apps need to run at the same time, do you need to configure 1k LeafQueues? And how do you choose where to submit an application?
- YARN-2986 is trying to create a unified view of scheduler configuration. Is there any overlap between YARN-2986 and the configuration model mentioned in your doc?

Thoughts?
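To make the "ordering policy is part of queue policy" point concrete, here is a minimal sketch of a leaf queue that holds many apps and delegates only the ordering decision to a plug-in. All names here (App, OrderingPolicy, FifoPolicy, FairPolicy, LeafQueue) are hypothetical stand-ins for illustration; this is not the actual interface committed in YARN-3318, and the "fair" rule (least memory used first) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderingPolicySketch {
    // Hypothetical stand-in for a schedulable application; not a YARN class.
    static final class App {
        final String id;
        final long submitTime;
        final int usedMemoryMb;

        App(String id, long submitTime, int usedMemoryMb) {
            this.id = id;
            this.submitTime = submitTime;
            this.usedMemoryMb = usedMemoryMb;
        }
    }

    // Hypothetical plug-in point: the policy only decides the order in which
    // the queue offers resources to its apps.
    interface OrderingPolicy {
        Comparator<App> comparator();
    }

    // Earliest-submitted app is served first.
    static final class FifoPolicy implements OrderingPolicy {
        public Comparator<App> comparator() {
            return Comparator.comparingLong(a -> a.submitTime);
        }
    }

    // Assumed "fair" rule: the app currently using the least memory is served first.
    static final class FairPolicy implements OrderingPolicy {
        public Comparator<App> comparator() {
            return Comparator.comparingInt(a -> a.usedMemoryMb);
        }
    }

    // One queue, many apps; swapping the policy changes the order, not the queue.
    static final class LeafQueue {
        private final List<App> apps = new ArrayList<>();
        private final OrderingPolicy policy;

        LeafQueue(OrderingPolicy policy) {
            this.policy = policy;
        }

        void submit(App app) {
            apps.add(app);
        }

        // Returns app ids in the order the policy would serve them.
        List<String> assignmentOrder() {
            List<App> sorted = new ArrayList<>(apps);
            sorted.sort(policy.comparator());
            List<String> ids = new ArrayList<>();
            for (App a : sorted) {
                ids.add(a.id);
            }
            return ids;
        }
    }

    public static void main(String[] args) {
        LeafQueue fifo = new LeafQueue(new FifoPolicy());
        LeafQueue fair = new LeafQueue(new FairPolicy());
        for (LeafQueue q : new LeafQueue[] {fifo, fair}) {
            q.submit(new App("app-1", 100L, 4096));
            q.submit(new App("app-2", 200L, 1024));
        }
        System.out.println("FIFO order: " + fifo.assignmentOrder());
        System.out.println("Fair order: " + fair.assignmentOrder());
    }
}
```

In this sketch a single LeafQueue serves any number of apps, so running 1k apps at once would not require 1k queues; only the policy object differs between the two queues.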
> Proposal of Generic Scheduling Framework for YARN
> -------------------------------------------------
>
> Key: YARN-3806
> URL: https://issues.apache.org/jira/browse/YARN-3806
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scheduler
> Reporter: Wei Shao
> Attachments: ProposalOfGenericSchedulingFrameworkForYARN-V1.0.pdf,
> ProposalOfGenericSchedulingFrameworkForYARN-V1.1.pdf
>
>
> Currently, a typical YARN cluster runs many different kinds of applications:
> production applications, ad hoc user applications, long running services and
> so on. Different YARN scheduling policies may be suitable for different
> applications. For example, capacity scheduling can manage production
> applications well since application can get guaranteed resource share, fair
> scheduling can manage ad hoc user applications well since it can enforce
> fairness among users. However, current YARN scheduling framework doesn’t have
> a mechanism for multiple scheduling policies work hierarchically in one
> cluster.
> YARN-3306 talked about many issues of today’s YARN scheduling framework, and
> proposed a per-queue policy driven framework. In detail, it supported
> different scheduling policies for leaf queues. However, support of different
> scheduling policies for upper level queues is not seriously considered yet.
> A generic scheduling framework is proposed here to address these limitations.
> It supports different policies for any queue consistently. The proposal tries
> to solve many other issues in current YARN scheduling framework as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)