[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136503#comment-17136503
 ] 

Adam Antal commented on YARN-9930:
----------------------------------

I was trying to do a meaningful review, but got stuck on a few questions. 
Apologies if any of them are silly.

I am a little nervous about this case:
bq. Limit max-parallel-apps to 4, submit 4 apps, then refresh it to 2. Result: 
running apps were still running, but new apps stayed in Accepted state. From 
that point on, only 2 apps were allowed to run at the same time.
So AFAIU it is perfectly normal for a queue to be above its limit after the 
configuration has been changed. Doesn't this case need some special attention 
in your algorithm when you recursively update the parents to search for queues 
where new apps could be submitted?
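
To make the scenario above concrete, here is a minimal toy sketch. It is not 
code from the patch: {{ToyQueue}}, {{hasRoom}}, {{submit}} and {{finishOne}} are 
hypothetical names, and the model assumes a parent's running count is the sum 
over its subtree. The point it illustrates is that the "is there room" check 
probably has to tolerate a queue that is already above its limit, i.e. compare 
with {{>=}} instead of expecting the count to equal the limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the refresh scenario; NOT the CapacityScheduler classes. */
public class MaxParallelAppsSketch {

    /** Hypothetical queue with a max-parallel-apps limit and an optional parent. */
    static class ToyQueue {
        final ToyQueue parent;      // null for the root
        int maxParallelApps;        // may be lowered by a config refresh
        int runningApps;            // assumption: for a parent, sum over its subtree
        final Deque<String> pending = new ArrayDeque<>();

        ToyQueue(ToyQueue parent, int maxParallelApps) {
            this.parent = parent;
            this.maxParallelApps = maxParallelApps;
        }

        /** True only if this queue and every ancestor is strictly below its limit. */
        boolean hasRoom() {
            for (ToyQueue q = this; q != null; q = q.parent) {
                // >= rather than ==: after a refresh a queue may sit above its limit.
                if (q.runningApps >= q.maxParallelApps) {
                    return false;
                }
            }
            return true;
        }

        /** New apps are queued first and started only if there is room. */
        void submit(String app) {
            pending.addLast(app);
            activatePending();
        }

        /** One running app finishes; counts drop along the whole parent chain. */
        void finishOne() {
            for (ToyQueue q = this; q != null; q = q.parent) {
                q.runningApps--;
            }
            activatePending();
        }

        private void activatePending() {
            while (!pending.isEmpty() && hasRoom()) {
                pending.removeFirst();
                for (ToyQueue q = this; q != null; q = q.parent) {
                    q.runningApps++;
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyQueue root = new ToyQueue(null, 10);
        ToyQueue leaf = new ToyQueue(root, 4);
        for (int i = 1; i <= 4; i++) {
            leaf.submit("app" + i);          // all four start running
        }
        leaf.maxParallelApps = 2;            // simulated refresh: limit 4 -> 2
        leaf.submit("app5");                 // stays pending, running apps untouched
        System.out.println("running=" + leaf.runningApps + " pending=" + leaf.pending.size());
        leaf.finishOne();                    // 3 running, still above the limit
        leaf.finishOne();                    // 2 running, at the limit
        leaf.finishOne();                    // 1 running, so app5 can start
        System.out.println("running=" + leaf.runningApps + " pending=" + leaf.pending.size());
    }
}
```

In this toy model the queue stays above its limit until enough apps finish, and 
a pending app only starts once the count has dropped below the new limit.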

I compared your implementation with the max apps one, and it's a bit different. 
You use a separate {{CSMaxRunningAppsEnforcer}} instance in the scheduler, which 
is optimized for guessing which queues need to be checked to see whether their 
limits now allow more apps to run. The existing implementation for max apps 
(which considers both running and pending ones) calls 
{{OrderingPolicy#getNumSchedulableEntities()}} and compares it to the limit 
inside {{LeafQueue}}. From the algorithm you described above I assume that your 
solution is more effective, but it seems to me that calling these methods of 
{{OrderingPolicy}} in {{LeafQueue#validateSubmitApplication}} already does 
something similar, only from the queue's perspective, while your solution is 
fundamentally implemented inside the scheduler. I'd prefer your solution as 
it's clearer, but since we already have the existing logic, the question arises: 
why do we need a separate enforcer object? Couldn't it be implemented 
similarly? Or am I missing something here?
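
As a rough illustration of the two kinds of limits being compared, here is a 
toy sketch. The names {{ToyLeafQueue}}, {{Outcome}} and {{submit}} are 
hypothetical and this is not how either implementation is actually written. It 
only models the semantics described in this issue: the max apps limit counts 
running plus pending apps and rejects at submission time, while the max running 
limit leaves the app pending.

```java
/** Toy contrast of the two limits; NOT code from LeafQueue or the patch. */
public class MaxAppsVersusMaxRunningSketch {

    enum Outcome { REJECTED, PENDING, RUNNING }

    /** Hypothetical leaf queue that carries both limits. */
    static class ToyLeafQueue {
        final int maxApps;          // cap on running + pending apps
        final int maxParallelApps;  // cap on running apps only
        int running;
        int pending;

        ToyLeafQueue(int maxApps, int maxParallelApps) {
            this.maxApps = maxApps;
            this.maxParallelApps = maxParallelApps;
        }

        Outcome submit() {
            // Queue-side check at submission time, in the spirit of
            // validateSubmitApplication: over the limit means rejection.
            if (running + pending >= maxApps) {
                return Outcome.REJECTED;
            }
            // Max-running check: the app is accepted but has to wait.
            if (running >= maxParallelApps) {
                pending++;
                return Outcome.PENDING;
            }
            running++;
            return Outcome.RUNNING;
        }
    }

    public static void main(String[] args) {
        ToyLeafQueue queue = new ToyLeafQueue(3, 1);
        for (int i = 0; i < 4; i++) {
            System.out.println(queue.submit());
        }
    }
}
```

With {{maxApps = 3}} and {{maxParallelApps = 1}}, four submissions in this model 
yield RUNNING, PENDING, PENDING, REJECTED.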

Nit:
- {{abstract int getNumRunnableApps();}} would be better placed in the 
{{CSQueue}} interface instead of the {{AbstractCSQueue}} abstract class.

> Support max running app logic for CapacityScheduler
> ---------------------------------------------------
>
>                 Key: YARN-9930
>                 URL: https://issues.apache.org/jira/browse/YARN-9930
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: zhoukang
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9930-001.patch, YARN-9930-002.patch, 
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch, 
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch, 
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> In FairScheduler, there is a limit on the maximum number of running 
> applications, which keeps further applications pending.
> CapacityScheduler has no such max running app feature. It only has max app, 
> and jobs over that limit are rejected directly on the client.
> In this Jira I want to implement these semantics for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
