[ 
https://issues.apache.org/jira/browse/YARN-7138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150036#comment-16150036
 ] 

Wangda Tan commented on YARN-7138:
----------------------------------

[~djp], 

Thanks for the suggestions.

Regarding the K8S scheduler, it works quite differently from YARN: K8S 
supports customized schedulers, and even multiple schedulers running 
concurrently, so a POD can choose which scheduler to use when it is submitted 
to the cluster.

OTOH, in K8S there's no official "scheduler" API. IIRC, a scheduler in K8S 
runs in a separate process and makes decisions by:
1) Watching POD updates (via IPC).
2) Watching NODE updates (via IPC).
3) Calling the POD's binding API to bind the POD to a node once an allocation 
decision is made (via IPC).

So we can say the K8S scheduler makes its decisions by invoking POD/binding to 
assign a POD to a node. That's the only contract for a customized scheduler; it 
is not very descriptive, but it makes compatibility very simple to maintain.
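
The contract described above can be sketched as a toy, in-memory model. This is 
only an illustration of the watch/watch/bind shape: ClusterApi, ToyScheduler 
and the "free slots" placement rule are hypothetical names and assumptions 
made for the example, not the Kubernetes client API or its real scheduling 
logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BindContractDemo {
    // Feeds a few pod/node updates to the toy scheduler and records the binds.
    static List<String> run() {
        List<String> bindings = new ArrayList<>();
        ToyScheduler scheduler =
            new ToyScheduler((pod, node) -> bindings.add(pod + "->" + node));
        scheduler.onPodUpdate("pod-a");        // no node known yet, stays pending
        scheduler.onNodeUpdate("node-1", 1);   // pod-a can now be bound
        scheduler.onPodUpdate("pod-b");        // node-1 is full, stays pending
        scheduler.onNodeUpdate("node-2", 1);   // pod-b can now be bound
        return bindings;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Hypothetical stand-in for the cluster's API server (assumption, not a real API).
interface ClusterApi {
    // The only call the scheduler makes back to the cluster.
    void bind(String pod, String node);
}

// Toy scheduler: reacts to pod and node updates and answers with bind().
class ToyScheduler {
    private final ClusterApi api;
    private final Map<String, Integer> freeSlots = new LinkedHashMap<>();
    private final List<String> pending = new ArrayList<>();

    ToyScheduler(ClusterApi api) {
        this.api = api;
    }

    // Step 1: a pod update arrives.
    void onPodUpdate(String pod) {
        pending.add(pod);
        schedule();
    }

    // Step 2: a node update arrives (here: how many pods the node can take).
    void onNodeUpdate(String node, int slots) {
        freeSlots.put(node, slots);
        schedule();
    }

    // Step 3: bind every pending pod that fits on some node.
    private void schedule() {
        List<String> stillPending = new ArrayList<>();
        for (String pod : pending) {
            String chosen = null;
            for (Map.Entry<String, Integer> e : freeSlots.entrySet()) {
                if (e.getValue() > 0) {
                    chosen = e.getKey();
                    break;
                }
            }
            if (chosen == null) {
                stillPending.add(pod);
                continue;
            }
            freeSlots.merge(chosen, -1, Integer::sum);
            api.bind(pod, chosen);
        }
        pending.clear();
        pending.addAll(stillPending);
    }
}
```

Everything the scheduler does to the cluster goes through the single bind() 
call, which is why such a contract is easy to keep compatible.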

However, since the YARN scheduler is so tightly coupled with other RM 
components, it is almost impossible to keep the scheduler API stable. Beyond 
the API, the scheduler also needs to handle internal RM events, for which we 
have never ensured compatibility. To be frank, I don't see a big benefit in 
making the scheduler API stable now: adding a customized scheduler has never 
been recommended by the Hadoop community or vendors. And I expect that 
declaring the scheduler API stable could slow down innovation on the scheduler 
side.

I had some discussions with [~curino] / [~subru] / [~kkaranasos] during Hadoop 
Summit about how to better design scheduler semantics. I suggest we declare 
scheduler API stability once we get there; I don't think it can be done in the 
short term.

Thoughts? cc: [~curino] / [~subru] / [~kkaranasos]. 



> Fix incompatible API change for YarnScheduler involved by YARN-5521
> -------------------------------------------------------------------
>
>                 Key: YARN-7138
>                 URL: https://issues.apache.org/jira/browse/YARN-7138
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Junping Du
>            Priority: Critical
>
> The JACC report for 2.8.2 against 2.7.4 indicates that we have incompatible 
> changes in YarnScheduler:
> {noformat}
> hadoop-yarn-server-resourcemanager-2.7.4.jar, YarnScheduler.class
> package org.apache.hadoop.yarn.server.resourcemanager.scheduler
> YarnScheduler.allocate ( ApplicationAttemptId p1, List<ResourceRequest> p2, 
> List<ContainerId> p3, List<String> p4, List<String> p5 ) [abstract]  :  
> Allocation 
> {noformat}
> The root cause is YARN-5221. We should change it back or work around this by 
> adding back the original API (marked as deprecated if it is no longer used).
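
For reference, the "add back the original API and mark it deprecated" option 
from the description could look roughly like the sketch below. SchedulerApi is 
a hypothetical, simplified stand-in for YarnScheduler: all types are reduced to 
strings, and the extra sixth parameter of the newer method is an assumption 
made for illustration, not the actual signature.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DeprecatedOverloadDemo {
    // Hypothetical, simplified stand-in for the scheduler interface.
    interface SchedulerApi {
        // Assumed newer signature: one extra list of update requests.
        String allocate(String attemptId, List<String> ask, List<String> release,
                        List<String> blacklistAdditions,
                        List<String> blacklistRemovals,
                        List<String> updateRequests);

        // Original five-argument shape, kept for old callers and delegating
        // to the newer method with an empty list of update requests.
        @Deprecated
        default String allocate(String attemptId, List<String> ask,
                                List<String> release,
                                List<String> blacklistAdditions,
                                List<String> blacklistRemovals) {
            return allocate(attemptId, ask, release, blacklistAdditions,
                    blacklistRemovals, Collections.<String>emptyList());
        }
    }

    // Calls the old overload against an implementation of the new method only.
    static String run() {
        SchedulerApi scheduler = (id, ask, release, add, remove, updates) ->
            id + ":" + ask.size() + ":" + updates.size();
        List<String> none = Collections.emptyList();
        return scheduler.allocate("attempt_1", Arrays.asList("r1", "r2"),
                none, none, none);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Default methods need Java 8; where that is not available, the delegating 
overload could live in an abstract base class instead, at the cost of not 
covering schedulers that implement the interface directly.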



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
