[
https://issues.apache.org/jira/browse/YARN-7138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150036#comment-16150036
]
Wangda Tan commented on YARN-7138:
----------------------------------
[~djp],
Thanks for the suggestions.
Regarding the K8S scheduler, it works quite differently from YARN: K8S
supports customized schedulers, or even multiple schedulers running
concurrently, so a POD can choose which scheduler to use when it is submitted
to the cluster.
However, OTOH, in K8S there's no official "scheduler" API. IIRC, schedulers in
K8S run in a separate process and make decisions by:
1) Watching POD updates (via IPC).
2) Watching NODE updates (via IPC).
3) Calling the POD's binding API to bind a POD to a node once an allocation
decision is made (via IPC).
So we can say that a K8S scheduler makes its decisions by invoking POD/binding
to assign a POD to a node. That's the only contract for a customized scheduler;
it is not very descriptive, but it makes compatibility very simple to maintain.
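To make that contract concrete, here is a minimal in-memory sketch of the three steps above. It is only an illustration: `BindingSketch`, `Binder`, `ToyScheduler`, the slot-counting placement rule and the pod/node names are all hypothetical, and plain method calls stand in for the remote calls. It is not the Kubernetes client API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the watch/bind contract; all names here are hypothetical. */
final class BindingSketch {

    /** Hypothetical stand-in for the remote POD binding call (step 3). */
    interface Binder {
        void bind(String pod, String node);
    }

    /** Hypothetical custom scheduler: consumes updates, emits bindings. */
    static final class ToyScheduler {
        private final Map<String, Integer> freeSlots = new LinkedHashMap<>();
        private final List<String> pending = new ArrayList<>();
        private final Binder binder;

        ToyScheduler(Binder binder) {
            this.binder = binder;
        }

        /** Step 1: a POD update arrives; remember the unbound POD. */
        void onPodUpdate(String pod) {
            pending.add(pod);
            schedule();
        }

        /** Step 2: a NODE update arrives; record its free capacity. */
        void onNodeUpdate(String node, int slots) {
            freeSlots.put(node, slots);
            schedule();
        }

        /** Step 3: bind pending PODs while some node still has a free slot. */
        private void schedule() {
            Iterator<String> it = pending.iterator();
            while (it.hasNext()) {
                String pod = it.next();
                String node = pickNode();
                if (node == null) {
                    return; // no capacity; wait for the next update
                }
                freeSlots.put(node, freeSlots.get(node) - 1);
                binder.bind(pod, node); // the only outward-facing call
                it.remove();
            }
        }

        /** Returns the first node with a free slot, or null if none has one. */
        private String pickNode() {
            for (Map.Entry<String, Integer> e : freeSlots.entrySet()) {
                if (e.getValue() > 0) {
                    return e.getKey();
                }
            }
            return null;
        }
    }

    /** Runs a small scenario and returns the bindings in the order issued. */
    public static List<String> demo() {
        List<String> bindings = new ArrayList<>();
        ToyScheduler s = new ToyScheduler((pod, node) -> bindings.add(pod + "->" + node));
        s.onPodUpdate("pod-a");      // no nodes known yet, stays pending
        s.onNodeUpdate("node-1", 1); // pod-a is bound to node-1
        s.onPodUpdate("pod-b");      // node-1 is full, stays pending
        s.onNodeUpdate("node-2", 2); // pod-b is bound to node-2
        return bindings;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [pod-a->node-1, pod-b->node-2]
    }
}
```

The point of the sketch is that everything the scheduler does internally (the pending queue, the placement rule) is invisible to the rest of the system. Only the `bind` call crosses the boundary, which is why such a contract is easy to keep compatible.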
However, since the YARN scheduler is so tightly coupled with other RM
components, it is almost impossible to keep the scheduler API stable. Beyond
the API, the scheduler needs to handle internal RM events, for which we have
never ensured compatibility. To be frank, I don't see a big benefit in making
the scheduler API stable now: adding a customized scheduler has never been
recommended by the Hadoop community or vendors. And I expect that declaring the
scheduler API stable could slow down innovation on the scheduler side.
I had some discussions with [~curino] / [~subru] / [~kkaranasos] during Hadoop
Summit about how to better design scheduler semantics. I suggest we declare
scheduler API stability once we get there. I don't think it can be done in the
short term.
Thoughts? cc: [~curino] / [~subru] / [~kkaranasos].
> Fix incompatible API change for YarnScheduler involved by YARN-5521
> -------------------------------------------------------------------
>
> Key: YARN-7138
> URL: https://issues.apache.org/jira/browse/YARN-7138
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler
> Reporter: Junping Du
> Priority: Critical
>
> The JACC report for 2.8.2 against 2.7.4 indicates that we have
> incompatible changes in YarnScheduler:
> {noformat}
> hadoop-yarn-server-resourcemanager-2.7.4.jar, YarnScheduler.class
> package org.apache.hadoop.yarn.server.resourcemanager.scheduler
> YarnScheduler.allocate ( ApplicationAttemptId p1, List<ResourceRequest> p2,
> List<ContainerId> p3, List<String> p4, List<String> p5 ) [abstract] :
> Allocation
> {noformat}
> The root cause is YARN-5221. We should change it back or work around this by
> adding back the original API (marked as deprecated if no longer used).