[
https://issues.apache.org/jira/browse/YARN-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780108#comment-16780108
]
Wilfred Spiegelenburg commented on YARN-6487:
---------------------------------------------
The removal of continuous scheduling is based on performance numbers and
locking issues.
Continuous scheduling was introduced to help speed up allocating containers in
a small cluster that did not have a large number of heartbeats coming in. This
would happen in clusters that were running a mixed load of containers with an
emphasis on longer running containers. In those clusters, waiting for NM
heartbeats would hold up container assignment when a burst of requests came in.
The side effect, however, is that when a cluster grows (100+ nodes) the number
of heartbeats that need processing starts interfering with the continuous
scheduling thread and other internal threads. This causes thread starvation
and, in the worst case, scheduling comes to a standstill.
The improvements that have been made in the scheduler, which now allow you to
assign multiple containers per heartbeat and still spread the load over
multiple nodes, have made continuous scheduling unnecessary in all but the
smallest clusters. In those clusters, changing the NM heartbeat interval can be
used to work around that.
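
A minimal sketch of the settings this refers to, assuming the standard
FairScheduler and ResourceManager property names (they are not named in this
comment). The values are illustrative only, not tuned recommendations:

```xml
<!-- yarn-site.xml fragment; all values are illustrative assumptions -->
<configuration>
  <property>
    <!-- keep the deprecated continuous scheduling thread turned off -->
    <name>yarn.scheduler.fair.continuous-scheduling-enabled</name>
    <value>false</value>
  </property>
  <property>
    <!-- allow more than one container to be assigned per node heartbeat -->
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>true</value>
  </property>
  <property>
    <!-- let the scheduler size the per-heartbeat batch dynamically rather
         than filling one node, so load still spreads over multiple nodes -->
    <name>yarn.scheduler.fair.dynamic.max.assign</name>
    <value>true</value>
  </property>
  <property>
    <!-- small clusters only: a shorter NM heartbeat interval in ms
         (500 is a placeholder; the shipped default is 1000) -->
    <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
    <value>500</value>
  </property>
</configuration>
```

Lowering the heartbeat interval raises the number of heartbeats the RM must
process, so it only makes sense for the small clusters described above.
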
So we really do not need it anymore. If turned on in large clusters it can
cause a lot of side effects, which is why we decided to deprecate it.
We could think about completely decoupling scheduling from the NM heartbeat to
remove the locking, but that would be a far bigger task that affects all
schedulers.
> FairScheduler: remove continuous scheduling (YARN-1010)
> -------------------------------------------------------
>
> Key: YARN-6487
> URL: https://issues.apache.org/jira/browse/YARN-6487
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.0
> Reporter: Wilfred Spiegelenburg
> Assignee: Wilfred Spiegelenburg
> Priority: Major
>
> Remove deprecated FairScheduler continuous scheduler code