[jira] [Commented] (YARN-4088) RM should be able to process heartbeats from NM concurrently

Jason Lowe (JIRA) Fri, 02 Mar 2018 06:07:19 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383597#comment-16383597
 ]


Jason Lowe commented on YARN-4088:
----------------------------------

bq. Could the OOB heartbeat reduce the ResourceManager's ability to schedule?

OOB heartbeats are designed to improve the latency of scheduling in the RM.  
Without them, the RM will not know about recently freed resources on a node 
until the next node heartbeat.  That heartbeat could be several seconds away 
depending upon the configured NM heartbeat interval.

The real issue isn't so much responding to heartbeats concurrently as it is 
_scheduling concurrently_.  The heartbeats are technically processed in 
parallel today.  Each heartbeat is handled by an IPC Server handler thread, and 
for the most part the processing in that Server handler thread does not block.  
However, each heartbeat posts node update events to the scheduler event 
dispatcher, and processing those events in the scheduler thread is serialized.  
That's where the scaling bottleneck lies.
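
As a rough illustration of that structure (a sketch only, not RM code: the 
class name, method name and thread counts are all made up for this example), 
a pool of handler threads can accept heartbeats in parallel while a single 
scheduler thread still processes the resulting events one at a time:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names are real YARN classes.
class SerializedSchedulerSketch {

    /** Returns {events processed, max events seen in the scheduler at once}. */
    static int[] simulate(int handlerThreads, int heartbeats) {
        // Stand-in for the IPC Server handler threads: heartbeats arrive in parallel.
        ExecutorService ipcHandlers = Executors.newFixedThreadPool(handlerThreads);
        // Stand-in for the scheduler event dispatcher: exactly one thread.
        ExecutorService schedulerThread = Executors.newSingleThreadExecutor();

        AtomicInteger inScheduler = new AtomicInteger();
        AtomicInteger maxInScheduler = new AtomicInteger();
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(heartbeats);

        for (int i = 0; i < heartbeats; i++) {
            // The handler only posts an event and returns; it does not wait
            // for the scheduler to act on it.
            ipcHandlers.execute(() -> schedulerThread.execute(() -> {
                int now = inScheduler.incrementAndGet();
                maxInScheduler.accumulateAndGet(now, Math::max);
                processed.incrementAndGet();
                inScheduler.decrementAndGet();
                done.countDown();
            }));
        }

        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for events");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            ipcHandlers.shutdown();
            schedulerThread.shutdown();
        }
        return new int[] {processed.get(), maxInScheduler.get()};
    }

    public static void main(String[] args) {
        int[] result = simulate(8, 1000);
        System.out.println("processed=" + result[0]
            + " maxConcurrentInScheduler=" + result[1]);
    }
}
```

However many handler threads are configured, the second number never exceeds 
one, which is the serialization described above.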

The node update scheduler events are coalesced if the scheduler thread is not 
keeping up, so I don't see OOB heartbeats as a hindrance to the scheduler.  
Since this was filed, the NM now sends an OOB heartbeat on any type of container 
completion, so this should allow the NM heartbeat interval to scale much higher 
(i.e., on the order of tens of seconds or minutes) without an appreciable 
reduction in scheduling latency.
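
To make the coalescing idea concrete, here is a minimal sketch under my own 
assumptions. The class, methods and string payloads are hypothetical and are 
not the RM's actual data structures. The point is only that a node with an 
unprocessed update does not get a second scheduler event, so extra heartbeats 
add to the pending update rather than to the scheduler's queue:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of event coalescing; not the RM implementation.
class CoalescingSketch {
    // Node ids with an outstanding scheduler event, in arrival order.
    private final Queue<String> schedulerEvents = new ArrayDeque<>();
    // Container updates accumulated per node since the scheduler last looked.
    private final Map<String, List<String>> pendingUpdates = new HashMap<>();

    /** Called per heartbeat; returns true only if a new event was enqueued. */
    synchronized boolean onHeartbeat(String nodeId, String containerUpdate) {
        List<String> pending = pendingUpdates.get(nodeId);
        boolean needsEvent = (pending == null);
        if (needsEvent) {
            pending = new ArrayList<>();
            pendingUpdates.put(nodeId, pending);
            schedulerEvents.add(nodeId);
        }
        // Later heartbeats for the same node are folded into the pending update.
        pending.add(containerUpdate);
        return needsEvent;
    }

    /** Called by the scheduler thread; drains everything queued for one node. */
    synchronized List<String> processNextEvent() {
        String nodeId = schedulerEvents.poll();
        if (nodeId == null) {
            return new ArrayList<>();
        }
        return pendingUpdates.remove(nodeId);
    }

    synchronized int queuedEvents() {
        return schedulerEvents.size();
    }

    public static void main(String[] args) {
        CoalescingSketch rm = new CoalescingSketch();
        rm.onHeartbeat("node1", "container_1 completed");
        rm.onHeartbeat("node1", "container_2 completed"); // OOB heartbeat, coalesced
        System.out.println("queued events: " + rm.queuedEvents());
        System.out.println("updates in one pass: " + rm.processNextEvent());
    }
}
```

In this sketch the scheduler's queue length is bounded by the number of nodes, 
not by the number of heartbeats, so a burst of OOB heartbeats from one node 
costs the scheduler a single pass.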



> RM should be able to process heartbeats from NM concurrently
> ------------------------------------------------------------
>
>                 Key: YARN-4088
>                 URL: https://issues.apache.org/jira/browse/YARN-4088
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, scheduler
>            Reporter: Srikanth Kandula
>            Priority: Major
>
> Today, the RM sequentially processes one heartbeat after another. 
> Imagine a 3000 server cluster with each server heart-beating every 3s. This 
> gives the RM 1ms on average to process each NM heartbeat. That is tough.
> It is true that there are several underlying data structures that will be 
> touched during heartbeat processing. So, it is non-trivial to parallelize the 
> NM heartbeat. Yet, it is quite doable...
> Parallelizing the NM heartbeat would substantially improve the scalability of 
> the RM, allowing it to either 
> a) run larger clusters or 
> b) support faster heartbeats or dynamic scaling of heartbeats or 
> c) take more asks from each application or 
> d) use cleverer/more expensive algorithms such as node labels or better 
> packing or ...
> Indeed the RM's scalability limit has been cited as the motivating reason for 
> a variety of efforts which will become less needed if this can be solved. 
> Ditto for slow heartbeats.  See Sparrow and Mercury papers for example.
> Can we take a shot at this?
> If not, could we discuss why?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
