[
https://issues.apache.org/jira/browse/YARN-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562211#comment-14562211
]
Xianyin Xin commented on YARN-3630:
-----------------------------------
Thanks for your comments, [~vvasudev]!
{quote}
your patch doesn't check if the calculated interval is greater than the ping
interval to determine liveliness for the AM and the NM. Is that by design?
{quote}
It's true we should do that. But in this patch I haven't added a mechanism for
determining an upper limit for the {{nextHeartbeatInterval}}. I think the limit
should be much less than the ping interval, which is 10 minutes by default. On
the other hand, do you think a hard configurable limit would be acceptable?
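To make the question concrete, here is a minimal sketch of what a hard limit could look like. It is not taken from the patch: the class {{HeartbeatIntervalCap}}, the method {{cap}} and all parameter names are hypothetical, and it assumes the limit is configured directly in milliseconds and must stay strictly below the liveness expiry interval.

```java
/**
 * Hypothetical helper (not part of the YARN-3630 patch): clamps a calculated
 * heartbeat interval so it never drops below the default interval and never
 * exceeds a configured hard limit.
 */
final class HeartbeatIntervalCap {
    private HeartbeatIntervalCap() {}

    /**
     * @param calculatedMs interval computed by the adaptive policy
     * @param defaultMs    default heartbeat interval (lower bound)
     * @param hardLimitMs  assumed configurable upper bound
     * @param expiryMs     liveness expiry ("ping") interval, e.g. 600000 for 10 minutes
     * @return the interval clamped to [defaultMs, hardLimitMs]
     */
    static long cap(long calculatedMs, long defaultMs, long hardLimitMs, long expiryMs) {
        // Reject a limit that would let a live AM or NM be declared expired.
        if (defaultMs <= 0 || hardLimitMs < defaultMs || hardLimitMs >= expiryMs) {
            throw new IllegalArgumentException(
                "need 0 < defaultMs <= hardLimitMs < expiryMs");
        }
        return Math.min(Math.max(calculatedMs, defaultMs), hardLimitMs);
    }
}
```

With a 1 second default, a 60 second limit and the 10 minute expiry, a calculated interval of 120000 ms would be cut to 60000 ms, while 5000 ms would pass through unchanged.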
{quote}
With respect to adaptive heartbeats for the NMs - my concern is that the
proposed solution will lead to behaviour where the NMs will be told to back off
- the NMs will wait for sometime - the RM will receive a flood of NM updates -
leading to the NMs being told to back off and so on and so forth. We'll end up
in a situation where the pings will become clustered around particular time
intervals, leading to container allocation and release delays. You might be
better off picking a random interval between the default interval and the
calculated interval to spread out the NM pings
{quote}
Thanks for the reminder; it's a situation I hadn't thought much about. I think
your suggestion is a good choice.
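A rough sketch of the randomized interval suggested above, as I understand it. The class {{JitteredHeartbeat}} and the method {{pick}} are hypothetical names and are not in the current patch. The sketch assumes a uniform draw from the half-open range between the default and the calculated interval, which is only one possible way to spread the NM pings.

```java
import java.util.concurrent.ThreadLocalRandom;

/**
 * Hypothetical helper (not part of the YARN-3630 patch): picks a random
 * heartbeat interval between the default and the calculated interval so
 * that nodes told to back off do not all report back at the same moment.
 */
final class JitteredHeartbeat {
    private JitteredHeartbeat() {}

    /**
     * @param defaultMs    default heartbeat interval
     * @param calculatedMs interval computed by the adaptive policy
     * @return defaultMs when no back-off is requested, otherwise a uniformly
     *         random value in [defaultMs, calculatedMs)
     */
    static long pick(long defaultMs, long calculatedMs) {
        if (defaultMs <= 0) {
            throw new IllegalArgumentException("defaultMs must be positive");
        }
        if (calculatedMs <= defaultMs) {
            // Nothing to spread out: fall back to the default interval.
            return defaultMs;
        }
        // The upper bound of nextLong(origin, bound) is exclusive.
        return ThreadLocalRandom.current().nextLong(defaultMs, calculatedMs);
    }
}
```

Each NM would draw its own value per heartbeat response, so two nodes given the same calculated interval would usually wake at different times.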
> YARN should suggest a heartbeat interval for applications
> ---------------------------------------------------------
>
> Key: YARN-3630
> URL: https://issues.apache.org/jira/browse/YARN-3630
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager, scheduler
> Affects Versions: 2.7.0
> Reporter: Zoltán Zvara
> Assignee: Xianyin Xin
> Priority: Minor
> Attachments: Notes_for_adaptive_heartbeat_policy.pdf,
> YARN-3630.001.patch.patch, YARN-3630.002.patch
>
>
> It seems that applications, for example Spark, currently do not adapt their
> heartbeat intervals to the RM. The RM should be able to suggest a desired
> heartbeat interval to applications.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)