[ https://issues.apache.org/jira/browse/YARN-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997159#comment-13997159 ]
Karthik Kambatla commented on YARN-2054:
----------------------------------------
On a cluster with RM HA and a buggy RM, this led to a long wait before failover.
> Poor defaults for YARN ZK configs for retries and retry-interval
> ----------------------------------------------------------------
>
> Key: YARN-2054
> URL: https://issues.apache.org/jira/browse/YARN-2054
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.4.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
>
> Currently, we have the following default values:
> # yarn.resourcemanager.zk-num-retries - 500
> # yarn.resourcemanager.zk-retry-interval-ms - 2000
> This leads to a cumulative wait of 1000 seconds (about 17 minutes) before
> the RM gives up trying to connect to ZK.
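
The arithmetic behind the 1000-second figure can be checked with a short sketch. It assumes the RM sleeps for the full retry interval between attempts and ignores the time each connection attempt itself takes; the function name below is hypothetical and is not part of YARN.

```python
# Defaults quoted in the issue description above.
ZK_NUM_RETRIES = 500          # yarn.resourcemanager.zk-num-retries
ZK_RETRY_INTERVAL_MS = 2000   # yarn.resourcemanager.zk-retry-interval-ms


def worst_case_wait_seconds(num_retries: int, retry_interval_ms: int) -> float:
    """Total time spent sleeping between retries, in seconds.

    Assumption: a fixed sleep of retry_interval_ms per retry, with no
    backoff and no time counted for the connection attempts themselves.
    """
    return num_retries * retry_interval_ms / 1000.0


total = worst_case_wait_seconds(ZK_NUM_RETRIES, ZK_RETRY_INTERVAL_MS)
print(f"{total:.0f} s (~{total / 60:.1f} min)")
```

Under these assumptions the product of the two settings is what matters for how long failover is delayed, so lowering either value shortens the wait proportionally.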
--
This message was sent by Atlassian JIRA
(v6.2#6252)