[ 
https://issues.apache.org/jira/browse/YARN-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958994#comment-14958994
 ] 

Karthik Kambatla commented on YARN-4243:
----------------------------------------

Looks like we are addressing two issues here:
# Have createConnection() retry connecting to ZK. 
## I am with Rohith on this one - I think changing the ActiveStandbyElector 
constructor, either to use reestablishConnection or in some other way, seems 
like the right approach. Do we know why the HDFS devs don't want connections 
to be retried on init, but are fine with retries in reestablishConnection?
# Add a config to set a different number of retries for YARN. 
## Sounds reasonable. Code comment - can we do the following instead:
{code}
int maxRetryNum = conf.getInt(
    YarnConfiguration.RM_HA_FC_ELECTOR_ZK_OP_RETRIES_KEY,
    conf.getInt(CommonConfigurationKeys.HA_FC_ELECTOR_ZK_OP_RETRIES_KEY,
        CommonConfigurationKeys.HA_FC_ELECTOR_ZK_OP_RETRIES_DEFAULT));
{code}
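
To illustrate the fallback order the nested getInt() calls produce, here is a minimal standalone sketch. It does not use Hadoop's Configuration class: a plain Map stands in for it, and the key strings, the default of 3, and the class and method names are all hypothetical placeholders chosen for the example. The point is only that the YARN-specific key wins when set, then the common key, then the common default.

```java
import java.util.HashMap;
import java.util.Map;

public class ElectorRetryConfig {
    // Hypothetical key names and default, for illustration only;
    // they are not the real Hadoop constant values.
    static final String YARN_KEY = "example.yarn.elector.zk.retries";
    static final String COMMON_KEY = "example.common.elector.zk.retries";
    static final int COMMON_DEFAULT = 3;

    // Stand-in for a getInt(key, default) lookup: returns the parsed
    // value if the key is present, otherwise the supplied default.
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // YARN-specific key first, then the common key, then the common default.
    static int resolveMaxRetries(Map<String, String> conf) {
        return getInt(conf, YARN_KEY,
            getInt(conf, COMMON_KEY, COMMON_DEFAULT));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolveMaxRetries(conf)); // 3: neither key set

        conf.put(COMMON_KEY, "5");
        System.out.println(resolveMaxRetries(conf)); // 5: common key only

        conf.put(YARN_KEY, "10");
        System.out.println(resolveMaxRetries(conf)); // 10: YARN key overrides
    }
}
```

With this shape, existing deployments that only set the common key keep their current behavior, and YARN can override it without affecting HDFS.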


> Add retry on establishing Zookeeper connection in 
> EmbeddedElectorService#serviceInit
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-4243
>                 URL: https://issues.apache.org/jira/browse/YARN-4243
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-4243.1.patch, YARN-4243.2.1.patch, 
> YARN-4243.2.patch, YARN-4243.3.patch
>
>
> Right now, the RM shuts down if the ZK connection is down while the RM is 
> initializing. We need to add a retry for this step.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)