[ https://issues.apache.org/jira/browse/YARN-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950630#comment-14950630 ]
Rohith Sharma K S commented on YARN-4243:
-----------------------------------------

Thanks [~xgong] for working on this. Some comments and suggestions:
# While initializing the elector service, createConnection will retry up to the configured value, i.e. *maxRetryNum*, say 10. But if the session is closed and re-established, the retries nest, so the total number of attempts becomes *maxRetryNum* * *maxRetryNum*, i.e. 10*10=100.
# The method {{reEstablishSession()}} can be reused rather than duplicating the same logic in the embedded elector. Instead of overriding the createConnection() method, reEstablishSession() can be called in the ActiveStandbyElector constructor. I'd prefer to make the change in hadoop-common rather than in the embedded elector service.

> Add retry on establishing Zookeeper conenction in EmbeddedElectorService#serviceInit
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-4243
>                 URL: https://issues.apache.org/jira/browse/YARN-4243
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-4243.1.patch
>
>
> Right now, the RM would shut down if the zk connection is down when the RM does
> the initialization. We need to add retry on this part.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
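
To illustrate the retry multiplication raised in the first review point, here is a minimal, self-contained sketch. It is not Hadoop code: the class {{NestedRetryDemo}} and the helper {{connectOnce()}} are hypothetical, and the two retry methods only borrow the names createConnection and reEstablishSession from the comment to show what happens when a retrying connection method is itself called from a retrying session method. The connection is assumed to always fail.

```java
import java.io.IOException;

// Hypothetical stand-in for an elector; NOT the real ActiveStandbyElector.
class NestedRetryDemo {
    private final int maxRetryNum;
    private int attempts = 0;

    NestedRetryDemo(int maxRetryNum) {
        this.maxRetryNum = maxRetryNum;
    }

    // Simulates a single connection attempt against an unreachable ZooKeeper.
    private void connectOnce() throws IOException {
        attempts++;
        throw new IOException("ZooKeeper unavailable (simulated)");
    }

    // Inner retry loop: the overridden createConnection() that retries by itself.
    boolean createConnectionWithRetry() {
        for (int i = 0; i < maxRetryNum; i++) {
            try {
                connectOnce();
                return true;
            } catch (IOException e) {
                // swallow and retry
            }
        }
        return false;
    }

    // Outer retry loop: session re-establishment that calls the retrying method.
    boolean reEstablishSession() {
        for (int i = 0; i < maxRetryNum; i++) {
            if (createConnectionWithRetry()) {
                return true;
            }
        }
        return false;
    }

    int attempts() {
        return attempts;
    }

    public static void main(String[] args) {
        NestedRetryDemo demo = new NestedRetryDemo(10);
        demo.reEstablishSession();
        System.out.println(demo.attempts()); // prints 100
    }
}
```

Keeping the retry in a single place (for example, only in the session re-establishment path) would cap the attempts at maxRetryNum, which is the motivation for reusing one retry method instead of layering a second loop on top.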