[ 
https://issues.apache.org/jira/browse/YARN-4438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059248#comment-15059248
 ] 

Karthik Kambatla commented on YARN-4438:
----------------------------------------

bq. I'll make the change accordingly.
Not sure I understand. Regarding a single client for elector and store, are we 
doing it in this patch or punting to a follow-up. I am open to either options, 
would prefer doing it here. All that is needed is store the CuratorFramework 
instance in RMContext. 

Comments on the patch itself:
# LeaderElectorService
## The instance, rm, is not used anywhere. Why even pass it? The RMContext 
should have all the information that the elector needs or updates?
## Use conf instead of getConfig() here? 
{code}
String zkBasePath = getConfig().get(
        YarnConfiguration.AUTO_FAILOVER_ZK_BASE_PATH,
        YarnConfiguration.DEFAULT_AUTO_FAILOVER_ZK_BASE_PATH);
    latchPath = zkBasePath + "/" + clusterId;
{code}
## isLeader(): update log messages to say - "rmId is elected leader, 
transitioning to active". On failure, "Failed to transition to active, giving 
up leadership"? 
## reJoinElection: why sleep for 1 second? 
## reJoinElection: Also, on exception, it might not be enough to just log it. 
If it is due to close(), don't we want to force give-up so the other RM becomes 
active? If it is on initAndStartLeaderLatch(), this RM will never become 
active; don't we want to just die? 
## How about adding a method called closeLeaderLatch to complement 
initAndStart? That would help us avoid cases like the null check missing in 
rejoinElection?
## notLeader: Again, we should likely do more than just logging.
# YarnConfiguration: If our long-term plan is to keep the curator version and 
get rid of EmbeddedElectorService, may be we should have a config to use 
embedded-elector instead of curator-elector. e.g. 
yarn.resourcemanager.ha.use-active-standby-elector and set it to true by 
default for now? 
# ResourceManager: 
## Nit: Spurious import changes - prefer leaving them out or fixing them in a 
separate patch before/after. 
## Why change the argument to transitionToStandby from true to false? 
## Comment: Looking closer, in the following method, reinitialize(initialize) 
should be called outside the if. No? I am surprised we haven't noticed this 
before. May be, fix it on another JIRA? 
# Looking at the changes in AdminService and ResourceManager, I still feel the 
AdminService should be the one handling the LeaderElectorService. Also, the 
LeaderElectorService talks to AdminService for transitions to active/standby. 


> Implement RM leader election with curator
> -----------------------------------------
>
>                 Key: YARN-4438
>                 URL: https://issues.apache.org/jira/browse/YARN-4438
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-4438.1.patch, YARN-4438.2.patch, YARN-4438.3.patch
>
>
> This is to implement the leader election with curator instead of the 
> ActiveStandbyElector from common package,  this also avoids adding more 
> configs in common to suit RM's own needs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to