[ 
https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979415#comment-13979415
 ] 

Karthik Kambatla commented on YARN-1861:
----------------------------------------

Taking this over. Figured out the issue: an Active RM doesn't notify the 
elector when it transitions itself to Standby, so the elector assumes 
everything is fine with the cluster. The fix is to call resetLeaderElection 
when the RM transitions itself to Standby. Posting a patch that does that. 
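
To illustrate the failure mode described above, here is a minimal, 
self-contained sketch. It is not the patch and does not use the real Hadoop 
classes: SketchElector, SketchResourceManager and staleAfterSelfDemotion are 
hypothetical names made up for this example, and only the name 
resetLeaderElection comes from the comment above. It assumes the elector 
keeps its own belief about who is Active and never re-runs the election 
unless it is told to.

```java
// Illustrative model only; these are NOT the real Hadoop YARN classes.
public class ElectorResetSketch {

    /** Hypothetical stand-in for the leader elector. */
    static class SketchElector {
        private boolean believesLocalRmActive = false;

        /** The election was won, so the elector records the local RM as Active. */
        void becomeLeader() {
            believesLocalRmActive = true;
        }

        /** Assumed behaviour: drop the stale belief and rejoin the election. */
        void resetLeaderElection() {
            believesLocalRmActive = false;
        }

        boolean believesLocalRmActive() {
            return believesLocalRmActive;
        }
    }

    /** Hypothetical stand-in for the ResourceManager HA state. */
    static class SketchResourceManager {
        private final SketchElector elector;
        private final boolean notifyElector;
        private boolean active = false;

        SketchResourceManager(SketchElector elector, boolean notifyElector) {
            this.elector = elector;
            this.notifyElector = notifyElector;
        }

        void transitionToActive() {
            active = true;
        }

        /** The RM demotes itself, e.g. after an internal failure. */
        void transitionToStandby() {
            active = false;
            if (notifyElector) {
                // The fix described above: tell the elector about the change.
                elector.resetLeaderElection();
            }
        }

        boolean isActive() {
            return active;
        }
    }

    /**
     * Returns true if, after the RM demotes itself, the elector still
     * believes the RM is Active (the state in which nobody gets elected).
     */
    static boolean staleAfterSelfDemotion(boolean notifyElector) {
        SketchElector elector = new SketchElector();
        SketchResourceManager rm = new SketchResourceManager(elector, notifyElector);
        elector.becomeLeader();
        rm.transitionToActive();
        rm.transitionToStandby();
        return !rm.isActive() && elector.believesLocalRmActive();
    }

    public static void main(String[] args) {
        System.out.println("stale without reset: " + staleAfterSelfDemotion(false));
        System.out.println("stale with reset:    " + staleAfterSelfDemotion(true));
    }
}
```

In this model, the run without the reset leaves the elector believing an RM 
is Active while the RM is actually in Standby, which matches the symptom in 
the report. The run with the reset clears that belief so a new election can 
take place.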

Haven't written any tests yet. Will try to make time and write some. If I am 
not active enough, please feel free to take this over and add the tests.

> Both RM stuck in standby mode when automatic failover is enabled
> ----------------------------------------------------------------
>
>                 Key: YARN-1861
>                 URL: https://issues.apache.org/jira/browse/YARN-1861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Karthik Kambatla
>            Priority: Critical
>
> In our HA tests we noticed that the tests got stuck because both RMs went 
> into standby state and neither became active.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
