[
https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968487#comment-13968487
]
Tsuyoshi OZAWA commented on YARN-1861:
--------------------------------------
[~kasha], I think this problem looks very similar to YARN-1929 - deadlock after
losing ZK session.
(*ASE#processResult* -> *EES#becomeStandby* -> *AS#transitionToStandby* ->
*RM#transitionToStandby*) and (RM#serviceStop -> RM.super#serviceStop ->
*RM.super#stop* -> AS#stop -> *AS#serviceStop* -> *EES#serviceStop* ->
*ASE#quitElection*)
IIUC, Karthik's patch on YARN-1929 partially solve this problem, but not
completely. Please correct me if I get wrong. Thanks.
> Both RM stuck in standby mode when automatic failover is enabled
> ----------------------------------------------------------------
>
> Key: YARN-1861
> URL: https://issues.apache.org/jira/browse/YARN-1861
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.4.0
> Reporter: Arpit Gupta
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
>
> In our HA tests we noticed that the tests got stuck because both RM's got
> into standby state and no one became active.
--
This message was sent by Atlassian JIRA
(v6.2#6252)