[ https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995796#comment-13995796 ]
Xuan Gong commented on YARN-1861:
---------------------------------

bq. Can we make this explicit, instead of being an NPE? Like doing a client call to find the current active RM or something like that?

Yes, we can do that. DONE

bq. That is what I was thinking, but I am concerned about locking etc. This code has become a little convoluted. Per Xuan, we seem to be safe for now, so maybe look at this separately?

Yes. But I will make a note about it.

> Both RM stuck in standby mode when automatic failover is enabled
> ----------------------------------------------------------------
>
>                 Key: YARN-1861
>                 URL: https://issues.apache.org/jira/browse/YARN-1861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: YARN-1861.2.patch, YARN-1861.3.patch, YARN-1861.4.patch,
> YARN-1861.5.patch, YARN-1861.7.patch, yarn-1861-1.patch, yarn-1861-6.patch
>
>
> In our HA tests we noticed that the tests got stuck because both RMs got
> into standby state and no one became active.

--
This message was sent by Atlassian JIRA
(v6.2#6252)