[ 
https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-1861:
---------------------------------------------

    Assignee: Karthik Kambatla  (was: Xuan Gong)

Looking at this as it is blocking 2.4.1.

Assigning to Karthik given he has done the core code change. Will credit Xuan 
too.

I tried to just apply the test-case and run it without the core change and was 
expecting the active RM to go to standby and the standby RM to go to active 
once the originally active RM is fenced. Instead I get a NPE somewhere. Can the 
test be fixed to do so?

Also, we need to make sure that when automatic failover is enabled, all 
external interventions like a fence like this bug (and forced-manual failover 
from CLI?) do a similar reset into the leader election. There may not be cases 
like this today though..

> Both RM stuck in standby mode when automatic failover is enabled
> ----------------------------------------------------------------
>
>                 Key: YARN-1861
>                 URL: https://issues.apache.org/jira/browse/YARN-1861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: YARN-1861.2.patch, YARN-1861.3.patch, YARN-1861.4.patch, 
> YARN-1861.5.patch, yarn-1861-1.patch, yarn-1861-6.patch
>
>
> In our HA tests we noticed that the tests got stuck because both RM's got 
> into standby state and no one became active.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to