Xuan Gong commented on YARN-4107:

When the ActiveStandbyElector loses its connection, it calls enterNeutralMode. 
On the ZooKeeper side, a new leader is chosen because the old leader's 
connection is gone. The old standby RM therefore becomes active, while the old 
active RM keeps trying to reconnect to ZooKeeper until it times out. As a 
result, we end up with two active RMs.

The impact: all new applications still try to connect to the old active RM and 
stay in the NEW state. Because the old active RM has already lost its 
connection to ZK, it cannot save the application state in the ZK state store.
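
The sequence described above can be illustrated with a small, self-contained 
toy model. This is only a sketch under stated assumptions: ToyRm, AppState, 
loseZkConnection, becomeActive and submitApp are hypothetical names invented 
for illustration, not classes or methods from the Hadoop code base, and the 
model ignores timeouts and retries entirely.

```java
import java.util.List;

// Toy model of the failure described above. Every type and method here is
// hypothetical; none of it is the real ResourceManager or elector API.
public class SplitBrainSketch {
    public enum HaState { ACTIVE, STANDBY }

    // Simplified application states: NEW means the state was never persisted.
    public enum AppState { NEW, STORED }

    public static final class ToyRm {
        final String name;
        HaState state;
        boolean connectedToZk = true;

        ToyRm(String name, HaState state) {
            this.name = name;
            this.state = state;
        }

        // Models the reported behaviour: the connection is lost, but the RM
        // keeps its current HA state while it retries.
        void loseZkConnection() {
            connectedToZk = false;
        }

        // Models the standby winning the election on the ZooKeeper side.
        void becomeActive() {
            state = HaState.ACTIVE;
        }

        // An RM without a ZK connection cannot persist the application,
        // so the application never leaves NEW.
        AppState submitApp() {
            return connectedToZk ? AppState.STORED : AppState.NEW;
        }
    }

    // Runs the scenario and returns how many RMs report ACTIVE at the end.
    public static int simulateActiveCount() {
        ToyRm rm1 = new ToyRm("rm1", HaState.ACTIVE);
        ToyRm rm2 = new ToyRm("rm2", HaState.STANDBY);
        rm1.loseZkConnection(); // old active is cut off from ZooKeeper
        rm2.becomeActive();     // old standby is elected
        int active = 0;
        for (ToyRm rm : List.of(rm1, rm2)) {
            if (rm.state == HaState.ACTIVE) {
                active++;
            }
        }
        return active;
    }

    // Returns the state of an application submitted to the old active RM.
    public static AppState simulateAppStateOnOldActive() {
        ToyRm rm1 = new ToyRm("rm1", HaState.ACTIVE);
        rm1.loseZkConnection();
        return rm1.submitApp();
    }

    public static void main(String[] args) {
        System.out.println("active RMs: " + simulateActiveCount());
        System.out.println("app state on rm1: " + simulateAppStateOnOldActive());
    }
}
```

In this model nothing ever moves rm1 out of ACTIVE after the disconnect, which 
is why both RMs report active and why an application submitted to rm1 stays 
in NEW.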

> Both RM becomes Active if all zookeepers can not connect to active RM
> ---------------------------------------------------------------------
>                 Key: YARN-4107
>                 URL: https://issues.apache.org/jira/browse/YARN-4107
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
> Steps to reproduce:
> 1) Run small randomwriter applications in background
> 2) rm1 is active and rm2 is standby 
> 3) Disconnect all Zks and Active RM
> 4) Check status of both RMs. Both of them are in active state
