[ https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Templeton updated YARN-5694:
-----------------------------------
Attachment: YARN-5694.004.patch
New patch that includes tests. It also changes the RM's behavior a bit, per
the discussion with [~kasha] above. Now, if the RM is not in HA mode and is
forced into standby mode, it will exit to prevent potential state store
corruption. The only scenario where this can happen is when the
{{ZKRMStateStore}} discovers that it has been fenced (not when it has lost
contact with the ZK instance), which is exactly when exiting is the right thing
to do.
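
A minimal sketch of the decision described above, written as plain Java rather than actual ResourceManager code. The class, enum, and method names are hypothetical, and the choice is only returned as a value here instead of terminating the process, so it can be inspected safely.

```java
// Hypothetical illustration; none of these names come from the Hadoop code base.
class ForcedStandbyPolicy {

    /** What the RM does after the state store reports that it has been fenced. */
    enum Action { TRANSITION_TO_STANDBY, EXIT }

    /**
     * Chooses the reaction to a forced transition to standby.
     *
     * With HA enabled, another RM can take over, so standby is a valid state.
     * Without HA, there is no peer to fail over to, so the sketch follows the
     * comment above and exits rather than risk corrupting the state store.
     */
    static Action onForcedStandby(boolean haEnabled) {
        return haEnabled ? Action.TRANSITION_TO_STANDBY : Action.EXIT;
    }
}
```

Returning an enum instead of calling an exit routine directly is a choice made for this sketch only; it keeps the rule testable without stopping the JVM.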
> ZKRMStateStore should only start its verification thread when HA failover
> is not embedded
> --------------------------------------------------------------------------------------------
>
> Key: YARN-5694
> URL: https://issues.apache.org/jira/browse/YARN-5694
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 3.0.0-alpha1
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Attachments: YARN-5694.001.patch, YARN-5694.002.patch,
> YARN-5694.003.patch, YARN-5694.004.patch, YARN-5694.branch-2.7.001.patch,
> YARN-5694.branch-2.7.002.patch, YARN-5694.branch-2.7.003.patch
>
>
> There are two cases. In branch-2.7, the
> {{ZKRMStateStore.VerifyActiveStatusThread}} is always started, even when
> using embedded or Curator failover. In branch-2.8, the
> {{ZKRMStateStore.VerifyActiveStatusThread}} is only started when HA is
> disabled, which makes no sense. Based on the JIRA that introduced that
> change (YARN-4559), I believe the intent was to start it only when embedded
> failover is disabled.
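
The three start conditions discussed in the description can be compared side by side. This is an illustrative sketch only: the class and method names are hypothetical, the two boolean flags stand in for the RM's configuration, and the "intended" rule (HA enabled and failover not embedded) is one reading of the description, not code taken from any of the attached patches.

```java
// Hypothetical illustration; these methods do not exist in ZKRMStateStore.
class VerifyThreadStartRules {

    /** branch-2.7 as described: the thread is always started. */
    static boolean branch27(boolean haEnabled, boolean embeddedFailover) {
        return true;
    }

    /** branch-2.8 as described: the thread is started only when HA is disabled. */
    static boolean branch28(boolean haEnabled, boolean embeddedFailover) {
        return !haEnabled;
    }

    /**
     * Assumed intent: start the thread only when HA is enabled and
     * failover is not handled by the embedded elector.
     */
    static boolean intended(boolean haEnabled, boolean embeddedFailover) {
        return haEnabled && !embeddedFailover;
    }
}
```

Under this reading, branch-2.8 and the intended rule disagree on every configuration in which HA is enabled without embedded failover, which is the case the issue is concerned with.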