[ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karthik Kambatla updated YARN-1222:
-----------------------------------
Attachment: yarn-1222-6.patch
Here is an updated patch that:
# Creates a new event type for failed store operations.
# Has RMDispatcher handle these failed-store-op events: it transitions the RM to
standby on a fenced exception and shuts the RM down otherwise.
# Marks the VisibleForTesting methods in ZKRMStateStore @Private @Unstable.
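To make the second item concrete, here is a minimal sketch of the dispatch decision it describes. Every name below (StoreFencedException, FailedStoreOpEvent, FailedStoreOpHandler, RMAction) is a hypothetical stand-in chosen for illustration and is not taken from the patch; the real handler would trigger the transition or shutdown rather than return a value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception raised when the store detects
// that this RM has been fenced out by another RM.
class StoreFencedException extends Exception {
    StoreFencedException(String message) {
        super(message);
    }
}

// The two outcomes described in the list above.
enum RMAction { TRANSITION_TO_STANDBY, SHUTDOWN }

// Hypothetical event carrying the cause of a failed store operation.
class FailedStoreOpEvent {
    final Exception cause;

    FailedStoreOpEvent(Exception cause) {
        this.cause = cause;
    }
}

// Hypothetical handler: picks an action from the cause of the failure.
class FailedStoreOpHandler {
    // Record of decisions taken, kept only so the sketch is observable.
    final List<RMAction> actions = new ArrayList<>();

    RMAction handle(FailedStoreOpEvent event) {
        // Fenced: another RM owns the store, so step down to standby.
        // Anything else: the store is unusable, so shut the RM down.
        RMAction action = (event.cause instanceof StoreFencedException)
                ? RMAction.TRANSITION_TO_STANDBY
                : RMAction.SHUTDOWN;
        actions.add(action);
        return action;
    }
}
```

Returning the action instead of acting on it keeps the sketch self-contained; it only illustrates the branching on the exception type.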
Pending:
# Documentation in yarn-default.xml
# Manual testing on a real cluster
# Create a JIRA to change RMStateStore#notifyDone* methods to not take an
Exception
[~bikassaha] - please take a look when you get a chance. I'll address any
feedback in the next patch. Thanks.
> Make improvements in ZKRMStateStore for fencing
> -----------------------------------------------
>
> Key: YARN-1222
> URL: https://issues.apache.org/jira/browse/YARN-1222
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Karthik Kambatla
> Attachments: yarn-1222-1.patch, yarn-1222-2.patch, yarn-1222-3.patch,
> yarn-1222-4.patch, yarn-1222-5.patch, yarn-1222-6.patch
>
>
> Use multi-operations for every ZK interaction.
> In every operation, automatically create and delete a lock znode that is a
> child of the root znode. This achieves fencing by modifying the
> create/delete permissions on the root znode.
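The fencing idea in the description above can be sketched with a toy in-memory model. This is not the ZooKeeper API and not code from the patch: ToyStore, fenceOthers, multiWrite, and the paths are all hypothetical, and the model assumes that a multi-operation is all-or-nothing and that only one RM at a time holds create/delete permission on the root znode.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy in-memory stand-in for the state store (hypothetical, for illustration).
class ToyStore {
    private static final String LOCK_PATH = "/root/lock";

    // RM ids currently allowed to create/delete children of the root znode.
    private final Set<String> rootCreateDelete = new HashSet<>();
    private final Map<String, String> znodes = new HashMap<>();

    // An RM becoming active claims exclusive create/delete permission on
    // the root znode, which fences out every other RM.
    void fenceOthers(String rmId) {
        rootCreateDelete.clear();
        rootCreateDelete.add(rmId);
    }

    // Every write is one all-or-nothing batch:
    // create the lock znode, apply the writes, delete the lock znode.
    boolean multiWrite(String rmId, Map<String, String> writes) {
        if (!rootCreateDelete.contains(rmId)) {
            // Creating the lock znode is not permitted, so the whole batch
            // is rejected and none of the writes take effect.
            return false;
        }
        Map<String, String> staged = new HashMap<>(znodes);
        staged.put(LOCK_PATH, rmId);   // create lock znode
        staged.putAll(writes);         // the real store operations
        staged.remove(LOCK_PATH);      // delete lock znode
        znodes.clear();
        znodes.putAll(staged);         // commit the batch as a unit
        return true;
    }

    String read(String path) {
        return znodes.get(path);
    }
}
```

In this model a fenced RM fails at the lock-creation step, so its stale writes never reach the store even though it never checks its own active or standby status.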