[ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1222:
-----------------------------------

    Attachment: yarn-1222-6.patch

Here is an updated patch that:
# Creates a new event type for failed store operations.
# Has RMDispatcher handle these failed-store-op events: it transitions the RM
to standby on a fenced exception and shuts the RM down otherwise.
# Marks the VisibleForTesting methods in ZKRMStateStore as @Private @Unstable.
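
As a rough illustration of the second item, the dispatch decision could look
something like the sketch below. This is not code from the patch: the class,
event, and enum names (StoreFailureSketch, StoreOpFailedEvent, FencedException,
Action) are hypothetical stand-ins, and the real handler would act on the RM
rather than return a value.

```java
import java.io.IOException;

public class StoreFailureSketch {
    /** Hypothetical stand-in for the exception raised when this RM has been fenced. */
    static class FencedException extends Exception {
        FencedException(String message) { super(message); }
    }

    /** Hypothetical event that carries the cause of a failed store operation. */
    static class StoreOpFailedEvent {
        final Exception cause;
        StoreOpFailedEvent(Exception cause) { this.cause = cause; }
    }

    /** The two outcomes described in the patch summary. */
    enum Action { TRANSITION_TO_STANDBY, SHUTDOWN }

    /** Decide how to react to a failed store operation. */
    static Action handle(StoreOpFailedEvent event) {
        if (event.cause instanceof FencedException) {
            // Another RM has taken over the store: step down instead of dying.
            return Action.TRANSITION_TO_STANDBY;
        }
        // Any other store failure is treated as fatal for this RM.
        return Action.SHUTDOWN;
    }

    public static void main(String[] args) {
        System.out.println(handle(new StoreOpFailedEvent(new FencedException("fenced"))));
        System.out.println(handle(new StoreOpFailedEvent(new IOException("store unreachable"))));
    }
}
```

Returning an enum keeps the sketch easy to test in isolation; it says nothing
about how the patch itself wires the event into the dispatcher.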

Pending:
# Documentation in yarn-default.xml
# Manual testing on a real cluster
# Create a JIRA to change RMStateStore#notifyDone* methods to not take an 
Exception

[~bikassaha] - please take a look when you get a chance. I'll address any 
feedback in the next patch. Thanks.

> Make improvements in ZKRMStateStore for fencing
> -----------------------------------------------
>
>                 Key: YARN-1222
>                 URL: https://issues.apache.org/jira/browse/YARN-1222
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: yarn-1222-1.patch, yarn-1222-2.patch, yarn-1222-3.patch, 
> yarn-1222-4.patch, yarn-1222-5.patch, yarn-1222-6.patch
>
>
> Using multi-operations for every ZK interaction. 
> In every operation, automatically creating/deleting a lock znode that is the 
> child of the root znode. This is to achieve fencing by modifying the 
> create/delete permissions on the root znode.
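
The fencing idea in the issue description quoted above can be modeled with a
small in-memory toy. It deliberately does not use the ZooKeeper client API:
ToyStore, the "/rmstore" root path, the "lock" child name, and the single-owner
permission model are all assumptions made for illustration. Each write is
wrapped in "create lock child, write, delete lock child" and applied
all-or-nothing, so a caller that has lost create/delete rights on the root
cannot commit anything.

```java
import java.util.HashMap;
import java.util.Map;

public class FencingSketch {
    static final String ROOT = "/rmstore";      // hypothetical root znode path
    static final String LOCK = ROOT + "/lock";  // hypothetical lock child name

    /** Toy in-memory stand-in for the store; not the ZooKeeper API. */
    static class ToyStore {
        private Map<String, String> znodes = new HashMap<>();
        // Identity currently allowed to create/delete children of the root.
        private String rootOwner;

        ToyStore(String initialOwner) {
            rootOwner = initialOwner;
            znodes.put(ROOT, "");
        }

        /** A newly active RM fences the old one by taking over the root permissions. */
        synchronized void fence(String newOwner) {
            rootOwner = newOwner;
        }

        private void requireRootCreateDelete(String caller) {
            if (!caller.equals(rootOwner)) {
                throw new SecurityException(caller + " lacks create/delete on " + ROOT);
            }
        }

        /** Models one multi-operation: create lock, write data, delete lock. */
        synchronized boolean write(String caller, String path, String data) {
            Map<String, String> view = new HashMap<>(znodes); // scratch copy for atomicity
            try {
                requireRootCreateDelete(caller);
                view.put(LOCK, "");      // create the lock child
                view.put(path, data);    // the actual store update
                requireRootCreateDelete(caller);
                view.remove(LOCK);       // delete the lock child
            } catch (SecurityException e) {
                return false;            // nothing from this multi is committed
            }
            znodes = view;               // commit all steps together
            return true;
        }

        synchronized String read(String path) {
            return znodes.get(path);
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore("rm1");
        System.out.println(store.write("rm1", ROOT + "/app_1", "RUNNING"));
        store.fence("rm2"); // rm2 becomes active and fences rm1
        System.out.println(store.write("rm1", ROOT + "/app_1", "FINISHED"));
        System.out.println(store.read(ROOT + "/app_1"));
    }
}
```

In this toy, the second write from rm1 is rejected after rm2 fences it, and
the stored value stays at what rm1 wrote while it was still active.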



--
This message was sent by Atlassian JIRA
(v6.1#6144)
