[ 
https://issues.apache.org/jira/browse/YARN-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1776:
------------------------------

    Attachment: YARN-1776.1.patch

I created a patch:

1. Add updateRMDelegationTokenAndSequenceNumber to RMStateStore.

2. For MemoryRMStateStore, we don't need to make the method atomic, as the 
in-memory state is lost when the RM fails anyway. Therefore, it is just a simple 
wrapper around storeRMDelegationTokenAndSequenceNumber and removeRMDelegationToken.

3. For ZKRMStateStore, I use an opList to group the delete and store 
operations together, so that either all of the operations succeed or none do.

4. For FileSystemRMStateStore, the case is harder: since we're touching more 
than a single file, it's hard to make all or none of the fs operations succeed. 
Therefore, I handle it the same way as MemoryRMStateStore. Note that 
storeRMDelegationTokenAndSequenceNumber itself is not atomic either.

The good news is that RM failover is supposed to work with the ZK 
implementation, so hopefully this is still OK. Thoughts?

5. RMDelegationTokenSecretManager#updateStoredToken now calls 
updateRMDelegationTokenAndSequenceNumber.

6. Add a test for updateRMDelegationTokenAndSequenceNumber.
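
To illustrate the difference between the remove-then-store wrapper (items 2 
and 4) and the grouped operations (item 3), here is a minimal, self-contained 
sketch. It is not taken from the patch and does not use the real RMStateStore 
or ZooKeeper APIs: the class AtomicUpdateSketch, the Store type, and all 
method names are hypothetical, and the "failure" is simulated with an exception.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class AtomicUpdateSketch {

    /** Hypothetical stand-in for a state store: token id -> renew date. */
    static class Store {
        Map<String, Long> tokens = new HashMap<>();

        /** Remove-then-store wrapper: a failure in between loses the token. */
        void updateNonAtomic(String id, long renewDate, boolean failInBetween) {
            tokens.remove(id);
            if (failInBetween) {
                throw new IllegalStateException("simulated RM failure");
            }
            tokens.put(id, renewDate);
        }

        /**
         * All-or-nothing update: the ops run against a copy, and the copy
         * replaces the live state only if every op completed.
         */
        void applyAll(List<Consumer<Map<String, Long>>> ops) {
            Map<String, Long> copy = new HashMap<>(tokens);
            for (Consumer<Map<String, Long>> op : ops) {
                op.accept(copy); // may throw, leaving 'tokens' untouched
            }
            tokens = copy;
        }
    }

    /** Returns whether the token is still stored after a failed non-atomic update. */
    static boolean tokenSurvivesNonAtomicFailure() {
        Store store = new Store();
        store.tokens.put("dt-1", 100L);
        try {
            store.updateNonAtomic("dt-1", 200L, true);
        } catch (IllegalStateException expected) {
            // the old token was already removed before the failure
        }
        return store.tokens.containsKey("dt-1");
    }

    /** Returns whether the token is still stored after a failed grouped update. */
    static boolean tokenSurvivesAtomicFailure() {
        Store store = new Store();
        store.tokens.put("dt-1", 100L);
        List<Consumer<Map<String, Long>>> ops = new ArrayList<>();
        ops.add(m -> m.remove("dt-1"));
        ops.add(m -> { throw new IllegalStateException("simulated RM failure"); });
        ops.add(m -> m.put("dt-1", 200L));
        try {
            store.applyAll(ops);
        } catch (IllegalStateException expected) {
            // none of the grouped ops were applied to the live state
        }
        return store.tokens.containsKey("dt-1");
    }

    public static void main(String[] args) {
        System.out.println("non-atomic: token survives = " + tokenSurvivesNonAtomicFailure());
        System.out.println("grouped:    token survives = " + tokenSurvivesAtomicFailure());
    }
}
```

In this sketch the non-atomic path loses the old token when the failure lands 
between the two steps, while the grouped path keeps it. The copy-and-swap 
here only mimics the all-or-nothing behaviour; the real ZK store would rely 
on ZooKeeper to apply the grouped operations as a unit.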

> renewDelegationToken should survive RM failover
> -----------------------------------------------
>
>                 Key: YARN-1776
>                 URL: https://issues.apache.org/jira/browse/YARN-1776
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-1776.1.patch
>
>
> When a delegation token is renewed, two RMStateStore operations happen: 1) 
> removing the old DT, and 2) storing the new DT. If the RM fails in between, 
> there would be a problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
