[ https://issues.apache.org/jira/browse/YARN-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhijie Shen updated YARN-1776:
------------------------------
Attachment: YARN-1776.1.patch
I created a patch:
1. Add updateRMDelegationTokenAndSequenceNumber to RMStateStore.
2. For MemoryRMStateStore, we don't need to make the method atomic, because
the in-memory state is lost anyway when the RM fails. Therefore, it is just a
simple wrapper around storeRMDelegationTokenAndSequenceNumber and
removeRMDelegationToken.
3. For ZKRMStateStore, I make use of opList to group the delete and store
operations together, so that either all of them succeed or none do.
4. FileSystemRMStateStore is the difficult case: since we're touching more
than a single file, it's hard to make all or none of the fs operations succeed.
Therefore, I left it the same as what I've done for MemoryRMStateStore. Note
that storeRMDelegationTokenAndSequenceNumber itself is not atomic either.
The good thing is that RM failover is supposed to work with the ZK impl, so
hopefully this is still OK. Thoughts?
5. RMDelegationTokenSecretManager#updateStoredToken now calls
updateRMDelegationTokenAndSequenceNumber.
6. Add a test for updateRMDelegationTokenAndSequenceNumber.
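
To make the difference between points 2/4 and point 3 concrete, here is a
minimal, self-contained sketch. It is not the code in the patch and does not
use the real RMStateStore or ZooKeeper classes: TokenStoreSketch, Op, multi,
updateNonAtomic and updateAtomic are hypothetical names, and the map of token
key to renew date is an assumed stand-in for the persisted state. The multi
method only imitates the all-or-nothing behaviour that grouping operations in
an opList is meant to give.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a delegation token store. */
class TokenStoreSketch {

    /** One pending operation: a delete (renewDate == null) or a store. */
    static final class Op {
        final String key;
        final Long renewDate;

        private Op(String key, Long renewDate) {
            this.key = key;
            this.renewDate = renewDate;
        }

        static Op delete(String key) {
            return new Op(key, null);
        }

        static Op store(String key, long renewDate) {
            return new Op(key, renewDate);
        }
    }

    private Map<String, Long> tokens = new HashMap<>();

    /** Applies all ops to a scratch copy and commits only if every op succeeds. */
    void multi(List<Op> ops) {
        Map<String, Long> scratch = new HashMap<>(tokens);
        for (Op op : ops) {
            if (op.renewDate == null) {
                // Deleting a token that is not stored is an error.
                if (scratch.remove(op.key) == null) {
                    throw new IllegalStateException("no such token: " + op.key);
                }
            } else if (scratch.putIfAbsent(op.key, op.renewDate) != null) {
                // Storing over an existing token is an error.
                throw new IllegalStateException("token exists: " + op.key);
            }
        }
        tokens = scratch; // commit point: nothing is visible before this line
    }

    /** Stores a new token as a single operation. */
    void storeNew(String key, long renewDate) {
        multi(Collections.singletonList(Op.store(key, renewDate)));
    }

    /** Memory/FileSystem style: two separate steps, with an optional simulated failure in between. */
    void updateNonAtomic(String key, long newRenewDate, boolean failBetween) {
        multi(Collections.singletonList(Op.delete(key)));
        if (failBetween) {
            throw new IllegalStateException("simulated RM failure");
        }
        multi(Collections.singletonList(Op.store(key, newRenewDate)));
    }

    /** ZK style: delete and store are grouped, so both happen or neither does. */
    void updateAtomic(String key, long newRenewDate) {
        multi(Arrays.asList(Op.delete(key), Op.store(key, newRenewDate)));
    }

    /** Returns the stored renew date, or null if the token is not stored. */
    Long renewDate(String key) {
        return tokens.get(key);
    }
}
```

In this sketch, a failure between the two steps of updateNonAtomic leaves the
token removed and never stored again, which is the window described in the
issue. With updateAtomic, a failing operation in the group leaves the
previously stored token untouched.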
> renewDelegationToken should survive RM failover
> -----------------------------------------------
>
> Key: YARN-1776
> URL: https://issues.apache.org/jira/browse/YARN-1776
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhijie Shen
> Assignee: Zhijie Shen
> Attachments: YARN-1776.1.patch
>
>
> When a delegation token is renewed, two RMStateStore operations happen:
> 1) removing the old DT, and 2) storing the new DT. If the RM fails in
> between, there will be a problem.
--
This message was sent by Atlassian JIRA
(v6.2#6252)