[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162875#comment-14162875 ]
Karthik Kambatla commented on YARN-1879:
----------------------------------------
Thanks for the updates, Tsuyoshi. Sorry to vacillate on this JIRA.
Chatted with Jian offline about marking the methods in question Idempotent vs
AtMostOnce. I believe we agreed on the following: "Methods that identify
duplicate requests and ignore them should be AtMostOnce, and those that can
be repeated without any adverse side-effects should be Idempotent. Retries on
RM restart/failover are the same for Idempotent and AtMostOnce methods."
Following that, registerApplicationMaster should be Idempotent, and allocate
and finishApplicationMaster should be AtMostOnce. Given that all the methods
handle duplicate requests, the retry-cache is not necessary, but it could be
an optimization we can pursue/investigate on another JIRA.
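To make the distinction concrete, here is a rough, self-contained sketch. It is not the actual YARN code: the two annotations are local stand-ins that only mirror the names under discussion, and ToyMasterService, its fields, and its response-id check are hypothetical, chosen just to show "repeat safely" versus "detect and ignore the duplicate".

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

class RetrySemanticsSketch {

    // Local stand-ins that mirror the annotation names; NOT the Hadoop classes.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Idempotent {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface AtMostOnce {}

    /** Hypothetical toy service; names and behavior are assumptions. */
    static class ToyMasterService {
        private final Map<String, String> trackingUrls = new HashMap<>();
        private int lastResponseId = 0;
        private String lastResponse = "response-0";
        int allocationsPerformed = 0;

        // Repeating the call just rewrites the same state, so a retry
        // has no adverse side effect.
        @Idempotent
        String register(String attemptId, String trackingUrl) {
            trackingUrls.put(attemptId, trackingUrl);
            return "registered:" + attemptId;
        }

        // The caller passes the id of the last response it saw. A request that
        // is one behind the server is a duplicate: it is identified and
        // ignored, and the previous response is returned without redoing work.
        @AtMostOnce
        String allocate(int responseId) {
            if (responseId == lastResponseId - 1) {
                return lastResponse;
            }
            if (responseId != lastResponseId) {
                throw new IllegalStateException("unexpected response id " + responseId);
            }
            allocationsPerformed++;
            lastResponseId++;
            lastResponse = "response-" + lastResponseId;
            return lastResponse;
        }
    }

    public static void main(String[] args) {
        ToyMasterService service = new ToyMasterService();
        // A retried register call is harmless.
        service.register("attempt_1", "http://example.invalid/am");
        service.register("attempt_1", "http://example.invalid/am");
        // A retried allocate call gets the cached response back.
        String first = service.allocate(0);
        String retried = service.allocate(0);
        System.out.println(first.equals(retried)
                + " allocations=" + service.allocationsPerformed);
    }
}
```

In both cases the client-side retry looks the same; only the server-side handling of the repeated call differs.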
Review comments on the patch:
# Nit (not introduced by this patch): can we rename
TestApplicationMasterServiceProtocolOnHA#initiate to initialize()?
# IIUC, the tests (ProtocolHATestBase) induce a failover while processing one
of these requests. We should probably also add a test that makes duplicate
requests to the same/different RM and verifies that the behavior is as
expected. Correct me if the existing tests already do this.
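The kind of duplicate-request test suggested above might look roughly like the following. This is only a toy model under stated assumptions: SharedState, ToyRM, and unregisterCount are hypothetical names, and the two ToyRM instances merely stand in for an RM and the RM that takes over after failover. It does not use ProtocolHATestBase or any real YARN API.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateRequestSketch {

    /** Hypothetical stand-in for state that survives an RM failover. */
    static class SharedState {
        final Set<String> finishedAttempts = new HashSet<>();
        int unregisterCount = 0;
    }

    /** Hypothetical toy RM; not the real ResourceManager. */
    static class ToyRM {
        private final SharedState state;

        ToyRM(SharedState state) {
            this.state = state;
        }

        // Always reports success, but only the first request for an attempt
        // performs the unregistration; duplicates are recognized and ignored.
        boolean finishApplicationMaster(String attemptId) {
            if (state.finishedAttempts.add(attemptId)) {
                state.unregisterCount++;
            }
            return true;
        }
    }

    public static void main(String[] args) {
        SharedState state = new SharedState();
        ToyRM rm1 = new ToyRM(state);
        ToyRM rm2 = new ToyRM(state); // plays the RM that is active after failover

        boolean first = rm1.finishApplicationMaster("attempt_1");
        boolean sameRm = rm1.finishApplicationMaster("attempt_1");   // duplicate, same RM
        boolean otherRm = rm2.finishApplicationMaster("attempt_1");  // duplicate, different RM

        if (!(first && sameRm && otherRm) || state.unregisterCount != 1) {
            throw new AssertionError("duplicate request was not handled as expected");
        }
        System.out.println("unregisterCount=" + state.unregisterCount);
    }
}
```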
> Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM
> fail over
> ------------------------------------------------------------------------------------
>
> Key: YARN-1879
> URL: https://issues.apache.org/jira/browse/YARN-1879
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Tsuyoshi OZAWA
> Priority: Critical
> Attachments: YARN-1879.1.patch, YARN-1879.1.patch,
> YARN-1879.11.patch, YARN-1879.12.patch, YARN-1879.13.patch,
> YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch,
> YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch,
> YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch,
> YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch,
> YARN-1879.23.patch, YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch,
> YARN-1879.6.patch, YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch
>
>