[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173346#comment-14173346 ]
Tsuyoshi OZAWA commented on YARN-1879:
--------------------------------------
Talked with Jian offline.
{quote}
In this case, token is expired for the application after finishing AM's
container and I think we don't need to handle it.
{quote}
I wanted to confirm whether finishApplicationMaster() can be issued after the
AM's container exits. There is no such case, but finishApplicationMaster() can
be issued after the RM has removed the AM's entry, in the following case:
1. RM1 saves the app in RMStateStore and then crashes.
2. FinishApplicationMasterResponse#isRegistered still returns false.
3. The AM still retries against the 2nd RM.
Thanks very much for clarifying, Jian. Attached an updated patch, which includes
a test for a retried finishApplicationMaster and a test for a retried
registerApplicationMaster before and after RM restart.
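
To illustrate the retry case in steps 1-3 above, here is a minimal, self-contained sketch. It does not use the real YARN classes: StateStore, ResourceManager, and runScenario are hypothetical names invented for this example, and the crash is simulated with an exception. It only shows why a retried finishApplicationMaster has to succeed on the 2nd RM even though that RM holds no entry for the AM.

```java
import java.util.HashSet;
import java.util.Set;

final class IdempotentFinishSketch {

    /** Hypothetical stand-in for RMStateStore: its contents survive an RM crash. */
    static final class StateStore {
        private final Set<String> finishedApps = new HashSet<>();

        void saveFinished(String appId) { finishedApps.add(appId); }

        boolean isFinished(String appId) { return finishedApps.contains(appId); }
    }

    /** Hypothetical RM: in-memory AM entries plus a store shared across fail over. */
    static final class ResourceManager {
        private final StateStore store;
        private final Set<String> amEntries = new HashSet<>();
        private final boolean crashAfterSave;

        ResourceManager(StateStore store, boolean crashAfterSave) {
            this.store = store;
            this.crashAfterSave = crashAfterSave;
        }

        void registerApplicationMaster(String appId) { amEntries.add(appId); }

        /** Returns true once the application is unregistered; safe to call again. */
        boolean finishApplicationMaster(String appId) {
            if (store.isFinished(appId)) {
                // Retried call: the final state was already saved, possibly by
                // another RM, so the AM entry may be missing here. Just confirm.
                amEntries.remove(appId);
                return true;
            }
            if (!amEntries.contains(appId)) {
                throw new IllegalArgumentException("unknown application: " + appId);
            }
            store.saveFinished(appId);
            if (crashAfterSave) {
                // Step 1: the state is saved, but the reply never reaches the AM.
                throw new IllegalStateException("RM crashed before replying");
            }
            amEntries.remove(appId);
            return true;
        }
    }

    /** Plays steps 1-3 and reports whether the AM ends up unregistered. */
    static boolean runScenario() {
        StateStore store = new StateStore();
        ResourceManager rm1 = new ResourceManager(store, true);
        rm1.registerApplicationMaster("app_1");

        boolean unregistered = false;
        try {
            unregistered = rm1.finishApplicationMaster("app_1");
        } catch (IllegalStateException crashed) {
            // Step 2: from the AM's point of view the app is not yet unregistered.
        }
        if (!unregistered) {
            // Step 3: the AM retries against the 2nd RM, which has no AM entry.
            ResourceManager rm2 = new ResourceManager(store, false);
            unregistered = rm2.finishApplicationMaster("app_1");
        }
        return unregistered;
    }

    public static void main(String[] args) {
        System.out.println("unregistered after retry: " + runScenario());
    }
}
```

In this sketch the saved final state, rather than the in-memory AM entry, decides whether the call succeeds, which is what makes the retry harmless.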
> Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM
> fail over
> ------------------------------------------------------------------------------------
>
> Key: YARN-1879
> URL: https://issues.apache.org/jira/browse/YARN-1879
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Tsuyoshi OZAWA
> Priority: Critical
> Attachments: YARN-1879.1.patch, YARN-1879.1.patch,
> YARN-1879.11.patch, YARN-1879.12.patch, YARN-1879.13.patch,
> YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch,
> YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch,
> YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch,
> YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch,
> YARN-1879.23.patch, YARN-1879.24.patch, YARN-1879.25.patch,
> YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch, YARN-1879.6.patch,
> YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch
>
>