[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758096#comment-13758096 ]
Jason Lowe commented on YARN-540:
---------------------------------
Yes, I realize that 1) and 2) accomplish the same thing at a high level.
However, 2) requires cooperation from the AM, which is user code and therefore
harder to control, while 1) does not. The issue of RPC threads getting blocked
may necessitate 2), but otherwise 1) would be preferable since it requires less
cooperation/coordination with the AMs.
> Race condition causing RM to potentially relaunch already unregistered AMs on
> RM restart
> ----------------------------------------------------------------------------------------
>
> Key: YARN-540
> URL: https://issues.apache.org/jira/browse/YARN-540
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.patch,
> YARN-540.patch
>
>
> When a job succeeds and successfully calls finishApplicationMaster, and the RM
> then shuts down and restarts, the dispatcher is stopped before it can process
> the REMOVE_APP event. The next time the RM comes back, it reloads the existing
> state files even though the job has succeeded.
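
For illustration only, the race in the description above can be sketched as a
toy model. This is not RM code: the class RmRestartRaceSketch, its state store,
and its event queue are hypothetical stand-ins, and the only names taken from
the issue are finishApplicationMaster and REMOVE_APP. The assumption is that
the unregister call returns before the queued removal is applied to the
persisted state.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the RM; none of these names are real YARN classes.
class RmRestartRaceSketch {
    // Persisted application ids; this survives a simulated restart.
    private final Set<String> stateStore = new TreeSet<>();
    // REMOVE_APP events waiting for the asynchronous dispatcher.
    private final Queue<String> pendingRemoveApp = new ArrayDeque<>();

    void submit(String appId) {
        stateStore.add(appId);
    }

    // The AM unregisters: the call returns at once, the removal is only queued.
    void finishApplicationMaster(String appId) {
        pendingRemoveApp.add(appId);
    }

    // What the dispatcher would do if it got to run before shutdown.
    void drainDispatcher() {
        String appId;
        while ((appId = pendingRemoveApp.poll()) != null) {
            stateStore.remove(appId);
        }
    }

    // Shutdown stops the dispatcher, so queued events are dropped; the
    // restarted RM then recovers whatever is still in the state store.
    Set<String> restartAndRecover() {
        pendingRemoveApp.clear();
        return new TreeSet<>(stateStore);
    }

    public static void main(String[] args) {
        // Restart wins the race: the finished app is still in the store.
        RmRestartRaceSketch racy = new RmRestartRaceSketch();
        racy.submit("app_1");
        racy.finishApplicationMaster("app_1");
        System.out.println("restart before dispatch: " + racy.restartAndRecover());

        // Dispatcher wins the race: nothing is left to recover.
        RmRestartRaceSketch drained = new RmRestartRaceSketch();
        drained.submit("app_1");
        drained.finishApplicationMaster("app_1");
        drained.drainDispatcher();
        System.out.println("restart after dispatch: " + drained.restartAndRecover());
    }
}
```

In the first case the recovered set still contains the finished application,
which is the state that would lead to a relaunch; in the second case the
removal was applied before the restart and nothing is recovered.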