[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759268#comment-13759268 ]
Jian He commented on YARN-540:
------------------------------
bq. Is that behavior change being implemented in the YARN API layer?
IMHO, for work-preserving restart, after the RM comes back it should be able to
accept the old AM as normal, instead of asking the AM to reboot or having the NM
kill the AM container (which is what currently happens). On the RM side, the
AM's unregistration then proceeds like any normal unregistration, even though
the RM had restarted.
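To make the proposal concrete, here is a minimal sketch of the two behaviours. It is a toy model, not YARN code: the class, the enum, and the attempt ID are all hypothetical, and the REBOOT response is a simplification of what the RM does today.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model; none of these names are real YARN classes.
class WorkPreservingUnregisterSketch {

    enum Response { ACCEPTED, REBOOT }

    /** Stands in for one RM process lifetime. */
    static final class ResourceManager {
        private final Set<String> knownAttempts = new HashSet<>();

        /**
         * @param persistedAttempts attempts found in the state store at startup
         * @param workPreserving    if true, the restarted RM keeps treating them as live
         */
        ResourceManager(Set<String> persistedAttempts, boolean workPreserving) {
            if (workPreserving) {
                knownAttempts.addAll(persistedAttempts);
            }
        }

        /** An AM asks to unregister; unknown attempts are told to reboot. */
        Response unregister(String attemptId) {
            return knownAttempts.remove(attemptId) ? Response.ACCEPTED : Response.REBOOT;
        }
    }

    /** An AM that registered before the restart unregisters with the new RM. */
    static Response unregisterAfterRestart(boolean workPreserving) {
        Set<String> persisted = new HashSet<>();
        persisted.add("attempt_1"); // written by the previous RM instance
        ResourceManager restarted = new ResourceManager(persisted, workPreserving);
        return restarted.unregister("attempt_1");
    }

    public static void main(String[] args) {
        System.out.println("work-preserving: " + unregisterAfterRestart(true));
        System.out.println("current behaviour (simplified): " + unregisterAfterRestart(false));
    }
}
```

In the work-preserving case the old AM's unregister call is handled exactly like one made to an RM that never restarted, which is the point of the suggestion above.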
> Race condition causing RM to potentially relaunch already unregistered AMs on
> RM restart
> ----------------------------------------------------------------------------------------
>
> Key: YARN-540
> URL: https://issues.apache.org/jira/browse/YARN-540
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch,
> YARN-540.patch, YARN-540.patch
>
>
> When a job succeeds and successfully calls finishApplicationMaster, the RM may
> shut down and restart, with the dispatcher stopped before it can process the
> REMOVE_APP event. The next time the RM comes back, it reloads the existing
> state files even though the job has already succeeded.
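The race in the description can be reproduced in a small simulation. This is only a sketch under simplifying assumptions: the state store is a set of app IDs, the dispatcher is a queue drained by hand, and every name here is hypothetical rather than taken from the YARN code base or the attached patches.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the race; none of these names are real YARN classes.
class RmRestartRaceSketch {

    /** Stands in for the persisted state files that survive an RM restart. */
    static final class StateStore {
        final Set<String> storedApps = new HashSet<>();
    }

    /** Stands in for one RM process lifetime. */
    static final class ResourceManager {
        private final StateStore store;
        private final Queue<String> pendingRemovals = new ArrayDeque<>();
        private boolean dispatcherRunning = true;

        ResourceManager(StateStore store) { this.store = store; }

        void submit(String appId) { store.storedApps.add(appId); }

        /** AM unregisters; removal from the store is only queued, not yet applied. */
        void finishApplicationMaster(String appId) { pendingRemovals.add(appId); }

        /** Dispatcher drains queued REMOVE_APP events while it is still running. */
        void drainDispatcher() {
            while (dispatcherRunning && !pendingRemovals.isEmpty()) {
                store.storedApps.remove(pendingRemovals.poll());
            }
        }

        /** Shutdown stops the dispatcher; queued events die with the process. */
        void shutdown() {
            dispatcherRunning = false;
            pendingRemovals.clear();
        }

        /** On restart, every app still in the store looks like it needs recovery. */
        Set<String> appsToRecover() { return new HashSet<>(store.storedApps); }
    }

    /** Returns the apps a restarted RM would try to recover. */
    static Set<String> simulate(boolean dispatcherDrainsBeforeShutdown) {
        StateStore store = new StateStore();
        ResourceManager rm = new ResourceManager(store);
        rm.submit("app_1");
        rm.finishApplicationMaster("app_1");
        if (dispatcherDrainsBeforeShutdown) {
            rm.drainDispatcher();
        }
        rm.shutdown();
        // A fresh RM instance reads whatever the old one left in the store.
        return new ResourceManager(store).appsToRecover();
    }

    public static void main(String[] args) {
        System.out.println("drained first:  " + simulate(true));
        System.out.println("shutdown first: " + simulate(false));
    }
}
```

When shutdown wins the race, app_1 is still in the store after the restart even though its AM already unregistered, so the new RM would treat it as an application to recover.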