[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764689#comment-13764689 ]
Jason Lowe commented on YARN-540:
---------------------------------

JobClient is the "standard" API. I don't mean to imply we shouldn't try to improve that situation, rather that there are many out-of-band notifications in use, and therefore fixing JobClient doesn't solve the problem in the general sense.

Job end notification (see mapreduce.job.end-notification.url) is another mechanism used to notify clients of job completion. Currently this is done before unregistering, but we could move it to after unregistering. The failure mode then changes: an AM that crashes after unregistering but before notifying could end up never notifying a client, because the RM would not retry. However, job end notification is currently best-effort and not guaranteed, and most frameworks I'm familiar with that use it have a polling fallback (via something like JobClient) in case the notification fails to arrive.

> Race condition causing RM to potentially relaunch already unregistered AMs on RM restart
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-540
>                 URL: https://issues.apache.org/jira/browse/YARN-540
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch, YARN-540.4.patch, YARN-540.5.patch, YARN-540.6.patch, YARN-540.patch, YARN-540.patch
>
>
> When a job succeeds and successfully calls finishApplicationMaster, and the RM then shuts down and restarts, the dispatcher is stopped before it can process the REMOVE_APP event. The next time the RM comes back, it reloads the existing state files even though the job has succeeded.
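
The client-side pattern mentioned in the comment above (wait for a best-effort job end notification, then fall back to polling if it never arrives) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `CompletionWaiter`, the method `awaitCompletion`, the state strings, and the timeout parameters are all hypothetical, and the `Supplier` stands in for whatever status query a real client would make (for example through JobClient). No Hadoop API is used here.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hadoop: waits for a pushed notification
// and falls back to polling when the notification does not arrive in time.
final class CompletionWaiter {

    static String awaitCompletion(
            CompletableFuture<String> notification,   // completed by a (hypothetical) notification listener
            Supplier<Optional<String>> pollStatus,    // assumed status query; empty while the job is still running
            long notifyTimeoutMs,
            long pollIntervalMs,
            int maxPolls) {
        try {
            // Preferred path: the best-effort notification arrived.
            return notification.get(notifyTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Notification lost or failed; fall through to polling.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "UNKNOWN";
        }

        // Fallback path: poll for the final state a bounded number of times.
        for (int i = 0; i < maxPolls; i++) {
            Optional<String> state = pollStatus.get();
            if (state.isPresent()) {
                return state.get();
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "UNKNOWN";
            }
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        // Simulate an AM that crashed before sending the notification:
        // the future is never completed, so the client ends up polling.
        CompletableFuture<String> lost = new CompletableFuture<>();
        String state = awaitCompletion(lost, () -> Optional.of("SUCCEEDED"), 100, 10, 5);
        System.out.println(state); // prints SUCCEEDED
    }
}
```

With a fallback like this in place, moving the notification to after unregistering would cost such a client only extra latency when the notification is lost, not a missed completion.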