[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764287#comment-13764287 ]
Jason Lowe commented on YARN-540:
---------------------------------
bq. The solution is to not report success to user until services have stopped.
Note that delaying the report of success to downstream consumers isn't always
possible, because success can be reported by means other than JobClient.
For example, the _SUCCESS file written as part of FileOutputCommitter's commit
processing indicates to others that the job succeeded. IIRC Oozie can poll for
this as part of determining whether a job succeeded. I suspect other
committers have their own methods of notifying downstream consumers that the
job succeeded. And we shouldn't be unregistering from the RM before committing.
As such I think there will always be races where the YARN and MR app states can
end up inconsistent because a job could notify others of success and then fail
before it can notify YARN. We may still want to delay reporting success to
JobClient, but I don't think it completely solves the issue.
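
To illustrate the ordering described above, here is a minimal sketch of the
race. It does not use any Hadoop or YARN API: run_job, downstream_sees_success,
and the "FAILED"/"SUCCEEDED" strings are hypothetical names for a simplified
model, and the only detail taken from the comment is that a _SUCCESS file in
the output directory signals success to other consumers.

```python
import tempfile
from pathlib import Path


def run_job(output_dir: Path, fail_before_unregister: bool) -> str:
    """Hypothetical model of an AM finishing a job.

    Returns the final state YARN would record in this simplified model.
    """
    # Step 1: commit the output. The _SUCCESS marker is visible to
    # downstream consumers as soon as the file exists.
    (output_dir / "_SUCCESS").touch()

    # Step 2: the AM may die here, after the commit but before it has
    # told the RM that the job succeeded.
    if fail_before_unregister:
        return "FAILED"

    # Step 3: unregister from the RM and report success.
    return "SUCCEEDED"


def downstream_sees_success(output_dir: Path) -> bool:
    """Hypothetical downstream check that polls for the _SUCCESS marker."""
    return (output_dir / "_SUCCESS").exists()


def demo() -> tuple:
    """Run the failure case and return (yarn_state, marker_seen)."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        yarn_state = run_job(out, fail_before_unregister=True)
        marker_seen = downstream_sees_success(out)
        return yarn_state, marker_seen


if __name__ == "__main__":
    print(demo())
```

In this model the downstream check observes success while the recorded YARN
state is a failure. Delaying the report to JobClient does not change that,
since the marker file is written during the commit itself.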
> Race condition causing RM to potentially relaunch already unregistered AMs on
> RM restart
> ----------------------------------------------------------------------------------------
>
> Key: YARN-540
> URL: https://issues.apache.org/jira/browse/YARN-540
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch,
> YARN-540.4.patch, YARN-540.5.patch, YARN-540.6.patch, YARN-540.patch,
> YARN-540.patch
>
>
> When a job succeeds and successfully calls finishApplicationMaster, the RM may
> shut down and restart, with the dispatcher stopped before it can process the
> REMOVE_APP event. The next time the RM comes back, it reloads the existing
> state files even though the job has already succeeded.