[ https://issues.apache.org/jira/browse/YARN-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe resolved YARN-929.
-----------------------------
Resolution: Duplicate
This is an issue with the MRAppMaster, currently tracked by MAPREDUCE-5396.
> 2 MRAppMaster running parallely for same Job Id
> -----------------------------------------------
>
> Key: YARN-929
> URL: https://issues.apache.org/jira/browse/YARN-929
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.0.5-alpha
> Reporter: rohithsharma
>
> Configuration:
> yarn.resourcemanager.am.max-retries = 3
> Scenario:
> The NodeManager is killed forcefully, i.e. using kill -9 NM_PID.
> After node expiry, the RM kills all the containers running on this
> NodeManager.
> However, the MRAppMaster JVM is still running.
> The RM spawns a 2nd MRAppMaster attempt, since AM retries are configured as 3.
> At this point, 2 MRAppMasters are running in parallel for the same job ID.
> The problem with running 2 MRAppMasters is that the 1st-attempt AppMaster
> deletes the job information from HDFS, which causes a FileNotFoundException
> for the 2nd-attempt MRAppMaster.
>
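
The failure mode described in the report can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: the directory name job_0001, the file job.xml, and the functions run_attempt_cleanup, run_attempt_read and simulate are all hypothetical, and a local temporary directory stands in for HDFS. It shows one attempt removing shared job files that a second attempt then tries to read.

```python
import shutil
import tempfile
from pathlib import Path


def run_attempt_cleanup(staging_dir: Path) -> None:
    """Stand-in for the 1st attempt, which is still alive and deletes the job files."""
    shutil.rmtree(staging_dir)


def run_attempt_read(staging_dir: Path) -> str:
    """Stand-in for the 2nd attempt, which expects the job files to exist."""
    return (staging_dir / "job.xml").read_text()


def simulate() -> str:
    """Run both stand-in attempts against one shared directory and report the outcome."""
    root = Path(tempfile.mkdtemp())
    staging = root / "job_0001"  # hypothetical shared job directory
    staging.mkdir()
    (staging / "job.xml").write_text("<configuration/>")
    try:
        run_attempt_cleanup(staging)  # 1st attempt removes the job information
        run_attempt_read(staging)     # 2nd attempt then fails to find it
        return "ok"
    except FileNotFoundError:
        return "FileNotFoundError"
    finally:
        # Remove the temporary root regardless of the outcome.
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print(simulate())
```

In this model the two steps run one after the other; in the reported scenario the two AppMasters run concurrently, so the exact timing of the deletion relative to the read would vary.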