rohithsharma created YARN-929:
---------------------------------
Summary: 2 MRAppMaster spawned for same Job Id
Key: YARN-929
URL: https://issues.apache.org/jira/browse/YARN-929
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.5-alpha
Reporter: rohithsharma
Configuration:
yarn.resourcemanager.am.max-retries = 3
Scenario:
The NodeManager is killed forcefully, i.e. using kill -9 NM_PID.
After the node expires, the RM kills all the containers running on this NodeManager.
However, the MRAppMaster JVM is still running.
The RM spawns a 2nd-attempt MRAppMaster, since the AM retry count is configured as 3.
The problem with running 2 MRAppMasters is that the 1st-attempt AppMaster deletes the job
information from HDFS, which causes a FileNotFoundException in the 2nd-attempt MRAppMaster.
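
The sequence above can be illustrated with a small toy model. This is only a sketch, not YARN or MapReduce code: the local temporary directory standing in for the job's HDFS location, the job.xml file name, and the functions start_attempt, cleanup_job_dir and simulate are all hypothetical names chosen for illustration. It shows a still-running first attempt removing shared job files so that the second attempt fails when it tries to read them.

```python
import os
import shutil
import tempfile


def start_attempt(job_dir, attempt_id):
    """Hypothetical stand-in for an AppMaster attempt reading its job files."""
    with open(os.path.join(job_dir, "job.xml")) as f:
        return f"attempt {attempt_id} read {f.read()!r}"


def cleanup_job_dir(job_dir):
    """Hypothetical stand-in for the 1st attempt deleting the job information."""
    shutil.rmtree(job_dir)


def simulate():
    # Assumption: a local temp directory plays the role of the job's HDFS directory.
    job_dir = tempfile.mkdtemp(prefix="job-")
    with open(os.path.join(job_dir, "job.xml"), "w") as f:
        f.write("job config")

    events = [start_attempt(job_dir, 1)]

    # The NodeManager is killed and the node expires; the RM schedules attempt 2.
    # The 1st-attempt JVM was never actually stopped, so it still cleans up.
    cleanup_job_dir(job_dir)

    try:
        events.append(start_attempt(job_dir, 2))
    except FileNotFoundError:
        # Python's analogue of the FileNotFoundException seen by the 2nd attempt.
        events.append("attempt 2 failed: FileNotFoundError")
    return events


if __name__ == "__main__":
    for event in simulate():
        print(event)
```

In this model the failure comes purely from ordering: nothing prevents the first attempt from deleting files that the second attempt depends on.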