Santosh Marella created YARN-2476:
-------------------------------------

             Summary: Apps are scheduled in random order after RM failover
                 Key: YARN-2476
                 URL: https://issues.apache.org/jira/browse/YARN-2476
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.4.1
         Environment: Linux
            Reporter: Santosh Marella
RM HA is configured with 2 RMs. Used FileSystemRMStateStore. The Fair Scheduler allocation file is configured in yarn-site.xml:

  <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>/opt/mapr/hadoop/hadoop-2.4.1/etc/hadoop/allocation-pools.xml</value>
  </property>

FS allocation-pools.xml:

<?xml version="1.0"?>
<allocations>
  <queue name="dev">
    <minResources>10000 mb,10vcores</minResources>
    <maxResources>19000 mb,100vcores</maxResources>
    <maxRunningApps>5525</maxRunningApps>
    <weight>4.5</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <fairSharePreemptionTimeout>3600</fairSharePreemptionTimeout>
  </queue>
  <queue name="default">
    <minResources>10000 mb,10vcores</minResources>
    <maxResources>19000 mb,100vcores</maxResources>
    <maxRunningApps>5525</maxRunningApps>
    <weight>1.5</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <fairSharePreemptionTimeout>3600</fairSharePreemptionTimeout>
  </queue>
  <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>

Submitted 10 sleep jobs to a FS queue using the command:

  hadoop jar hadoop-mapreduce-examples-2.4.1-mapr-4.0.1-SNAPSHOT.jar sleep -Dmapreduce.job.queuename=root.dev -m 10 -r 10 -mt 10000 -rt 10000

All the jobs were submitted by the same user, with the same priority and to the same queue. No other jobs were running in the cluster.
Jobs started executing in the order in which they were submitted (jobs 6 to 10 were active, while 11 to 15 were waiting):

root@perfnode131:/opt/mapr/hadoop/hadoop-2.4.1/logs# yarn application -list
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]): 10
                Application-Id      Application-Name      Application-Type      User      Queue      State      Final-State      Progress      Tracking-URL
application_1408572781346_0012             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0014             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0011             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0010             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode132:52799
application_1408572781346_0008             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode131:33766
application_1408572781346_0009             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode132:50964
application_1408572781346_0007             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode134:52966
application_1408572781346_0015             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0006             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED          9.5%      http://perfnode134:34094
application_1408572781346_0013             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A

Stopped RM1. There was a failover and RM2 became active.
But the jobs seem to have started in a different order:

root@perfnode131:~/scratch/raw_rm_logs_fs_hang# yarn application -list
14/08/21 07:26:13 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]): 10
                Application-Id      Application-Name      Application-Type      User      Queue      State      Final-State      Progress      Tracking-URL
application_1408572781346_0012             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode134:59351
application_1408572781346_0014             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode132:37866
application_1408572781346_0011             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode131:59744
application_1408572781346_0010             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0008             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0009             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0007             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0015             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode134:39754
application_1408572781346_0006             Sleep job             MAPREDUCE     userA   root.dev   ACCEPTED        UNDEFINED            0%      N/A
application_1408572781346_0013             Sleep job             MAPREDUCE     userA   root.dev    RUNNING        UNDEFINED            5%      http://perfnode132:34714

The problem is this:
- The jobs that were previously in RUNNING state moved to ACCEPTED after failover.
- The jobs that were previously in ACCEPTED state moved to RUNNING after failover.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
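The swapped RUNNING/ACCEPTED sets suggest the state store hands applications back to the scheduler in an order unrelated to their original submission order, so the first five apps the scheduler activates after recovery are simply whichever five it sees first. One possible direction for a fix is to sort recovered applications by their original submission time before re-submitting them to the scheduler. The sketch below is purely illustrative: `AppState` and its fields are hypothetical stand-ins, not the real RMStateStore/ApplicationStateData API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RecoveryOrderSketch {

    // Hypothetical stand-in for the per-application record kept in the RM
    // state store; field names are illustrative only.
    static class AppState {
        final String appId;
        final long submitTime; // epoch millis at original submission

        AppState(String appId, long submitTime) {
            this.appId = appId;
            this.submitTime = submitTime;
        }
    }

    // Return the recovered applications sorted by original submission time,
    // so the scheduler would activate them in pre-failover order.
    static List<AppState> inSubmissionOrder(List<AppState> recovered) {
        List<AppState> sorted = new ArrayList<>(recovered);
        sorted.sort(Comparator.comparingLong(a -> a.submitTime));
        return sorted;
    }

    public static void main(String[] args) {
        // Simulate a state store returning apps in arbitrary order.
        List<AppState> recovered = List.of(
            new AppState("application_1408572781346_0012", 3000L),
            new AppState("application_1408572781346_0006", 1000L),
            new AppState("application_1408572781346_0009", 2000L));
        for (AppState a : inSubmissionOrder(recovered)) {
            System.out.println(a.appId);
        }
    }
}
```

Under this assumption, recovery becomes deterministic regardless of the iteration order of the underlying store (filesystem listing order for FileSystemRMStateStore).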