Karam Singh created YARN-4606:
---------------------------------
Summary: Sometimes fairness in conjunction with UserLimitPercent
and UserLimitFactor in a queue leads to a situation where applications
in the queue appear to be starved or stuck
Key: YARN-4606
URL: https://issues.apache.org/jira/browse/YARN-4606
Project: Hadoop YARN
Issue Type: Bug
Components: capacity scheduler, capacityscheduler
Affects Versions: 2.7.1, 2.8.0
Reporter: Karam Singh
Encountered while studying the behaviour of fairness with UserLimitPercent and
UserLimitFactor during the following test:
Ran GridMix with queue settings: Capacity=10, MaxCap=80, UserLimit=25,
UserLimitFactor=32, FairOrderingPolicy only. Encountered an application
starvation situation: 33 applications were running (190 of 761 apps had
completed; the queue can hold 345 containers) with a total of 45 containers
running. The 12 containers beyond the 33 AMs all belonged to a single app
(which had around 18000 tasks). All other apps had only their AM running; no
other containers were given to any of them. After that app finished, 32 AMs
kept running without any task containers being launched.
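The container accounting in the description can be checked with a short calculation. This is only a sketch of the arithmetic, assuming each running application holds exactly one AM container; the variable names are illustrative and do not come from YARN.

```python
# Figures taken from the report above.
running_apps = 33             # applications running when the problem was seen
running_containers = 45       # total containers running in the queue
queue_container_limit = 345   # containers the queue can hold, per the report

# Assumption: every running application holds exactly one AM container.
am_containers = running_apps

# Containers left over after the AMs are the only ones doing task work.
task_containers = running_containers - am_containers

# Capacity the queue could still use but is not using.
unused_containers = queue_container_limit - running_containers

print(f"AM containers:     {am_containers}")
print(f"task containers:   {task_containers}")
print(f"unused containers: {unused_containers}")
```

Under that assumption, only 12 containers are doing task work while 300 of the queue's 345 container slots sit idle, which is why the remaining applications appear starved.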
GridMix was run with the following settings:
gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY,
gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001,
gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn,
mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1,
gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000,
gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver
The users file for RoundRobinUserResolver contained 4 users.
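To illustrate how a round-robin resolver spreads submitting users over a 4-user file, here is a minimal sketch. It is not Gridmix's actual implementation; the function resolve_users and all user names are hypothetical and exist only to show the mapping idea.

```python
from itertools import cycle


def resolve_users(trace_users, target_users):
    """Map each distinct trace user to a target user in round-robin order."""
    mapping = {}
    targets = cycle(target_users)
    for user in trace_users:
        # Assign a target only the first time a trace user is seen.
        if user not in mapping:
            mapping[user] = next(targets)
    return mapping


# Hypothetical names: 4 target users, as in the users file of the test,
# and 6 distinct users appearing in the trace.
target_users = ["user1", "user2", "user3", "user4"]
trace_users = [f"trace_user_{i}" for i in range(6)]

mapping = resolve_users(trace_users, target_users)
for trace_user, target_user in mapping.items():
    print(f"{trace_user} -> {target_user}")
```

With 4 target users, every submitted job ends up owned by one of 4 users in the queue, so the per-user limits (UserLimit=25, UserLimitFactor=32) apply across exactly those 4 users.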
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)