Peng Zhang created YARN-3414:

             Summary: FairScheduler's preemption may cause livelock
                 Key: YARN-3414
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.6.0
            Reporter: Peng Zhang

I ran into this problem in our cluster; it causes a livelock during preemption and 

The queue hierarchy is as follows:
                     root
              /        |        \
          queue-1    queue-2    queue-3
          /    \
queue-1-1      queue-1-2
# Assume the cluster has 100G of memory in total.
# Assume queue-1 has a max resource limit of 20G.
# queue-1-1 becomes active and gets at most 20G of memory (equal to its fair share).
# queue-2 then becomes active and requires 30G of memory (less than its fair share).
# queue-3 becomes active and can be assigned all remaining resources, 50G of 
memory (more than its fair share). At this point the three queues' fair shares 
are (20, 40, 40) and their usage is (20, 30, 50).
# queue-1-2 becomes active, which triggers a new preemption request (10G of 
memory; intuitively, it can only preempt from its sibling queue-1-1).
# In fact, preemption starts from the root, finds that queue-3 is the most over 
its fair share, and preempts some resources from queue-3.
# But during scheduling, queue-1 itself has already reached its max fair share, 
so the resources cannot be assigned to it. The resources are then assigned again to 
The last two steps then repeat.
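
The loop in the last two steps can be reproduced with a toy model. The sketch below is not FairScheduler code: the function names, the fixed 10G preemption amount per round, and the large demand assumed for queue-3 are all hypothetical, chosen only to mirror the numbers in the steps above. It also assumes the freed memory goes back to queue-3, since in this model queue-3 is the only queue with both unmet demand and headroom.

```python
# Toy model of the scenario above (units: GB). Hypothetical names throughout;
# this only illustrates the preempt/reassign cycle, not the real scheduler.
CLUSTER = 100
FAIR_SHARE = {"queue-1": 20, "queue-2": 40, "queue-3": 40}
MAX_SHARE = {"queue-1": 20, "queue-2": 100, "queue-3": 100}
# queue-1 demand = 20 (queue-1-1) + 10 (queue-1-2); queue-3 demand is an
# assumption: it is willing to take everything that is left.
DEMAND = {"queue-1": 30, "queue-2": 30, "queue-3": 100}
INITIAL_USAGE = {"queue-1": 20, "queue-2": 30, "queue-3": 50}


def preempt_from_root(usage, amount):
    """Take `amount` from the top-level queue that is most over its fair share."""
    victim = max(usage, key=lambda q: usage[q] - FAIR_SHARE[q])
    usage[victim] -= amount
    return victim


def schedule(usage):
    """Give free memory to queues that have unmet demand and are below their max."""
    free = CLUSTER - sum(usage.values())
    assigned = {}
    # Visit the queue furthest below its fair share first.
    for q in sorted(usage, key=lambda q: usage[q] - FAIR_SHARE[q]):
        room = min(DEMAND[q], MAX_SHARE[q]) - usage[q]
        grant = max(0, min(room, free))
        if grant > 0:
            usage[q] += grant
            free -= grant
            assigned[q] = grant
    return assigned


def simulate(rounds, amount=10):
    """Run `rounds` preempt-then-schedule cycles; queue-1-2's request never clears."""
    usage = dict(INITIAL_USAGE)
    history = []
    for _ in range(rounds):
        victim = preempt_from_root(usage, amount)
        assigned = schedule(usage)
        history.append((victim, assigned, dict(usage)))
    return history


if __name__ == "__main__":
    for victim, assigned, usage in simulate(3):
        print(f"preempted from {victim}, assigned {assigned}, usage now {usage}")
```

In this model queue-1 is capped at 20G, so memory preempted at the root level can never reach queue-1-2, and the usage returns to (20, 30, 50) after every round.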
