[ https://issues.apache.org/jira/browse/YARN-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967740#comment-14967740 ]
Wangda Tan commented on YARN-4280:
----------------------------------

Thanks [~kshukla] for reporting this issue.

I think this problem could be resolved if preemption is enabled. For clusters with preemption disabled, one solution may be to relax the check on the parent queue's max capacity. Currently, reservation happens only if parentQueue.used + reserved <= parentQueue.max. We can relax this check to: parentQueue.used < parentQueue.max.

However, this could have other impacts, such as total-reserved + total-allocated > total-used.

> CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4280
>                 URL: https://issues.apache.org/jira/browse/YARN-4280
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.6.1, 2.8.0, 2.7.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>
> Consider the following scenario:
> There are 2 queues, A (25% of the total capacity) and B (75%); both can run
> at total cluster capacity. There are 2 applications: appX, which runs on
> Queue A and always asks for 1 GB containers (non-AM), and appY, which runs
> on Queue B and asks for 2 GB containers.
> The user limit is high enough for the application to reach 100% of the
> cluster resource.
> appX is running at total cluster capacity, full of 1 GB containers, and
> releases only one container at a time. appY comes in with a request for a
> 2 GB container, but only 1 GB is free. Ideally, since appY is in the
> underserved queue, it has higher priority and should reserve for its 2 GB
> request. Since this request puts alloc+reserve above the total capacity of
> the cluster, the reservation is not made. appX comes in with a 1 GB request
> and, since 1 GB is still available, the request is allocated.
> This can continue indefinitely, causing priority inversion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
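
As a rough illustration of the two checks discussed in the comment, here is a minimal Java sketch applied to the reported scenario. It is not CapacityScheduler code: the class and method names, the 10 GB parent queue maximum, and the reading of "reserved" as the size of the new reservation request are all assumptions made for illustration.

```java
public class ReservationCheckSketch {

    // Strict rule as described in the comment (assumed reading):
    // used + amount to reserve must not exceed the parent queue max.
    public static boolean canReserveStrict(long usedMb, long reserveMb, long maxMb) {
        return usedMb + reserveMb <= maxMb;
    }

    // Relaxed rule proposed in the comment:
    // reserve whenever the parent queue is below its max.
    public static boolean canReserveRelaxed(long usedMb, long maxMb) {
        return usedMb < maxMb;
    }

    public static void main(String[] args) {
        long maxMb = 10 * 1024;      // hypothetical parent queue max (10 GB)
        long usedMb = maxMb - 1024;  // appX has just released one 1 GB container

        // appY asks for 2 GB: the strict rule refuses the reservation.
        System.out.println("strict, appY 2 GB:  " + canReserveStrict(usedMb, 2048, maxMb));
        // appX asks for 1 GB: it still fits, so appX takes the free space again.
        System.out.println("fits,   appX 1 GB:  " + canReserveStrict(usedMb, 1024, maxMb));
        // Under the relaxed rule appY could reserve, since used < max.
        System.out.println("relaxed, appY 2 GB: " + canReserveRelaxed(usedMb, maxMb));
    }
}
```

With these assumed numbers the strict rule rejects appY's 2 GB reservation while a 1 GB request still fits, which is the cycle described in the issue. The relaxed rule admits the reservation, at the cost of letting used plus reserved exceed the queue maximum.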