[
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752695#comment-15752695
]
Wangda Tan commented on YARN-5864:
----------------------------------
Discussed this offline with [~vinodkv].
We can give this feature better semantics by adding a queue-priority
property. (Credit to [~vinodkv] for the idea.)
In the existing scheduler, we sort queues by (used-capacity /
configured-capacity). But in some cases certain apps/services need to get
resources first. For example, we allocate 85% to a production queue and 15% to
a test queue. When the production queue is underutilized, we want the scheduler
to give resources to it first, regardless of the test queue's utilization.
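To make the concern concrete, here is a minimal Python sketch of
utilization-based ordering applied to the 85/15 example. The Queue class, its
field names, and the usage numbers (60 and 5) are assumptions made up for
illustration; this is not the CapacityScheduler code.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    configured: float  # configured capacity, in percent of the cluster
    used: float        # currently used capacity, same unit (hypothetical)

    @property
    def utilization(self) -> float:
        # The ratio the comment describes: used-capacity / configured-capacity.
        return self.used / self.configured


def order_by_utilization(queues):
    # Least-utilized queue is offered resources first.
    return sorted(queues, key=lambda q: q.utilization)


production = Queue("production", configured=85, used=60)  # utilization ~0.71
test = Queue("test", configured=15, used=5)               # utilization ~0.33

ordered = [q.name for q in order_by_utilization([production, test])]
```

With these numbers both queues are underutilized, yet the test queue is ordered
ahead of the production queue because its ratio is lower.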
A rough plan: we assign priorities to queues under the same parent. Each time,
the scheduler picks the underutilized queue with the highest priority; if there
is no underutilized queue, the scheduler picks the queue with the lowest
utilization.
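The selection rule in the plan above could be sketched roughly as follows, in
Python. Everything here is an assumption for illustration: the Queue class, the
convention that a larger number means higher priority, treating "underutilized"
as utilization below 1.0, and breaking priority ties by lowest utilization.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    configured: float  # configured capacity (hypothetical unit: percent)
    used: float        # used capacity, same unit
    priority: int = 0  # assumption: larger value = higher priority

    @property
    def utilization(self) -> float:
        return self.used / self.configured


def pick_queue(siblings):
    """Pick the next queue among siblings under the same parent."""
    underutilized = [q for q in siblings if q.utilization < 1.0]
    if underutilized:
        # Highest priority first; ties go to the less utilized queue.
        return min(underutilized, key=lambda q: (-q.priority, q.utilization))
    # No underutilized queue: fall back to the lowest utilization.
    return min(siblings, key=lambda q: q.utilization)


# Production is underutilized, so it wins despite its higher ratio.
first = pick_queue([Queue("production", 85, 60, priority=2),
                    Queue("test", 15, 5, priority=1)])

# Production is over capacity, so the underutilized test queue is picked.
second = pick_queue([Queue("production", 85, 90, priority=2),
                     Queue("test", 15, 5, priority=1)])

# Both are over capacity: lowest utilization wins, priority is ignored.
third = pick_queue([Queue("production", 85, 90, priority=2),
                    Queue("test", 15, 20, priority=1)])
```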
And when we do preemption, if a queue with higher priority has special
resource requests, such as very large memory, hard locality, or placement
constraints, the scheduler will do relatively *conservative* preemption from
other queues with lower priority, regardless of their utilization.
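As a rough illustration of which queues would become preemption candidates
under this idea, here is a small Python sketch. The Queue class, the
has_special_request flag, and the priority convention (larger is higher) are
hypothetical. How "conservative" the preemption should be is not specified
here, so the sketch does not model it.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    configured: float  # configured capacity (hypothetical unit: percent)
    used: float        # used capacity, same unit
    priority: int      # assumption: larger value = higher priority
    has_special_request: bool = False  # e.g. very large memory, hard locality


def preemption_candidates(requester, siblings):
    """Queues the requester may preempt from under the proposed rule."""
    if not requester.has_special_request:
        return []
    # Only priority is compared; utilization is deliberately not checked.
    return [q for q in siblings
            if q is not requester and q.priority < requester.priority]


production = Queue("production", 85, 60, priority=2, has_special_request=True)
test = Queue("test", 15, 5, priority=1, has_special_request=True)
queues = [production, test]

from_production = [q.name for q in preemption_candidates(production, queues)]
from_test = [q.name for q in preemption_candidates(test, queues)]
```

Here the test queue is a candidate even though it is below its own configured
capacity, while the test queue itself gets no candidates because no sibling has
a lower priority.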
That is just a rough idea; [~curino], please let us know your comments. I can
formalize the design once we generally agree on the approach.
> Capacity Scheduler preemption for fragmented cluster
> -----------------------------------------------------
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch
>
>
> YARN-4390 added preemption for reserved containers. However, we found one case
> in which a large container cannot be allocated even if all queues are under
> their limits.
> For example, we have:
> {code}
> Two queues, a and b, capacity 50:50
> Two nodes, n1 and n2, each with 50 resources
> Now queue-a uses 10 on n1 and 10 on n2
> queue-b asks for one single container with resource=45.
> {code}
> The container could be reserved on either host, but no preemption will
> happen because all queues are under their limits.