[ 
https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918909#comment-16918909
 ] 

Eric Payne commented on YARN-9770:
----------------------------------

{quote}When A's utilization reaches utilization of other queues (e.g. queue B), 
queue B starts getting allocations too
{quote}
Okay. Thanks. I understand now, [~jhung]. Basically, you want resources to be 
allocated to Queue B while Queue A is still coming up to the same usage 
percentage as Queue B. This seems to violate the charter of a capacity 
scheduler, but since this is a pluggable ordering policy, it shouldn't affect 
customers using either the default or the priority utilization ordering policy. 
Based on that, I have no objection.
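
To make the difference concrete, here is a minimal sketch (not the code from the attached patches) that hands out containers one at a time to two queues of equal capacity, either always to the less-utilized queue or to a uniformly random one. The class name, method name, and parameters are hypothetical and exist only for this illustration.

```java
import java.util.Random;

public class QueueOrderingSketch {
    /**
     * Hands out `allocations` containers, one at a time, to two queues (A and B)
     * that are assumed to have equal capacity, so "used" stands in for utilization.
     * Returns {allocationsToA, allocationsToB}.
     */
    static int[] simulate(boolean random, int usedA, int usedB, int allocations, long seed) {
        int[] used = {usedA, usedB};
        int[] got = new int[2];
        Random rng = new Random(seed);
        for (int i = 0; i < allocations; i++) {
            int pick;
            if (random) {
                // Equal-probability ordering: either queue may be offered the container.
                pick = rng.nextInt(2);
            } else {
                // Utilization ordering: the less-utilized queue always goes first.
                pick = used[0] <= used[1] ? 0 : 1;
            }
            used[pick]++;
            got[pick]++;
        }
        return got;
    }

    public static void main(String[] args) {
        // Queue A starts empty, Queue B starts with 50 containers in use.
        int[] byUtil = simulate(false, 0, 50, 50, 1L);
        int[] byRandom = simulate(true, 0, 50, 50, 1L);
        System.out.println("utilization order: A=" + byUtil[0] + " B=" + byUtil[1]);
        System.out.println("random order:      A=" + byRandom[0] + " B=" + byRandom[1]);
    }
}
```

Under these assumptions, utilization ordering gives Queue B nothing until Queue A has caught up, while the random ordering lets Queue B receive allocations during that window.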
{quote}however I think this will break preemption. When there're two queues A 
and B. A uses more than guaranteed and have pending resource, B uses less than 
guaranteed and has pending resource.
{quote}
As [~leftnoteasy] points out, this will cause unwanted preemptions in which the 
preempted containers are assigned back to the same queue. The preempted 
containers lose their work, and jobs take longer. It seems that 
{{RandomQueueOrderingPolicy}} is completely incompatible with preemption. I 
would advise that we automatically disable preemption on queue hierarchies that 
enable {{RandomQueueOrderingPolicy}}.

> Create a queue ordering policy which picks child queues with equal probability
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9770
>                 URL: https://issues.apache.org/jira/browse/YARN-9770
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>              Labels: release-blocker
>         Attachments: YARN-9770.001.patch, YARN-9770.002.patch, 
> YARN-9770.003.patch, activeUsers_overlay.png
>
>
> Ran some simulations with the default queue_utilization_ordering_policy:
> An underutilized queue which receives an application with many (thousands of) 
> resource requests will hog scheduler allocations for a long time (on the 
> order of a minute). In the meantime, apps are submitted to all the other 
> queues, which increases activeUsers in those queues, which in turn drops the 
> user limit in those queues to small values if minimum-user-limit-percent is 
> configured to a small value (e.g. 10%).
> To avoid this issue, we assign to queues with equal probability, so that no 
> queue goes without allocations for a long time.
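
The user-limit effect described above can be illustrated with a simplified calculation. This is only a sketch: it assumes the per-user limit is the larger of an equal share among active users and the minimum-user-limit-percent floor, and it ignores user-limit-factor, headroom, and the other inputs the real scheduler uses. The class and method names are hypothetical.

```java
public class UserLimitSketch {
    /** Integer division rounded up. */
    static int ceilDiv(int a, int b) {
        return (a + b - 1) / b;
    }

    /**
     * Simplified per-user limit, in containers, for a queue.
     * Assumption: limit = max(equal share among active users, minimum-user-limit-percent floor).
     */
    static int userLimit(int queueContainers, int activeUsers, int minUserLimitPercent) {
        int equalShare = ceilDiv(queueContainers, activeUsers);
        int floor = ceilDiv(queueContainers * minUserLimitPercent, 100);
        return Math.max(equalShare, floor);
    }

    public static void main(String[] args) {
        int queueContainers = 100;    // hypothetical queue size
        int minUserLimitPercent = 10; // the small value mentioned in the description
        for (int activeUsers : new int[] {1, 2, 5, 10, 20}) {
            System.out.println("activeUsers=" + activeUsers + " userLimit="
                + userLimit(queueContainers, activeUsers, minUserLimitPercent));
        }
    }
}
```

With these assumed numbers, the per-user limit shrinks as activeUsers grows and stops shrinking only at the minimum-user-limit-percent floor.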



