Carlo Curino commented on YARN-2009:

I agree with your observation... the set of invariants/semantics (queue 
capacity, max-capacity, user quotas, app priorities, max-am-percentage, 
container size and multi-resources, etc.) crossed with 
preferences/optimizations (spare AMs, node labels, locality, minimizing job 
latency, etc.) makes for a vast space of possible policies... Correctly 
accounting for the hierarchical nature of queues makes that space even worse.

Besides the challenge of writing uber-policies that can handle all that, such 
policies are very hard to tune right. Even the simplistic preemption we have 
today confuses very competent users (I know this for a fact). I worry that more 
and more complexity will become unmanageable for most. In a sense I am growing 
fond of a notion of "explainability" of system behavior, which favors systems 
that one can easily understand/predict the behavior of (to the cost of some 

To this end, our cut-point in the early design of preemption was to say: 
"preemption should only kick in to correct large imbalances, and operate on a 
rather slow time-scale". The idea was that if, for example, I am 
preempting 1k containers for you to get your capacity, locality would matter 
less... and so would many other minor issues like local priorities, 
container sizes, etc.
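
To make the cut-point concrete, here is a minimal sketch of the two ideas 
(ignore small imbalances, reclaim slowly). It is only an illustration: the 
function name, parameters, and default values are all hypothetical and are 
not taken from ProportionalCapacityPreemptionPolicy.

```python
def preemption_target(used, guaranteed, cluster_total,
                      dead_zone=0.10, per_round_cap=0.10):
    """Return how many containers to reclaim from one queue in one round.

    All names and defaults here are hypothetical, for illustration only.
    """
    # Only large imbalances matter: the queue must exceed its guarantee
    # by more than the dead zone before preemption is considered at all.
    if used <= guaranteed * (1 + dead_zone):
        return 0

    excess = used - guaranteed

    # Slow time-scale: reclaim at most a fixed share of the cluster per
    # round, so a big imbalance is corrected over several rounds.
    round_budget = int(cluster_total * per_round_cap)
    return min(excess, round_budget)
```

With these assumed defaults, a queue using 105 containers against a guarantee 
of 100 is left alone, while a queue using 400 is trimmed by at most 100 
containers per round on a 1000-container cluster. Note there are only two 
knobs, which keeps the behavior easy to explain.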

Overall, I think we should be use-case driven. If there is a clear "need" for 
complexity to cope with observed issues I think we can add it, but I would 
suggest we refrain from adding too many knobs based on hypothetical scenarios. 
If a need is not present yet, I would propose to require a "sizeable win" as a 
bar for adding knobs. If we can demonstrate on some non-trivial experimental 
setup that a knob can deliver substantial value, then maybe it's ok. 

> Priority support for preemption in ProportionalCapacityPreemptionPolicy
> -----------------------------------------------------------------------
>                 Key: YARN-2009
>                 URL: https://issues.apache.org/jira/browse/YARN-2009
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Devaraj K
>            Assignee: Sunil G
> While preempting containers based on the queue ideal assignment, we may need 
> to consider preempting the low priority application containers first.
