Sunil G commented on YARN-2009:

I agree with your thoughts, [~curino]. Policy making based on locality 
constraints is, as you said, hypothetical, and only a set of test experiments 
can show how much value it adds. I am devising some tests to measure the 
advantage. I also felt that this extra consideration might help the cluster 
work better. But it now looks more complicated, because the weighting used to 
choose which container to preempt is not balanced or straightforward once all 
scenarios are considered.

Consider this example:
a. A higher priority application needs 7 containers.
b. 2 apps at lower priority hold 4 containers (2 each), and 2 apps at very low 
priority hold 4 containers (2 each).

Possible behavior of the preemption policy:
1. Spare AM containers (based on config).
2. At very low priority, choose the application that was submitted last and 
claim its 2 containers. Then move to the next app at the same level.

This may be the direct output we expect.
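
The ordering above can be sketched roughly as follows. This is only an 
illustration, not code from ProportionalCapacityPreemptionPolicy: the App and 
Container types, the select_victims function and the spare_am flag are all 
hypothetical names, and each app is given one extra AM container on top of the 
2 containers from the example so that the "spare AM" step is visible.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    cid: str
    is_am: bool = False


@dataclass
class App:
    name: str
    priority: int      # assumption: larger value = higher priority
    submit_time: int   # assumption: larger value = submitted later
    containers: list = field(default_factory=list)


def make_app(name, priority, submit_time):
    """Hypothetical helper: one AM container plus 2 task containers."""
    containers = [Container(f"{name}-am", is_am=True),
                  Container(f"{name}-c1"),
                  Container(f"{name}-c2")]
    return App(name, priority, submit_time, containers)


def select_victims(apps, needed, spare_am=True):
    """Pick up to `needed` containers: lowest priority first, and within a
    priority level the most recently submitted app first."""
    victims = []
    for app in sorted(apps, key=lambda a: (a.priority, -a.submit_time)):
        for c in app.containers:
            if len(victims) == needed:
                return victims
            if spare_am and c.is_am:
                continue  # step 1: leave AM containers alone
            victims.append(c.cid)
    return victims


apps = [make_app("low1", 1, 1), make_app("low2", 1, 2),
        make_app("vlow1", 0, 3), make_app("vlow2", 0, 4)]
victims = select_victims(apps, needed=7)
print(victims)
```

With a demand of 7 containers, both very low priority apps are drained first 
(last submitted first), and only then the lower priority apps.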

However, a few thoughts:
1. The higher priority app may need containers on certain nodes (locality), 
but if the preemption happens on other nodes, the app has to settle for 
rack-local or even any placement.
2. With node labels, it is even possible that the preempted containers fall 
under a different label, on which the demand can't be satisfied.
3. User limit factor has to be respected during preemption (queue preemption 
already considers this, with a config).
4. A different example: a higher priority application needs 2 containers of 
6 GB each. One lower priority application has 12 containers of 1 GB each, and 
another lower priority application has 2 containers of 6 GB each. Going by 
submission time, if we choose the 1st lower priority app, we may kill more 
containers. Sometimes a wiser choice is to select the 2nd one. This is 
debatable :)
5. Taking the first example again: we have 2 lower priority apps to choose 
from, and based on submission time the 1st app is selected for preemption. It 
is possible that this app is more I/O bound and has finished a larger % of its 
work than the 2nd one, which was submitted earlier. So submission time alone 
may not be a good criterion; % of job completion can also be considered.
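
To make the arithmetic in point 4 concrete, here is a small sketch that counts 
how many containers each candidate app would lose to free the requested 
memory. The function name containers_to_kill and the app labels are 
hypothetical, and the sketch deliberately ignores node placement, so it only 
compares kill counts and says nothing about whether the freed memory is 
actually usable for a 6 GB container.

```python
def containers_to_kill(container_sizes_gb, demand_gb):
    """Return how many containers (largest first) must be killed to free at
    least demand_gb, or None if the app cannot cover the demand alone."""
    freed = 0
    ordered = sorted(container_sizes_gb, reverse=True)
    for count, size in enumerate(ordered, start=1):
        freed += size
        if freed >= demand_gb:
            return count
    return None


demand_gb = 2 * 6  # higher priority app needs 2 containers of 6 GB each
candidates = {
    "app1": [1] * 12,  # 12 containers of 1 GB each
    "app2": [6] * 2,   # 2 containers of 6 GB each
}
cost = {name: containers_to_kill(sizes, demand_gb)
        for name, sizes in candidates.items()}
# Prefer the app that loses the fewest containers.
best = min((n for n in cost if cost[n] is not None), key=cost.get)
print(cost, best)
```

Selecting by submission time could pick app1 and kill 12 containers, while 
this count-based choice picks app2 and kills 2.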

My point is that there is no single order in which the decision can be made 
sequentially; a few customers may prefer option 2 over option 1. Hence a 
policy with a lot of config may come in if we accept these as features. I 
would like to get your thoughts on this. As you said, this may not give a big 
gain and may only work as an enhancement.

> Priority support for preemption in ProportionalCapacityPreemptionPolicy
> -----------------------------------------------------------------------
>                 Key: YARN-2009
>                 URL: https://issues.apache.org/jira/browse/YARN-2009
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Devaraj K
>            Assignee: Sunil G
> While preempting containers based on the queue ideal assignment, we may need 
> to consider preempting the low priority application containers first.

This message was sent by Atlassian JIRA
