[
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628477#comment-13628477
]
Carlo Curino commented on YARN-45:
----------------------------------
This is still a point we are discussing and it is not fully settled, which is
why it comes across as confusing and why we were soliciting opinions.
Your observations, I think, are helping us frame this a bit better. We can see
three possible uses of preemption:
1) A preemption policy that does not necessarily trust the AM: it picks
containers, lists them as a Set<ContainerID>, and gives the AM a heads-up on
which containers are going to be killed soon if they are not released. Note
that if the AM is MapReduce this is not too bad, as we know how containers are
used (maps before reducers) and so we can pick containers in a reasonable
order. We have been testing a policy that does this, and it works well in our
tests. Also, this is a perfect match with how the FairScheduler thinks about
preemption.
2) A preemption policy that trusts the AM and specifies preemption as a
Set<ResourceRequest>. This works well for known AMs that we know will try to
honor the preemption requests, and/or if we do not care to force-kill anyway
and preemption requests are best-effort (see the AM-side sketch after this
list). We have played around with a version of this too. If I am not mistaken,
this is also the case you care the most about, right?
3) A version of 2) that also enforces its preemption requests by killing if
they are not satisfied within a certain period of time. This is non-trivial to
build, as there is inherent ambiguity in how ResourceRequests are mapped to
containers over time, so the enforcement part is hard to get right / prove
correct.
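To make the distinction between 1) and 2) concrete, here is a rough AM-side
sketch. All class, method, and helper names are made up for illustration and
are not from the patch; it only assumes the existing ContainerId and
ResourceRequest records.
{code:java}
// Illustrative AM-side handling of the two preemption forms above.
// Class, method, and helper names are hypothetical, not from the patch.
import java.util.Set;

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class PreemptionHandlerSketch {

  /** Case 1): the RM names the containers it intends to kill. */
  void onContainersMarked(Set<ContainerId> doomed) {
    for (ContainerId id : doomed) {
      // Save whatever work we can (e.g. checkpoint a reducer, as in
      // MAPREDUCE-4584) before the container is killed, then release it.
      checkpointAndRelease(id);
    }
  }

  /** Cases 2) and 3): the RM only says how much to give back; the AM chooses. */
  void onResourcesRequested(Set<ResourceRequest> toFree) {
    for (ResourceRequest rr : toFree) {
      // Pick the cheapest containers to sacrifice, e.g. maps that just
      // started rather than reducers that already pulled their inputs.
      releaseCheapestContainersMatching(rr);
    }
  }

  // Hypothetical helpers standing in for real AM logic.
  private void checkpointAndRelease(ContainerId id) { /* ... */ }
  private void releaseCheapestContainersMatching(ResourceRequest rr) { /* ... */ }
}
{code}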
We believe that 3) might be the ideal end point, but proving its correctness is
non-trivial and would require deeper surgery to the RM/schedulers. For example,
if at a later moment in time I want the same amount of resources out of an AM,
it is hard to decide unambiguously whether this is because the AM did not
preempt as I asked (in which case forcibly killing its containers is fine), or
whether these are subsequent and independent requests for resources (so I
should not kill but wait).
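To show the ambiguity in one place, here is a minimal sketch of the state the
RM would have to keep for 3); the names are hypothetical. Keeping the data is
easy; the hard part is deciding whether a later shortfall matches an old entry
or is a new, independent need.
{code:java}
// Hypothetical RM-side bookkeeping for option 3); names are illustrative.
import java.util.Set;

import org.apache.hadoop.yarn.api.records.ResourceRequest;

class OutstandingPreemption {
  final Set<ResourceRequest> asked;  // what we asked the AM to give back
  final long deadlineMillis;         // kill containers if not satisfied by then

  OutstandingPreemption(Set<ResourceRequest> asked, long deadlineMillis) {
    this.asked = asked;
    this.deadlineMillis = deadlineMillis;
  }

  boolean expired(long now) {
    return now > deadlineMillis;
  }

  // Not shown (and the crux of the problem): when the scheduler again needs
  // the same amount of resources from this AM, is `asked` still unsatisfied
  // (killing is fair), or is this a new, independent request (so we wait)?
}
{code}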
The proposed protocol, with the change that makes it a tagged union of
Set<ContainerID> and Set<ResourceRequest>, seems to allow for all of the above
and is easy to explain. I will update the patch to reflect this if you agree.
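For clarity, this is roughly the shape I have in mind for the tagged union; the
class and method names below are placeholders, not the actual patch, and it
only relies on the existing ContainerId and ResourceRequest records.
{code:java}
// Sketch of the tagged-union preemption message; names are placeholders.
import java.util.Collections;
import java.util.Set;

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class PreemptionMessageSketch {

  /** Which arm of the tagged union the RM chose for this AM. */
  public enum Type { CONTAINERS, RESOURCES }

  private final Type type;
  private final Set<ContainerId> containersToKill;        // case 1)
  private final Set<ResourceRequest> resourcesToRelease;  // cases 2) and 3)

  private PreemptionMessageSketch(Type type, Set<ContainerId> containers,
      Set<ResourceRequest> resources) {
    this.type = type;
    this.containersToKill = containers;
    this.resourcesToRelease = resources;
  }

  /** Strict form: the RM names the exact containers it is about to kill. */
  public static PreemptionMessageSketch ofContainers(Set<ContainerId> ids) {
    return new PreemptionMessageSketch(Type.CONTAINERS, ids,
        Collections.<ResourceRequest>emptySet());
  }

  /** Flexible form: the AM decides which containers to give back. */
  public static PreemptionMessageSketch ofResources(Set<ResourceRequest> reqs) {
    return new PreemptionMessageSketch(Type.RESOURCES,
        Collections.<ContainerId>emptySet(), reqs);
  }

  public Type getType() { return type; }
  public Set<ContainerId> getContainersToKill() { return containersToKill; }
  public Set<ResourceRequest> getResourcesToRelease() { return resourcesToRelease; }
}
{code}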
> Scheduler feedback to AM to release containers
> ----------------------------------------------
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: Chris Douglas
> Assignee: Carlo Curino
> Attachments: YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict
> enforcement of resource invariants in the cluster. Individual allocations of
> containers must be reclaimed- or reserved- to restore the global invariants
> when cluster load shifts. In some cases, the ApplicationMaster can respond to
> fluctuations in resource availability without losing the work already
> completed by that task (MAPREDUCE-4584). Supplying it with this information
> would be helpful for overall cluster utilization [1]. To this end, we want to
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf