[
https://issues.apache.org/jira/browse/YARN-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994957#comment-13994957
]
Carlo Curino commented on YARN-2022:
------------------------------------
Sunil, I am travelling abroad until the 26th (please forgive delays)... I could
only skim the patch from a mobile device. It looks reasonable. One concern I
have is that we rely on a user-set Priority to choose whether to preempt or
not. Unless there are checks in place preventing the user from abusing this
value, this is egregiously gameable (set all my containers to AM priority and
get away with murder).
Also, after a conversation with Chris Douglas and Mayank, I thought more about
the possible corner cases: we should keep an eye on the max percentage of
resources dedicated to AMs. We should "save" the AMs from earlier
(higher-pri) applications up to the max % of AMs we can allocate in the Queue,
and at the very least not "protect" the AMs past that point. A similar check
should be in place for userLimitFactor. Without this it is entirely possible
that a queue is wedged with 100% AMs, or that a user holds more resources in
AMs than he deserves (and is systematically skipped, even if the
cluster is empty). We have seen some of this in particular extreme test cases
(epsilon-size queues, many apps "moved" to a queue, etc.).
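To make the cap concrete, here is a minimal sketch of the idea, not the patch itself. The names (`App`, `protected_ams`, `max_am_percent`) are hypothetical, submission order stands in for "earlier (higher-pri)", and memory is the only resource considered:

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    am_mb: int         # memory held by the application's AM container
    submit_order: int  # lower value = submitted earlier (assumed higher-pri)


def protected_ams(apps, queue_capacity_mb, max_am_percent):
    """Return the names of apps whose AM is shielded from preemption.

    AMs are protected in submission order until the queue's AM budget
    (queue_capacity_mb * max_am_percent) is used up. AMs past that point
    are left unprotected, so a queue cannot be wedged by AMs alone.
    """
    budget_mb = queue_capacity_mb * max_am_percent
    used_mb = 0
    protected = []
    for app in sorted(apps, key=lambda a: a.submit_order):
        if used_mb + app.am_mb > budget_mb:
            break  # everything from here on is past the cap
        used_mb += app.am_mb
        protected.append(app.name)
    return protected


# Hypothetical numbers: an 8 GB queue where AMs may use at most 50%.
apps = [App("J1", 2048, 1), App("J2", 2048, 2), App("J3", 2048, 3)]
print(protected_ams(apps, 8192, 0.5))
```

With these assumed numbers only the AMs of J1 and J2 fit in the budget, so J3's AM stays eligible for preemption. The same shape of check could be applied per user for userLimitFactor.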
Please share your thoughts on this...
> Preempting an Application Master container can be kept as least priority when
> multiple applications are marked for preemption by
> ProportionalCapacityPreemptionPolicy
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-2022
> URL: https://issues.apache.org/jira/browse/YARN-2022
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 2.4.0
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: Yarn-2022.1.patch
>
>
> Cluster Size = 16GB [2NM's]
> Queue A Capacity = 50%
> Queue B Capacity = 50%
> Consider 3 applications running in Queue A which have taken the full
> cluster capacity.
> J1 = 2GB AM + 1GB * 4 Maps
> J2 = 2GB AM + 1GB * 4 Maps
> J3 = 2GB AM + 1GB * 2 Maps
> Another Job J4 is submitted in Queue B [J4 needs a 2GB AM + 1GB * 2 Maps].
> Currently in this scenario, Job J3 will get killed, including its AM.
> It is better if the AM is given the least priority for preemption among
> multiple applications.
> In this same scenario, map tasks from J3 and J2 can be preempted.
> Later, when the cluster is free, maps can be allocated to these Jobs again.
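The scenario above can be worked through with a small sketch. This is an illustration under stated assumptions, not the behaviour of ProportionalCapacityPreemptionPolicy: the function name and the tuple layout are hypothetical, sizes are in GB, and the newest application is assumed to be preempted first.

```python
def select_for_preemption(apps, needed_gb):
    """Pick containers to preempt, touching AMs only as a last resort.

    apps: list of (name, am_gb, [task_gb, ...]) ordered oldest first.
    Returns a list of (app_name, kind, size_gb) in preemption order.
    """
    victims = []
    remaining = needed_gb

    # Pass 1: task containers only, starting from the newest application.
    for name, _am_gb, tasks in reversed(apps):
        for task_gb in tasks:
            if remaining <= 0:
                return victims
            victims.append((name, "map", task_gb))
            remaining -= task_gb

    # Pass 2: AM containers, only if the tasks alone were not enough.
    for name, am_gb, _tasks in reversed(apps):
        if remaining <= 0:
            break
        victims.append((name, "AM", am_gb))
        remaining -= am_gb

    return victims


# Queue A from the description: J1, J2 and J3 fill the 16GB cluster.
queue_a = [
    ("J1", 2, [1, 1, 1, 1]),
    ("J2", 2, [1, 1, 1, 1]),
    ("J3", 2, [1, 1]),
]

# J4 needs a 2GB AM plus two 1GB maps, i.e. 4GB in total.
for victim in select_for_preemption(queue_a, 4):
    print(victim)
```

For the 4GB that J4 needs, this selects both maps of J3 and two maps of J2, and no AM is touched. An AM is chosen only when the request exceeds everything held by task containers.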
--
This message was sent by Atlassian JIRA
(v6.2#6252)