[
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034376#comment-15034376
]
Wangda Tan commented on YARN-4390:
----------------------------------
Thanks for sharing your thoughts, [~curino]!
I agree with most of what you said: fixing large imbalances is more important
than doing micro corrections. That is enough when a cluster is large and
resource requests are almost homogeneous; the current PCPP handles such cases
quite well.
But the existing PCPP does not work well in other cases, for example:
# Resource requests from different queues are very heterogeneous: some requests
need only 1G of memory, and some need 32G.
# Hard locality is required (for example SLIDER-82).
I have seen a lot of excessive preemption on a customer's cluster with several
hundred nodes and heterogeneous requests.
So I'm proposing an approach in YARN-4108 that combines the two: large
imbalances will be calculated by the preemption monitor, and micro corrections
will be handled by the scheduler's allocation logic. I've uploaded a doc and a
POC patch to YARN-4108; please kindly review.
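To make the split concrete, here is a minimal sketch in Python. It is not the YARN-4108 patch or any real YARN API; every name in it (Container, Node, queue_imbalance, pick_victims) is hypothetical, and memory is the only resource considered. The monitor step only computes how far each queue is over its guarantee, and the allocation step preempts on a node only if doing so actually lets the pending request fit on that node.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    queue: str
    mem_gb: int


@dataclass
class Node:
    name: str
    free_gb: int
    containers: list = field(default_factory=list)


def queue_imbalance(used_gb, guaranteed_gb):
    """Monitor step (hypothetical): GB each queue holds above its guarantee."""
    return {q: max(0, used_gb[q] - guaranteed_gb[q]) for q in used_gb}


def pick_victims(node, request_gb, over_by):
    """Allocation step (hypothetical): pick containers on ONE node whose
    removal lets the request fit there. Returns [] when the request already
    fits or when preempting on this node cannot make it fit."""
    needed = request_gb - node.free_gb
    if needed <= 0:
        return []
    victims, freed = [], 0
    budget = dict(over_by)  # how much may still be taken from each queue
    # Largest containers first, to keep the number of victims small.
    for c in sorted(node.containers, key=lambda c: c.mem_gb, reverse=True):
        if freed >= needed:
            break
        if budget.get(c.queue, 0) >= c.mem_gb:
            victims.append(c)
            freed += c.mem_gb
            budget[c.queue] -= c.mem_gb
    # Preempt nothing unless the request would really fit afterwards.
    return victims if freed >= needed else []


# Queue "a" uses 16 GB against an 8 GB guarantee, so it is 8 GB over.
over = queue_imbalance({"a": 16, "b": 0}, {"a": 8, "b": 8})

# Eight 1-GB containers spread over eight full nodes: an 8-GB request
# cannot fit on any of them, so nothing is preempted.
spread = [Node(f"n{i}", 0, [Container("a", 1)]) for i in range(8)]
victims_spread = [pick_victims(n, 8, over) for n in spread]

# The same eight containers packed on one node: preempting them helps.
packed = Node("n8", 0, [Container("a", 1) for _ in range(8)])
victims_packed = pick_victims(packed, 8, over)
```

With the spread layout, a size-blind policy would still kill eight containers to "free 8 GB", which is the excessive preemption described above; the node-level check avoids that.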
> Consider container request size during CS preemption
> ----------------------------------------------------
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.0.0, 2.8.0, 2.7.3
> Reporter: Eric Payne
> Assignee: Eric Payne
>
> There are multiple reasons why preemption could unnecessarily preempt
> containers. One is that an app could be requesting a large container (say,
> 8 GB), and the preemption monitor could conceivably preempt multiple
> containers (say, eight 1-GB containers) in order to fill the large container
> request. These smaller containers would then be rejected by the requesting AM
> and potentially given right back to the preempted app.