[
https://issues.apache.org/jira/browse/YARN-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096645#comment-15096645
]
Karthik Kambatla commented on YARN-4512:
----------------------------------------
bq. do you think it makes sense to let the RM make the over-subscription decision?
As mentioned
[here|https://issues.apache.org/jira/browse/YARN-1011?focusedCommentId=15072223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15072223],
here is my reasoning for it being a config on the NM:
Even if we have the knob on the RM, the node still has to support it: monitor
the resource usage on the node and kill the OPPORTUNISTIC containers if need
be. On a cluster with NMs of different versions (say, during a rolling
upgrade), the RM will have to keep track of NMs that support over-subscription.
So, we do need some config for the NM anyway. Further, there could be
node-specific conditions - hardware (e.g. GPU), other services running on the
node etc. - that could affect the over-subscription capacity of the node. For
instance, it might be okay to sign up for 90% of the advertised capacity on
node A, but only 80% on node B. And, this ability to soak up extra work could
change over time.
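To make the per-node argument concrete, here is a rough sketch (not code from the patch) of how an RM could treat each node's over-subscription setting. All names here (NodeReport, over_allocation_pct, opportunistic_headroom_mb) are made up for illustration; a value of None stands in for an NM that does not support or has not enabled over-subscription, such as an older NM during a rolling upgrade.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeReport:
    """Hypothetical per-node state, as the RM might see it (not a YARN class)."""
    name: str
    capacity_mb: int   # advertised memory capacity of the node
    utilized_mb: int   # memory actually in use on the node
    # Percentage of advertised capacity the node is willing to be driven to.
    # None models an NM without over-subscription support or with it disabled.
    over_allocation_pct: Optional[int] = None


def opportunistic_headroom_mb(node: NodeReport) -> int:
    """Memory the RM could still hand out as OPPORTUNISTIC containers."""
    if node.over_allocation_pct is None:
        return 0  # the node cannot monitor/kill, so never over-subscribe it
    limit_mb = node.capacity_mb * node.over_allocation_pct // 100
    return max(0, limit_mb - node.utilized_mb)


nodes = [
    NodeReport("A", capacity_mb=10000, utilized_mb=5000, over_allocation_pct=90),
    NodeReport("B", capacity_mb=10000, utilized_mb=5000, over_allocation_pct=80),
    NodeReport("C", capacity_mb=10000, utilized_mb=5000),  # no NM-side support
]
headroom = {n.name: opportunistic_headroom_mb(n) for n in nodes}
```

With identical capacity and utilization, node A ends up with more room for OPPORTUNISTIC work than node B, and node C gets none, which is the behavior a single RM-wide knob could not express.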
bq. And what does "threshold" mean in ResourceThresholds?
Fair question. We plan to use it for both overAllocationThreshold (utilization
up to which the RM allocates OPPORTUNISTIC containers) and preemptionThreshold
(utilization beyond which the node kills enough OPPORTUNISTIC containers).
My bad, I am updating the design doc to capture these thresholds and the
initial scheduling/promotion policy. Should have it up today or tomorrow.
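Until the design doc is updated, a minimal sketch of how the two thresholds could interact might look like the following. This is illustrative only: the function names, the percentage representation, and the policy of killing the largest OPPORTUNISTIC containers first until utilization is back at the preemption threshold are all assumptions, not what the patch implements.

```python
from typing import List, Tuple


def can_allocate_opportunistic(capacity_mb: int, used_mb: int,
                               over_allocation_pct: int) -> bool:
    """RM side: allocate OPPORTUNISTIC containers only below this threshold."""
    return used_mb < capacity_mb * over_allocation_pct // 100


def select_victims(capacity_mb: int, used_mb: int,
                   opportunistic: List[Tuple[str, int]],
                   preemption_pct: int) -> List[str]:
    """Node side: pick OPPORTUNISTIC containers to kill once utilization
    exceeds the preemption threshold.

    `opportunistic` holds (container_id, used_mb) pairs. Killing the largest
    containers first is an assumed policy chosen to keep the sketch short.
    """
    excess_mb = used_mb - capacity_mb * preemption_pct // 100
    victims = []
    for container_id, container_mb in sorted(
            opportunistic, key=lambda c: c[1], reverse=True):
        if excess_mb <= 0:
            break  # utilization is back at or below the threshold
        victims.append(container_id)
        excess_mb -= container_mb
    return victims


running = [("c1", 200), ("c2", 500), ("c3", 100)]
victims = select_victims(10000, 9800, running, preemption_pct=95)
```

Keeping preemptionThreshold above overAllocationThreshold leaves a band in which the RM stops adding OPPORTUNISTIC work but the node does not yet kill anything.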
> Provide a knob to turn on over-allocation
> -----------------------------------------
>
> Key: YARN-4512
> URL: https://issues.apache.org/jira/browse/YARN-4512
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Attachments: YARN-4512-YARN-1011.001.patch,
> yarn-4512-yarn-1011.002.patch, yarn-4512-yarn-1011.003.patch
>
>
> We need two configs for over-allocation - one to specify the threshold up to
> which it is okay to over-allocate, another to specify the threshold after
> which OPPORTUNISTIC containers should be preempted.