[ https://issues.apache.org/jira/browse/YARN-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323332#comment-16323332 ]
Wangda Tan edited comment on YARN-7739 at 1/12/18 1:05 AM:
-----------------------------------------------------------
I personally prefer, by default, not to adjust the global maximum allocation
based on the resources of registered nodes, and to reject any request that
exceeds the maximum allocation.
Thoughts? [~jlowe] / [~asuresh] / [~sunilg] / [~templedf] / [~yufeigu].
was (Author: leftnoteasy):
I personally prefer to not update global's maximum allocation by node's
availabilities by default and reject requests if it exceeds maximum allocation.
Thoughts? [~jlowe] / [~asuresh] / [~sunilg].
> Revisit scheduler resource normalization behavior for max allocation
> --------------------------------------------------------------------
>
> Key: YARN-7739
> URL: https://issues.apache.org/jira/browse/YARN-7739
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Priority: Critical
>
> Currently, the YARN scheduler normalizes each requested resource against a
> maximum allocation derived from the configured maximum allocation and the
> largest registered node's resources. In effect, the scheduler silently caps
> the requested resource at that maximum.
> This can cause problems for applications. For example, a Spark job needs
> 12 GB of memory to run, but the registered NMs have at most 8 GB of memory
> each, so the scheduler allocates an 8 GB container to the application.
> If the app does not double-check the allocated resources after receiving
> containers from the RM, it can hit an OOM that is hard to debug, because the
> scheduler capped the request silently.
> With the introduction of non-mandatory resources, this gets worse. For
> resources like GPUs, we typically set the minimum allocation to 0, since not
> all nodes have GPU devices. An application can therefore ask for 4 GPUs and
> get 0 GPUs, which is a serious problem.
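
To make the difference between the two behaviors concrete, here is a minimal
sketch of silent capping versus rejection. It is a toy model in Python, not
YARN code: Resource, InvalidRequestError, effective_max, normalize_by_capping
and normalize_by_rejecting are all hypothetical names, and the 16 GB / 4 GPU
configured maximum is an assumed value chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a resource vector; not YARN's Resource class."""
    memory_mb: int
    gpus: int = 0


class InvalidRequestError(ValueError):
    """Hypothetical error for a request that exceeds the maximum allocation."""


def effective_max(configured_max: Resource, largest_node: Resource) -> Resource:
    # Behavior described in the issue: the maximum allocation is derived from
    # both the configured maximum and the largest registered node.
    return Resource(
        min(configured_max.memory_mb, largest_node.memory_mb),
        min(configured_max.gpus, largest_node.gpus),
    )


def normalize_by_capping(ask: Resource, maximum: Resource) -> Resource:
    # Silent cap: the ask is reduced and the application is never told.
    return Resource(
        min(ask.memory_mb, maximum.memory_mb),
        min(ask.gpus, maximum.gpus),
    )


def normalize_by_rejecting(ask: Resource, maximum: Resource) -> Resource:
    # Alternative from the comment: fail fast instead of shrinking the ask.
    if ask.memory_mb > maximum.memory_mb or ask.gpus > maximum.gpus:
        raise InvalidRequestError(f"{ask} exceeds maximum allocation {maximum}")
    return ask


configured = Resource(memory_mb=16384, gpus=4)  # assumed configured maximum
largest_node = Resource(memory_mb=8192, gpus=0)  # largest registered NM
ask = Resource(memory_mb=12288, gpus=4)          # 12 GB and 4 GPUs

# Capping against the node-derived maximum shrinks the ask without any error.
capped = normalize_by_capping(ask, effective_max(configured, largest_node))

# Checking against the configured maximum only leaves the ask unchanged.
accepted = normalize_by_rejecting(ask, configured)
```

In this sketch the capped result is 8192 MB and 0 GPUs, which is the silent
downgrade the description warns about. With rejection against the configured
maximum only, the same ask passes through unchanged, while an ask above the
configured maximum (for example 32 GB) raises an error the application can see.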