[
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328631#comment-15328631
]
Inigo Goiri commented on YARN-5215:
-----------------------------------
[~sunilg], I think your first two points are related. Let me try to give a full
example of what happens now and what it would become with this change. Right
now, if we have a node with 16GB, we usually set the memory usable by the NM
({{yarn.nodemanager.resource.memory-mb}}) to a smaller number like 14GB; the
idea behind this is that the other services (the NM itself and the DN) can
potentially use 2GB.
With this new approach, we can potentially set
{{yarn.nodemanager.resource.memory-mb}} to the full 16GB, and if the external
processes consume 1GB, the NM would only allocate containers totaling up to
15GB. Note that in our actual deployment, we set
{{yarn.nodemanager.resource.memory-mb}} to something like 15GB to keep some
reserve for handling spikes.
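The arithmetic in the example above can be sketched as follows. This is illustrative only; the method and class names are hypothetical and not part of the YARN APIs or the attached patches:

```java
// Illustrative sketch: with the proposed approach, the memory the NM can
// schedule shrinks as external processes consume more of the node.
// Names here are hypothetical, not actual YARN code.
public class AllocatableMemory {

    /** Memory (MB) the NM can still hand out to containers. */
    static long allocatableMb(long configuredMb, long externalUsedMb) {
        // Never report a negative capacity if external load exceeds the
        // configured limit.
        return Math.max(0, configuredMb - externalUsedMb);
    }

    public static void main(String[] args) {
        long configured = 16 * 1024; // yarn.nodemanager.resource.memory-mb = 16GB
        long external = 1024;        // external processes (NM, DN, ...) use 1GB
        System.out.println(allocatableMb(configured, external)); // prints 15360
    }
}
```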
Regarding your third point, we can add some kind of EWMA (exponentially
weighted moving average), but I'm open to other proposals for smoothing spikes
in the utilization numbers.
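A minimal sketch of the kind of EWMA smoothing mentioned above, assuming a smoothing factor alpha in (0, 1]; this is not code from the attached patches, just an illustration of how a spike in raw utilization samples would be damped:

```java
// Hypothetical sketch of EWMA smoothing for node utilization samples.
// alpha close to 1 tracks raw samples closely; alpha close to 0 smooths more.
public class UtilizationEwma {
    private final double alpha;
    private double smoothed;
    private boolean initialized = false;

    public UtilizationEwma(double alpha) {
        this.alpha = alpha;
    }

    /** Feed a raw utilization sample (e.g., external memory usage in MB). */
    public double update(double sample) {
        if (!initialized) {
            // First sample seeds the average.
            smoothed = sample;
            initialized = true;
        } else {
            smoothed = alpha * sample + (1 - alpha) * smoothed;
        }
        return smoothed;
    }

    public static void main(String[] args) {
        UtilizationEwma ewma = new UtilizationEwma(0.5);
        ewma.update(1024);             // steady state around 1GB
        double v = ewma.update(2048);  // a 2GB spike is damped to 1536
        System.out.println(v);
    }
}
```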
> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the
> resources. The proposal is to use the utilization information in the node and
> the containers to estimate how much is consumed by external processes and
> schedule based on this estimation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)