[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328631#comment-15328631
 ] 

Inigo Goiri commented on YARN-5215:
-----------------------------------

[~sunilg], I think your first two points are related. Let me try to give a full 
example of what it is now and what it would become with this. Right now, if we 
have a node with 16GB, we usually set the memory usable by the NM 
({{yarn.nodemanager.resource.memory-mb}}) to a smaller value like 14GB; the 
idea behind this is that the other services (the NM itself and the DN) can 
potentially use the remaining 2GB.

With this new approach, we can potentially set 
{{yarn.nodemanager.resource.memory-mb}} to 16GB and if the external processes 
consume 1GB, the NM can only allocate containers up to 15GB. Note that in our 
actual deployment, we set {{yarn.nodemanager.resource.memory-mb}} to something 
like 15GB to have some reserve to handle spikes.
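To make the arithmetic above concrete, here is a minimal sketch of how the 
allocatable memory would be derived from the configured capacity and the 
measured utilization (method and field names are illustrative assumptions, 
not from the patch):

```java
/**
 * Sketch of deriving allocatable memory from configured capacity and
 * measured external load; names here are illustrative, not from the patch.
 */
public class AllocatableMemory {

  /**
   * @param configuredMb value of yarn.nodemanager.resource.memory-mb
   * @param nodeUsedMb   total memory in use on the node (from monitoring)
   * @param containersMb memory used by the YARN containers themselves
   * @return memory the scheduler can still hand out for containers
   */
  static long allocatableMb(long configuredMb, long nodeUsedMb,
      long containersMb) {
    // Whatever the node uses beyond its containers is external load.
    long externalMb = Math.max(0, nodeUsedMb - containersMb);
    return Math.max(0, configuredMb - externalMb);
  }

  public static void main(String[] args) {
    // 16GB configured, 9GB in use of which 8GB is containers:
    // 1GB is external, so containers can be scheduled up to 15GB.
    System.out.println(allocatableMb(16384, 9216, 8192)); // prints 15360
  }
}
```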

Regarding your third point, we can add some kind of EWMA (exponentially 
weighted moving average), but I'm open to other proposals for smoothing spikes 
in the utilization numbers.
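For reference, an EWMA smoother for the utilization samples could look roughly 
like this (a sketch only; the class name and the alpha value are assumptions 
for illustration):

```java
/**
 * Exponentially weighted moving average to damp spikes in utilization
 * samples. A sketch; the name and alpha are illustrative assumptions.
 */
public class UtilizationEwma {
  private final double alpha; // smoothing factor in (0, 1]; higher reacts faster
  private double value;       // current smoothed estimate
  private boolean initialized = false;

  public UtilizationEwma(double alpha) {
    this.alpha = alpha;
  }

  /** Fold a new raw sample into the smoothed estimate. */
  public double sample(double raw) {
    if (!initialized) {
      value = raw; // first sample seeds the average
      initialized = true;
    } else {
      value = alpha * raw + (1 - alpha) * value;
    }
    return value;
  }

  public double get() {
    return value;
  }

  public static void main(String[] args) {
    UtilizationEwma ewma = new UtilizationEwma(0.5);
    ewma.sample(1.0);
    ewma.sample(3.0); // spike is damped: 0.5*3 + 0.5*1 = 2.0
    System.out.println(ewma.get()); // prints 2.0
  }
}
```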

> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
>                 Key: YARN-5215
>                 URL: https://issues.apache.org/jira/browse/YARN-5215
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Inigo Goiri
>         Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
