[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332601#comment-15332601
 ] 

Nathan Roberts commented on YARN-5215:
--------------------------------------

Thanks [~elgoiri] for the work. Maybe Summit would be a good time to get 
interested parties together to settle on a direction?

I do see this being very similar to what YARN-5202 is doing. In fact, I think 
if we just removed the lower bounds in YARN-5202 (i.e. allow it to go below a 
node's declared resource), it would effectively accomplish the same thing. For 
example, if a memory-hungry process starts up on a node, node utilization will 
increase beyond the desired thresholds and the node's resource available for 
scheduling will be reduced. In my mind, we should basically set a utilization 
target and then have schedulerNode adjust the node's resource either up or 
down depending on where we are in relation to the target. The interesting 
part, I think, is which inputs are used to decide whether, and by how much, a 
node's resource should be adjusted.
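
To make the utilization-target idea concrete, here is a minimal sketch of one 
adjustment step. It is only an illustration under stated assumptions: the 
class name, the method, its parameters, and the per-step limit are all 
hypothetical and are not taken from YARN-5202 or from this patch. Memory is 
the only resource considered, and no upper cap is applied.

```java
/** Hypothetical sketch of a utilization-target adjustment; not YARN code. */
final class UtilizationTargetSketch {
    private UtilizationTargetSketch() {}

    /**
     * Returns the memory (MB) a node should offer to the scheduler next.
     *
     * @param schedulableMb memory currently offered to the scheduler
     * @param usedMb        memory in use on the node, by any process
     * @param physicalMb    physical memory installed on the node
     * @param targetPercent desired node utilization, e.g. 80
     * @param maxStepMb     assumed limit on the change per adjustment
     */
    static long adjust(long schedulableMb, long usedMb, long physicalMb,
                       int targetPercent, long maxStepMb) {
        long targetMb = physicalMb * targetPercent / 100;
        // Positive when the node is below the target, negative when above it.
        long headroomMb = targetMb - usedMb;
        // Clamp the step so a single noisy sample cannot swing the node size.
        long stepMb = Math.max(-maxStepMb, Math.min(maxStepMb, headroomMb));
        // No floor at the declared resource: only zero bounds the result.
        return Math.max(0L, schedulableMb + stepMb);
    }
}
```

With a 100000 MB node, an 80% target, and a 4096 MB step limit, a node 
offering 64000 MB shrinks to 59904 MB when 90000 MB is in use and grows to 
68096 MB when 70000 MB is in use. Which utilization inputs feed usedMb is 
exactly the open question above.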

Regarding the patch: at least on Linux, I think we have to be careful about 
aggregating all of the container utilizations together. A simple example where 
I think this might not do the right thing is a large MR job that looks up 
data in a large mmap'ed lookup table. RSS as calculated via /proc/<pid>/stat 
does not understand shared pages (afaik). This means we'll be counting this 
mmap'ed file once for every container running on the node, even though its 
pages are resident only once. We're frequently running 50+ containers on a 
node, so if this job has lots of tasks running on a node, we'd have tens of GB 
of error. I know we keep it from going negative, which is important, but in 
this case we'll underestimate the amount of external resource usage on the 
node. 
{noformat}
+      externalUtilization = ResourceUtilization.newInstance(nodeUtilization);
+      externalUtilization.subtractFrom(
+          containersUtilization.getPhysicalMemory(),
+          containersUtilization.getVirtualMemory(),
+          containersUtilization.getCPU());
{noformat}
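
A rough worked example of the error described above, as a sketch only. The 
class and method names are hypothetical, and the numbers (50 containers, 512 
MB private memory each, a 1024 MB shared lookup table, 8192 MB of real 
external usage) are made up for illustration. It assumes each container's RSS 
includes the whole shared file and that the external estimate is floored at 
zero.

```java
/** Hypothetical arithmetic sketch of shared-page over-counting; not YARN code. */
final class SharedPageErrorSketch {
    private SharedPageErrorSketch() {}

    /** Sum of per-container RSS when every container maps the same file. */
    static long summedContainerRssMb(int containers, long privateMbEach,
                                     long sharedMb) {
        return containers * (privateMbEach + sharedMb);
    }

    /** Memory the containers really occupy: shared pages are resident once. */
    static long actualContainerMb(int containers, long privateMbEach,
                                  long sharedMb) {
        return containers * privateMbEach + sharedMb;
    }

    /** External estimate: node usage minus container usage, floored at zero. */
    static long externalEstimateMb(long nodeUsedMb, long containersMb) {
        return Math.max(0L, nodeUsedMb - containersMb);
    }
}
```

With these numbers the summed RSS is 76800 MB while the containers really 
occupy 26624 MB, an over-count of 50176 MB (about 49 GB). If the node 
reports 34816 MB in use (26624 MB of containers plus 8192 MB external), the 
estimate is max(0, 34816 - 76800) = 0 MB, so the 8192 MB of external usage 
is missed entirely.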


> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
>                 Key: YARN-5215
>                 URL: https://issues.apache.org/jira/browse/YARN-5215
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Inigo Goiri
>         Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
