[
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332601#comment-15332601
]
Nathan Roberts commented on YARN-5215:
--------------------------------------
Thanks [~elgoiri] for the work. Maybe Summit would be a good time to get
interested parties together to settle on a direction?
I do see this being very similar to what YARN-5202 is doing. In fact I think if
we just removed the lower bounds in YARN-5202 (i.e. allow it to go below a
node's declared resource), it would effectively accomplish the same thing. e.g.
if a memory hungry process starts up on a node, node utilization will increase
beyond the desired thresholds and the node's resource available for scheduling
will be reduced. In my mind, we should basically set a utilization target and
then have schedulerNode adjust the node's resource either up or down depending
on where we are in relation to the target. The inputs used to decide if and by
how-much a node's resource should be adjusted, is where I think it's
interesting.
Regarding the patch. At least on Linux I think we have to be careful about
aggregating all of the container utilizations together. A simple example where
I think this might not do the right thing is a large MR job that is looking up
data in a large mmap'ed lookup table. RSS as calculated via /proc/<pid>/stat
does not understand shared pages (afaik). This means we'll be double-counting
this mmap'ed file for every container running on the node. We're frequently
running 50+ containers on a node so if this job has lots of tasks running on a
node, we'd have 10's of GB of error. I know we keep it from going negative
which is impportant, but in this case we'll underestimate the amount of
external resource running on the node.
{noformat}
+ externalUtilization = ResourceUtilization.newInstance(nodeUtilization);
+ externalUtilization.subtractFrom(
+ containersUtilization.getPhysicalMemory(),
+ containersUtilization.getVirtualMemory(),
+ containersUtilization.getCPU());
{noformat}
> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the
> resources. The proposal is to use the utilization information in the node and
> the containers to estimate how much is consumed by external processes and
> schedule based on this estimation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]