[
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325713#comment-15325713
]
Sunil G commented on YARN-5215:
-------------------------------
Hi [~elgoiri]
Thanks for initiating this. This looks useful. I have some comments on this.
- Even though we have a minimumAllocation from the Scheduler, it's better to
define a dead zone around the delta (nodeUtilization - containersUtilization),
as it may help to avoid thrashing.
- externalUtilization is computed as follows:
{code}
+ externalUtilization = ResourceUtilization.newInstance(nodeUtilization);
+ externalUtilization.subtractFrom(
+ containersUtilization.getPhysicalMemory(),
+ containersUtilization.getVirtualMemory(),
+ containersUtilization.getCPU());
{code}
Please correct me if I have misunderstood, but I think there is a corner case.
Assume a node where 16GB of memory is available and only 8GB is assigned to the
NodeManager, and this node also has some other process running. If 4GB is
used by such an external process, I think the node's {{getUnallocatedResource}}
will come out as {{8GB (NM configured capacity) - 4GB (external process)}}.
{code}
Resources.subtractFrom(unallocatedResource, externalResource);
{code}
I think NodeResourceMonitorImpl returns the resourceUtilization of the whole
node. It is not capped by the node's configured capacity.
- This is a suggestion. As per the current design, we jump quickly to the
possible unallocated resource in a node. Would it be better to reach this
aggregated unallocated limit only after checking a few cycles of node
utilization?
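The three points above can be sketched roughly as follows. This is plain Java with no Hadoop dependencies and is not code from the patch: the class name, method names, the dead-zone size, and the smoothing weight are all hypothetical. It also assumes one possible reading of the corner case, namely that external usage should only reduce the NM's unallocated resource once it exceeds the memory that lies outside the NM's configured share.

```java
/** Illustrative sketch only; names and parameters are assumptions. */
final class ExternalLoadSketch {
    private final long deadZoneMb;   // hypothetical: ignore changes smaller than this
    private final double alpha;      // hypothetical: smoothing weight in (0, 1]
    private double smoothedMb = 0.0; // running estimate of the external load
    private long appliedMb = 0;      // value currently exposed to the scheduler

    ExternalLoadSketch(long deadZoneMb, double alpha) {
        this.deadZoneMb = deadZoneMb;
        this.alpha = alpha;
    }

    /** External usage = whole-node usage minus container usage, never negative. */
    static long externalMb(long nodeUsedMb, long containersUsedMb) {
        return Math.max(0L, nodeUsedMb - containersUsedMb);
    }

    /** Only the part of the external usage that spills into the NM's share. */
    static long externalImpactMb(long externalMb, long physicalMb, long nmCapacityMb) {
        long outsideNmShareMb = Math.max(0L, physicalMb - nmCapacityMb);
        return Math.max(0L, externalMb - outsideNmShareMb);
    }

    /** Feed one monitoring cycle; returns the external load to subtract. */
    long update(long impactMb) {
        // Move gradually toward the new sample instead of jumping to it.
        smoothedMb += alpha * (impactMb - smoothedMb);
        long candidateMb = Math.round(smoothedMb);
        // Dead zone: only publish a new value when the change is large enough.
        if (Math.abs(candidateMb - appliedMb) >= deadZoneMb) {
            appliedMb = candidateMb;
        }
        return appliedMb;
    }
}
```

Under these assumptions, the 16GB node with 8GB assigned to the NM and 4GB of external usage gives {{externalImpactMb(4096, 16384, 8192) == 0}}, so the NM would still report 8GB unallocated. The smoothing and the dead zone in {{update}} are independent of that choice and could be applied to the uncapped value as well.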
> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the
> resources. The proposal is to use the utilization information in the node and
> the containers to estimate how much is consumed by external processes and
> schedule based on this estimation.