[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371947#comment-15371947
 ] 

Inigo Goiri edited comment on YARN-5356 at 7/12/16 12:15 AM:
-------------------------------------------------------------

In general, we have 3 values:
* Actual resources of the full machine. This currently comes from 
{{NodeManagerHardwareUtils}} if I remember correctly. For example, it can be 12 
cores.
* Resources available to the Node Manager. These are currently set in 
yarn-site.xml with the key {{yarn.nodemanager.resource.cpu-vcores}} or updated 
through {{updateNodeResource()}}. For example, 6 cores.
* Actual utilization of the machine. This is extracted in the 
{{NodeResourceMonitor}} with the {{ResourceCalculatorPlugin}}. For example, it 
can be 400%, which would mean 4 of the 12 cores are in use.
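
To make the arithmetic concrete, here is a minimal sketch that relates the three values. {{NodeCpuView}} and its fields are hypothetical names for illustration, not YARN classes, and it assumes that one vcore corresponds to one physical core and that 100% utilization equals one fully busy core.

```java
/** Hypothetical helper, not part of YARN: relates the three values above. */
final class NodeCpuView {
    private final int physicalCores;  // full machine, e.g. 12
    private final int nmVcores;       // share given to the NM, e.g. 6
    private final double cpuPercent;  // utilization, 100% == one busy core

    NodeCpuView(int physicalCores, int nmVcores, double cpuPercent) {
        if (physicalCores <= 0 || nmVcores <= 0 || cpuPercent < 0) {
            throw new IllegalArgumentException("invalid node values");
        }
        this.physicalCores = physicalCores;
        this.nmVcores = nmVcores;
        this.cpuPercent = cpuPercent;
    }

    /** Number of cores in use, e.g. 400% -> 4.0 cores. */
    double usedCores() {
        return cpuPercent / 100.0;
    }

    /** Fraction of the whole machine in use. */
    double machineUtilization() {
        return usedCores() / physicalCores;
    }

    /** Fraction of the NM's share in use (assumes 1 vcore == 1 core). */
    double nmUtilization() {
        return usedCores() / nmVcores;
    }

    public static void main(String[] args) {
        NodeCpuView view = new NodeCpuView(12, 6, 400.0);
        System.out.println("used cores:          " + view.usedCores());
        System.out.println("machine utilization: " + view.machineUtilization());
        System.out.println("NM utilization:      " + view.nmUtilization());
    }
}
```

With the example numbers (12 cores, 6 for the NM, 400%), the same 4 busy cores are about one third of the machine but two thirds of the NM's share, so the 400% figure alone does not say which of the two applies.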

[~nroberts], as I understand it, your problem is that with the current approach 
you know that the NM has 6 cores available and that 4 of them are used. 
However, the machine as a whole is not that utilized (~30%, i.e. 4 of 12 
cores). Correct? In that case, we would only need to report the actual size of 
the machine once, at registration time, since it never changes. I am not sure 
{{ResourceUtilization}} is the right place for it, because that is sent again 
with every heartbeat.
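
A rough sketch of that split, with the machine size stored once at registration and only the utilization refreshed on each heartbeat. {{NodeTracker}} and its methods are hypothetical and do not correspond to existing RM or NM APIs; the sketch again assumes that 100% utilization equals one fully busy core.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical RM-side tracker; none of these names are YARN APIs. */
final class NodeTracker {
    // Static fact about the node, stored once at registration.
    private final Map<String, Integer> physicalCores = new HashMap<>();
    // Latest dynamic reading, overwritten on every heartbeat.
    private final Map<String, Double> cpuPercent = new HashMap<>();

    /** Called once per node: records the size of the full machine. */
    void register(String nodeId, int cores) {
        if (cores <= 0) {
            throw new IllegalArgumentException("cores must be positive");
        }
        physicalCores.put(nodeId, cores);
    }

    /** Called on every heartbeat: carries only the changing value. */
    void heartbeat(String nodeId, double percent) {
        if (!physicalCores.containsKey(nodeId)) {
            throw new IllegalStateException("node not registered: " + nodeId);
        }
        cpuPercent.put(nodeId, percent);
    }

    /** Fraction of the whole machine in use (100% == one busy core). */
    double machineUtilization(String nodeId) {
        Integer cores = physicalCores.get(nodeId);
        Double percent = cpuPercent.get(nodeId);
        if (cores == null || percent == null) {
            throw new IllegalStateException("no data for node: " + nodeId);
        }
        return (percent / 100.0) / cores;
    }
}
```

The point of the design is that the heartbeat payload stays unchanged in size, while the RM can still answer "how full is the machine" by combining it with the value it learned at registration.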



> ResourceUtilization should also include resource availability
> -------------------------------------------------------------
>
>                 Key: YARN-5356
>                 URL: https://issues.apache.org/jira/browse/YARN-5356
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc.).
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, this isn't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



