[ 
https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980275#comment-14980275
 ] 

Sunil G commented on YARN-4308:
-------------------------------

Thanks [~kasha] and [~djp] for clarifying.
As part of YARN-4292, we were trying to expose the ResourceUtilization of nodes 
through the REST API, and I came across a negative CPU usage in the containers' 
ResourceUtilization at start time, when containers had just been allocated. I 
thought it might confuse users, hence I raised this issue. Here is a snippet 
from the REST output (proposed sample output):
{noformat}
nodePhysicalMemoryMB: 4641
nodeVirtualMemoryMB: 0
nodeCPUUsage: 10.576282501220703
containersPhysicalMemoryMB: 1297
containersVirtualMemoryMB: 0
containersCPUUsage: -1.925473
{noformat}
I understand the point from the discussion in YARN-3304: showing 0 may give 
the impression that all metrics are working fine. Yes, that is correct.
But I think the scenario here is slightly different. When 
{{CpuTimeTracker#getCpuTrackerUsagePercent}} is called for the first time, 
*lastSampleTime* is not yet available, so the method always returns a negative 
value on the first call. That value propagates to the RM through the heartbeat, 
and further calculations are performed on it along the way, such as the 
calculation of {{milliVcoresUsed}} in {{ContainersMonitorImpl}}. I think some 
specific handling can be done for this case, since it always happens on the 
first call, unlike a genuine non-availability of the metric. What do you think?
{code}
public float getCpuTrackerUsagePercent() {
  if (lastSampleTime == UNAVAILABLE ||
      lastSampleTime > sampleTime) {
    // lastSampleTime > sampleTime may happen when the system time is changed
    lastSampleTime = sampleTime;
    lastCumulativeCpuTime = cumulativeCpuTime;
    return cpuUsage;
  }
...
{code}
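
To illustrate the first-call behaviour and one possible handling, here is a minimal, self-contained sketch. It is not the Hadoop implementation: the class {{CpuUsageSketch}}, its {{update}} method, and the {{forHeartbeat}} helper are hypothetical names, and the usage formula is a simplified assumption (CPU time delta over wall-clock delta). It only shows that the first sample yields the UNAVAILABLE sentinel, and that the reporting side could map a negative value to 0 before it goes into the heartbeat.

```java
// Hypothetical, simplified stand-in for the tracker logic quoted above.
// Names and formula are assumptions for illustration, not Hadoop code.
final class CpuUsageSketch {
  static final int UNAVAILABLE = -1;

  private long lastSampleTime = UNAVAILABLE;
  private long lastCumulativeCpuTime = UNAVAILABLE;
  private float cpuUsage = UNAVAILABLE;

  /**
   * Records a sample and returns the usage percent. Until a second sample
   * exists there is no interval to measure, so UNAVAILABLE (-1) is returned.
   */
  float update(long sampleTimeMs, long cumulativeCpuTimeMs) {
    if (lastSampleTime == UNAVAILABLE || lastSampleTime > sampleTimeMs) {
      // First call (or the clock moved backwards): only store a baseline.
      lastSampleTime = sampleTimeMs;
      lastCumulativeCpuTime = cumulativeCpuTimeMs;
      return cpuUsage;
    }
    long elapsedMs = sampleTimeMs - lastSampleTime;
    if (elapsedMs > 0) {
      // Assumed formula: CPU time consumed over wall-clock time elapsed.
      cpuUsage = (cumulativeCpuTimeMs - lastCumulativeCpuTime) * 100f / elapsedMs;
      lastSampleTime = sampleTimeMs;
      lastCumulativeCpuTime = cumulativeCpuTimeMs;
    }
    return cpuUsage;
  }

  /** One possible handling: report 0 instead of a negative sentinel. */
  static float forHeartbeat(float usagePercent) {
    return usagePercent < 0 ? 0f : usagePercent;
  }
}
```

With this sketch, the first call to {{update}} returns -1 and {{forHeartbeat}} turns it into 0, while later calls pass the measured value through unchanged. Note that this simple clamp does not distinguish the first-sample case from a genuinely unavailable metric; whether that distinction is needed is the open question above.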


> ContainersAggregated CPU resource utilization reports negative usage in first 
> few heartbeats
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-4308
>                 URL: https://issues.apache.org/jira/browse/YARN-4308
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-4308.patch
>
>
> NodeManager reports ContainersAggregated CPU resource utilization as a 
> negative value in the first few heartbeat cycles. I added a new debug print 
> and received the values below from heartbeats.
> {noformat}
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : -1.0
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : 198.94598
> {noformat}
> It is better to send 0 as CPU usage rather than sending negative values in 
> heartbeats, even though this happens only in the first few heartbeats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
