[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259131#comment-15259131 ]
Daniel Templeton commented on YARN-4308:
----------------------------------------
I think it would make sense to test that the negative values are properly
ignored.
I saw that [~kasha] said the pathological case of always getting a negative
value should not occur, but I'm still a little concerned about that case. If
it happens, there will be no externally visible signs as to why the reports are
being skipped. Taking the daemon down to turn on debugging may well change the
state, leaving a confused end user. Is there a way that we can drop an obvious
flag in the logs if the issue persists? Like maybe if we skip _n_ reports in a
row, log a warning?
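
To make the "skip _n_ reports in a row, log a warning" idea concrete, here is a rough sketch. It is not taken from either attached patch: the class name SkippedReportTracker, the accept() method, the threshold parameter, and the Consumer<String> standing in for LOG.warn() are all hypothetical, and how this would hook into ContainersMonitorImpl is an open question.

```java
import java.util.Objects;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of the YARN-4308 patches): counts consecutive
 * negative CPU readings and emits one warning when a streak reaches a threshold.
 */
class SkippedReportTracker {
  // The "n" from the comment above; a sensible default value is an assumption.
  private final int threshold;
  // Stand-in for a real logger call such as LOG.warn(...).
  private final Consumer<String> warn;
  private int consecutiveSkips = 0;

  SkippedReportTracker(int threshold, Consumer<String> warn) {
    if (threshold < 1) {
      throw new IllegalArgumentException("threshold must be at least 1");
    }
    this.threshold = threshold;
    this.warn = Objects.requireNonNull(warn);
  }

  /**
   * Returns true if the reading is usable (zero or positive) and false if it
   * should be skipped. NaN compares false against 0 and is therefore skipped.
   */
  boolean accept(float cpuUsagePercent) {
    if (cpuUsagePercent >= 0) {
      consecutiveSkips = 0;  // a good reading ends the streak
      return true;
    }
    consecutiveSkips++;
    if (consecutiveSkips == threshold) {
      // Warn exactly once per streak so the log is not flooded.
      warn.accept("Skipped " + consecutiveSkips
          + " consecutive CPU utilization reports with negative values");
    }
    return false;
  }
}
```

Warning only when the count first reaches the threshold keeps a persistent failure to one log line per streak, and the same class would give a unit test something to assert against when checking that negative values are ignored.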
> ContainersAggregated CPU resource utilization reports negative usage in first
> few heartbeats
> --------------------------------------------------------------------------------------------
>
> Key: YARN-4308
> URL: https://issues.apache.org/jira/browse/YARN-4308
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.7.1
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: 0001-YARN-4308.patch, 0002-YARN-4308.patch
>
>
> NodeManager reports ContainerAggregated CPU resource utilization as a negative
> value in the first few heartbeat cycles. I added a new debug print and received
> the values below from heartbeats.
> {noformat}
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : -1.0
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : 198.94598
> {noformat}
> It's better to send 0 as the CPU usage rather than a negative value in
> heartbeats, even though it happens only in the first few heartbeats.