[ https://issues.apache.org/jira/browse/YARN-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116452#comment-16116452 ]
Yang Wang commented on YARN-6212:
---------------------------------
Hi Miklos Szegedi,
I'm afraid this JIRA is not a duplicate of YARN-3933.
The primary cause of the negative values is that the metrics are not recovered
properly when the NM restarts.
*AllocatedContainers, ContainersLaunched, AllocatedGB, AvailableGB, AllocatedVCores, AvailableVCores*
in the metrics need to be recovered when the NM restarts.
This should be done in ContainerManagerImpl#recoverContainer.
The scenario can be reproduced with the following steps:
# Make sure YarnConfiguration.NM_RECOVERY_ENABLED=true and YarnConfiguration.NM_RECOVERY_SUPERVISED=true are set in the NM
# Submit an application and keep it running
# Restart the NM
# Stop the application
# The metrics now show negative values
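
The steps above can be illustrated with a small, self-contained sketch. It is only a toy model of the bookkeeping, not the actual NodeManager code: ToyMetrics, simulate and the 8 GB / 2 GB sizes are hypothetical names and values chosen for illustration. It assumes that the counters start from zero after a restart and that releasing a container always decrements them.

```java
public class MetricsRecoverySketch {

    /** Toy stand-in for the NM metrics; NOT the real NodeManagerMetrics class. */
    static class ToyMetrics {
        int allocatedContainers;
        int allocatedGB;
        int availableGB;

        ToyMetrics(int totalGB) {
            this.availableGB = totalGB;
        }

        void allocateContainer(int gb) {
            allocatedContainers++;
            allocatedGB += gb;
            availableGB -= gb;
        }

        void releaseContainer(int gb) {
            allocatedContainers--;
            allocatedGB -= gb;
            availableGB += gb;
        }
    }

    /**
     * Replays the scenario: start a container, restart the NM, stop the application.
     * recoverMetrics models whether the recovered container is counted again
     * after the restart (hypothetical flag, for illustration only).
     */
    static ToyMetrics simulate(boolean recoverMetrics) {
        int totalGB = 8;      // assumed node capacity
        int containerGB = 2;  // assumed container size

        // An application is submitted and its container keeps running.
        ToyMetrics beforeRestart = new ToyMetrics(totalGB);
        beforeRestart.allocateContainer(containerGB);

        // NM restart: the in-memory counters start from zero again,
        // while the container itself survives the restart.
        ToyMetrics afterRestart = new ToyMetrics(totalGB);
        if (recoverMetrics) {
            afterRestart.allocateContainer(containerGB);
        }

        // The application stops, so the container is released.
        afterRestart.releaseContainer(containerGB);
        return afterRestart;
    }

    public static void main(String[] args) {
        ToyMetrics lost = simulate(false);
        ToyMetrics recovered = simulate(true);
        System.out.println("no recovery:   AllocatedContainers=" + lost.allocatedContainers
                + " AllocatedGB=" + lost.allocatedGB + " AvailableGB=" + lost.availableGB);
        System.out.println("with recovery: AllocatedContainers=" + recovered.allocatedContainers
                + " AllocatedGB=" + recovered.allocatedGB + " AvailableGB=" + recovered.availableGB);
    }
}
```

In this toy model, skipping the recovery step leaves the allocated counters below zero and the available memory above the node capacity, while counting the recovered container again brings everything back to the starting values.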
> NodeManager metrics returning wrong negative values
> ---------------------------------------------------
>
> Key: YARN-6212
> URL: https://issues.apache.org/jira/browse/YARN-6212
> Project: Hadoop YARN
> Issue Type: Bug
> Components: metrics
> Affects Versions: 2.7.3
> Reporter: Abhishek Shivanna
>
> It looks like the metrics returned by the NodeManager have negative values
> for metrics that should never be negative. Here is the output from the NM endpoint
> {noformat}
> /jmx?qry=Hadoop:service=NodeManager,name=NodeManagerMetrics
> {noformat}
> {noformat}
> {
>   "beans" : [ {
>     "name" : "Hadoop:service=NodeManager,name=NodeManagerMetrics",
>     "modelerType" : "NodeManagerMetrics",
>     "tag.Context" : "yarn",
>     "tag.Hostname" : "<HOST>",
>     "ContainersLaunched" : 707,
>     "ContainersCompleted" : 9,
>     "ContainersFailed" : 124,
>     "ContainersKilled" : 579,
>     "ContainersIniting" : 0,
>     "ContainersRunning" : 19,
>     "AllocatedGB" : -26,
>     "AllocatedContainers" : -5,
>     "AvailableGB" : 252,
>     "AllocatedVCores" : -5,
>     "AvailableVCores" : 101,
>     "ContainerLaunchDurationNumOps" : 718,
>     "ContainerLaunchDurationAvgTime" : 18.0
>   } ]
> }
> {noformat}
> Is there any circumstance under which the values of AllocatedGB,
> AllocatedContainers and AllocatedVCores can go below 0?