[ 
https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036485#comment-15036485
 ] 

Junping Du commented on YARN-4408:
----------------------------------

Thanks Robert for updating the patch. Can we log the messages here at WARN 
level, given that this is an unusual case and logging is only enabled for INFO 
or above by default?

> NodeManager still reports negative running containers
> -----------------------------------------------------
>
>                 Key: YARN-4408
>                 URL: https://issues.apache.org/jira/browse/YARN-4408
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-4408.001.patch, YARN-4408.002.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a 
> negative number of running containers.  However, it missed a rare case where 
> this can still happen.
> YARN-1697 added a flag to indicate if the container was actually launched 
> ({{LOCALIZED}} to {{RUNNING}}) or not ({{LOCALIZED}} to {{KILLING}}), which 
> is then checked when transitioning from {{CONTAINER_CLEANEDUP_AFTER_KILL}} to 
> {{DONE}} and {{EXITED_WITH_FAILURE}} to {{DONE}} to only decrement the gauge 
> if we actually ran the container and incremented the gauge.  However, this 
> flag is not checked while transitioning from {{EXITED_WITH_SUCCESS}} to 
> {{DONE}}.
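
For illustration only, here is a minimal sketch of the guard described above. It is not the actual patch or the real NodeManager code: the class names, method names and the {{wasLaunched}} field are hypothetical stand-ins. It assumes the fix amounts to applying the same "was it launched?" check on every transition into {{DONE}}, including the one from {{EXITED_WITH_SUCCESS}}.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the NodeManager "running containers" gauge. */
class RunningContainersGauge {
    private final AtomicInteger running = new AtomicInteger();

    void increment() { running.incrementAndGet(); }
    void decrement() { running.decrementAndGet(); }
    int value() { return running.get(); }
}

/** Hypothetical, heavily simplified container; not the real state machine. */
class SketchContainer {
    private final RunningContainersGauge gauge;
    // Assumed flag: set only when the container really started running.
    private boolean wasLaunched = false;

    SketchContainer(RunningContainersGauge gauge) {
        this.gauge = gauge;
    }

    /** Models LOCALIZED -> RUNNING: the only place the gauge goes up. */
    void onLaunched() {
        wasLaunched = true;
        gauge.increment();
    }

    /**
     * Models any transition into DONE (from CONTAINER_CLEANEDUP_AFTER_KILL,
     * EXITED_WITH_FAILURE or EXITED_WITH_SUCCESS). The gauge is decremented
     * only if it was incremented for this container earlier.
     */
    void onDone() {
        if (wasLaunched) {
            gauge.decrement();
        }
    }
}
```

With this guard in one shared place, a container that goes {{LOCALIZED}} to {{KILLING}} and then to {{DONE}} leaves the gauge untouched, so it cannot drop below zero regardless of which exit state it passed through.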



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
