[ 
https://issues.apache.org/jira/browse/YARN-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963572#comment-14963572
 ] 

Junping Du commented on YARN-4277:
----------------------------------

I don't think this is a duplicate of YARN-4273, as that one is about leaked 
applications (with a step where the user kills the application). This scenario 
should already be handled, since the NM should recover and report its running 
containers to the RM whether or not the RM has failed over.
[~sandflee], have you seen this as a real problem in a cluster? If so, could you 
add more details, such as any exceptions in the NM/RM logs, so we can understand 
what is going wrong there? Thanks!
CC [~jlowe].

> containers would be leaked if nm crashed  and rm failover
> ---------------------------------------------------------
>
>                 Key: YARN-4277
>                 URL: https://issues.apache.org/jira/browse/YARN-4277
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: sandflee
>
> NM restart and RM HA are enabled.
> 1. The NM crashes; after the timeout, the RM sends a container-complete 
> message to the corresponding AM.
> 2. The RM fails over.
> 3. The NM restarts and registers with the RM, recovering the containers 
> running on the NM; these containers are leaked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
