[ 
https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202287#comment-14202287
 ] 

Jason Lowe commented on YARN-2825:
----------------------------------

Thanks for the patch, Jian!

Is there a reason we need to cast to ContainerImpl?  I think calling

  context.getContainers().get(containerId).getContainerState() ==
    org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerState.DONE

would be equivalent and cleaner, since we wouldn't assume the container
implementation.  Or we could get the container status and check for COMPLETE,
which is what other parts of the code are doing.
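
A minimal sketch of the check suggested above, for illustration only. The
types here (ContainerLeakSketch, its reduced ContainerState enum, its
Container interface, and the String container IDs) are simplified,
hypothetical stand-ins, not the real NodeManager classes, and the method is
not the actual body of removeCompletedContainersFromContext. It only shows
the idea of checking state through the interface, without a cast, before
removing an acknowledged container from the context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ContainerLeakSketch {
    // Hypothetical, reduced stand-in for the NM-internal ContainerState enum.
    enum ContainerState { RUNNING, DONE }

    // Hypothetical stand-in for the NM Container interface; only the one
    // accessor needed for the check is modeled here.
    interface Container {
        ContainerState getContainerState();
    }

    /**
     * Removes acknowledged containers from the context, but only those that
     * have reached DONE. Containers that are still running (or unknown IDs)
     * are left untouched. Returns the IDs that were actually removed.
     */
    static List<String> removeCompletedContainers(
            Map<String, Container> context, List<String> ackedIds) {
        List<String> removed = new ArrayList<>();
        for (String id : ackedIds) {
            Container container = context.get(id);
            // No cast to a concrete implementation: the state is read
            // through the interface.
            if (container != null
                    && container.getContainerState() == ContainerState.DONE) {
                context.remove(id);
                removed.add(id);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Container> context = new ConcurrentHashMap<>();
        context.put("container_1", () -> ContainerState.DONE);
        context.put("container_2", () -> ContainerState.RUNNING);

        List<String> removed = removeCompletedContainers(
                context, Arrays.asList("container_1", "container_2"));
        System.out.println("removed=" + removed
                + " remaining=" + context.keySet());
    }
}
```

In this sketch an acknowledged but still-running container stays in the
context, which is the behavior the issue description asks for.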

> Container leak on NM
> --------------------
>
>                 Key: YARN-2825
>                 URL: https://issues.apache.org/jira/browse/YARN-2825
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>            Priority: Critical
>         Attachments: YARN-2825.1.patch, YARN-2825.1.patch
>
>
> Caused by YARN-1372. Thanks [~vinodkv] for pointing this out.
> The problem is that in YARN-1372 we changed the behavior to remove containers 
> from NMContext only after the containers are acknowledged by the AM. But in the 
> {{NodeStatusUpdaterImpl#removeCompletedContainersFromContext}} call, we 
> didn't check whether the container is really completed or not. If the 
> container is still running, we shouldn't remove the container from the 
> context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
