Rohith Sharma K S created YARN-5279:
---------------------------------------

             Summary: Potential Container leak in NM in preemption flow
                 Key: YARN-5279
                 URL: https://issues.apache.org/jira/browse/YARN-5279
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager, resourcemanager
            Reporter: Rohith Sharma K S
            Assignee: Rohith Sharma K S


In the discussion on YARN-4862 
[comment|https://issues.apache.org/jira/browse/YARN-4862?focusedCommentId=15341538&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15341538],
 it was observed that a container can leak in the NodeManager whenever a 
container is preempted by the RM.

Basically, if the NM receives the same containerId in both {{containersToCleanUp}} 
and {{containersToBeRemovedFromNM}} in the same heartbeat, the container is 
never removed from the NMContext. Instead, the NM kills the container listed in 
{{containersToCleanUp}} and sends its status back to the RM again. But the RM 
blindly rejects that status, since the RMContainer has already been removed and 
is null.
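
To make the sequence concrete, here is a toy model of the two heartbeats. It is only a sketch: none of these classes are real Hadoop classes, and the ordering inside the NM (removal list handled first, and only for containers that are already completed) is an assumption made for illustration, not a statement about the actual NodeManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the leak. All names here are hypothetical, not Hadoop APIs. */
class ContainerLeakSketch {

    enum State { RUNNING, COMPLETED }

    /** Hypothetical stand-in for the NM and its NMContext. */
    static class ToyNodeManager {
        final Map<String, State> context = new HashMap<>();

        /** Handles one heartbeat response; returns the statuses to report next. */
        List<String> onHeartbeatResponse(List<String> containersToCleanUp,
                                         List<String> containersToBeRemovedFromNM) {
            // Assumption: a container is dropped from the context only if it
            // has already completed when the removal request arrives.
            for (String id : containersToBeRemovedFromNM) {
                if (context.get(id) == State.COMPLETED) {
                    context.remove(id);
                }
            }
            // Kill the containers the RM asked to clean up and report them.
            List<String> reported = new ArrayList<>();
            for (String id : containersToCleanUp) {
                if (context.get(id) == State.RUNNING) {
                    context.put(id, State.COMPLETED);
                    reported.add(id);
                }
            }
            return reported;
        }
    }

    /** Hypothetical stand-in for the RM; an absent id models a null RMContainer. */
    static class ToyResourceManager {
        final Set<String> liveContainers = new HashSet<>();

        /** Returns the ids to send as containersToBeRemovedFromNM next time. */
        List<String> onCompletedStatuses(List<String> completed) {
            List<String> toRemove = new ArrayList<>();
            for (String id : completed) {
                if (liveContainers.remove(id)) {
                    toRemove.add(id);
                }
                // else: RMContainer is null, so the status is ignored.
            }
            return toRemove;
        }
    }

    /** Runs the scenario from the description; true means the container leaked. */
    static boolean leaks() {
        String id = "container_1";
        ToyNodeManager nm = new ToyNodeManager();
        ToyResourceManager rm = new ToyResourceManager();
        nm.context.put(id, State.RUNNING);
        // The RM has preempted the container, so it no longer tracks it.

        // Heartbeat 1: the same id arrives in both lists.
        List<String> reported =
            nm.onHeartbeatResponse(Arrays.asList(id), Arrays.asList(id));
        // Heartbeat 2: the RM ignores the status and sends no removal.
        List<String> toRemove = rm.onCompletedStatuses(reported);
        nm.onHeartbeatResponse(Collections.<String>emptyList(), toRemove);

        return nm.context.containsKey(id);
    }

    public static void main(String[] args) {
        System.out.println("leaked: " + leaks()); // prints "leaked: true"
    }
}
```

In this model the completed container stays in the NM context forever, because the only removal request arrived while it was still running.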

I think that whenever the RMContainer is null, the RMNode should be informed so 
that it sends {{containersToBeRemovedFromNM}} and the NM removes the container 
from its context.
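
A rough sketch of that idea, on the RM side only. The class and method names are hypothetical and this is not the actual scheduler or RMNode code; it only shows that a status for an unknown container still results in a removal being queued for the next heartbeat response.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy RM that acknowledges statuses for unknown containers. Hypothetical names. */
class UnknownContainerAckSketch {

    // Containers the toy RM still tracks; an absent id models a null RMContainer.
    private final Set<String> liveContainers = new HashSet<>();

    // Ids to send as containersToBeRemovedFromNM in the next heartbeat response.
    private final List<String> pendingRemovals = new ArrayList<>();

    void track(String containerId) {
        liveContainers.add(containerId);
    }

    /** Called when the NM reports a completed container. */
    void onCompletedStatus(String containerId) {
        boolean known = liveContainers.remove(containerId);
        if (!known) {
            // RMContainer is null: previously the status was dropped here.
            // Proposed: still tell the node to forget the container.
            pendingRemovals.add(containerId);
            return;
        }
        // Normal completion path: acknowledge as before.
        pendingRemovals.add(containerId);
    }

    /** Returns and clears the removal list for the next heartbeat response. */
    List<String> drainContainersToBeRemovedFromNM() {
        List<String> out = new ArrayList<>(pendingRemovals);
        pendingRemovals.clear();
        return out;
    }
}
```

With this change the NM would receive the removal request again after the container has completed, so it could drop it from its context on the following heartbeat.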




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
