[ 
https://issues.apache.org/jira/browse/YARN-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5279:
------------------------------------
    Attachment: 0001-YARN-5279.patch

Updated the patch to inform RMNodeImpl that untracked containers should be 
removed from the corresponding NodeManager. In this patch, I reused the event 
type {{RMNodeEventType.FINISHED_CONTAINERS_PULLED_BY_AM}} from the scheduler. 

> Potential Container leak in NM in preemption flow
> -------------------------------------------------
>
>                 Key: YARN-5279
>                 URL: https://issues.apache.org/jira/browse/YARN-5279
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-5279.patch
>
>
> In discussion YARN-4862 
> [comment|https://issues.apache.org/jira/browse/YARN-4862?focusedCommentId=15341538&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15341538],
>  it was observed that there could be a container leak in the NodeManager 
> whenever a container is preempted by the RM.
> Basically, if the NM receives the same containerId in both {{containersToCleanUp}} 
> and {{containersToBeRemovedFromNM}} in the same heartbeat, then the container 
> is never removed from the NMContext. Instead, the NM kills the container listed 
> in containersToCleanup and sends its status back to the RM. But the RM blindly 
> rejects the status, since the RMContainer has already been removed and is null.
> I think that whenever the RMContainer is null, the RMNode should be informed to 
> send {{containersToBeRemovedFromNM}} so that the NM removes the container from 
> its context.
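
The sequence described above can be replayed with a small toy model. This is 
only an illustrative sketch, not the Hadoop code or the attached patch: the 
class ContainerLeakSketch, its fields, and its methods are hypothetical 
stand-ins for the NM context and the RM bookkeeping. It also assumes that a 
removal request only drops containers that have already completed, and that 
the NM processes removals before cleanups within one heartbeat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every type and method here is hypothetical, not a Hadoop class.
class ContainerLeakSketch {
    enum State { RUNNING, COMPLETE }

    // Stand-in for the NM context (containerId -> state).
    final Map<String, State> nmContext = new HashMap<>();
    // Stand-in for the RM's view: ids that still have an RMContainer.
    final Set<String> rmContainers = new HashSet<>();
    // Ids the RM will send as containersToBeRemovedFromNM on the next heartbeat.
    final Set<String> toBeRemovedFromNM = new HashSet<>();
    // When true, a status for an unknown container queues a removal for the node.
    final boolean informNodeOnUnknownContainer;

    ContainerLeakSketch(boolean informNodeOnUnknownContainer) {
        this.informNodeOnUnknownContainer = informNodeOnUnknownContainer;
    }

    // NM side of one heartbeat response.
    // Assumption: removal only applies to completed containers and runs first.
    void nmHandleResponse(List<String> containersToCleanup,
                          List<String> containersToBeRemovedFromNM) {
        for (String id : containersToBeRemovedFromNM) {
            if (nmContext.get(id) == State.COMPLETE) {
                nmContext.remove(id);
            }
        }
        for (String id : containersToCleanup) {
            if (nmContext.containsKey(id)) {
                nmContext.put(id, State.COMPLETE); // models killing the container
            }
        }
    }

    // RM side: a completed-container status arrives from the NM.
    void rmHandleCompletedStatus(String id) {
        if (!rmContainers.contains(id)) {
            // The RMContainer is "null": without the fix the status is just dropped.
            if (informNodeOnUnknownContainer) {
                toBeRemovedFromNM.add(id);
            }
            return;
        }
        // The normal completion path is out of scope for this sketch.
    }

    // Replays the preemption scenario; returns true if the NM still holds the container.
    static boolean leaksAfterPreemption(boolean fix) {
        ContainerLeakSketch s = new ContainerLeakSketch(fix);
        s.nmContext.put("c1", State.RUNNING);
        // The RM has preempted c1 and already dropped its RMContainer,
        // so c1 is deliberately absent from rmContainers.

        // Heartbeat 1: the same id arrives in both lists.
        s.nmHandleResponse(Arrays.asList("c1"), Arrays.asList("c1"));

        // Heartbeat 2: the NM reports c1 as completed, then applies the response.
        s.rmHandleCompletedStatus("c1");
        List<String> removals = new ArrayList<>(s.toBeRemovedFromNM);
        s.toBeRemovedFromNM.clear();
        s.nmHandleResponse(Collections.<String>emptyList(), removals);

        return s.nmContext.containsKey("c1");
    }

    public static void main(String[] args) {
        System.out.println("leak without fix: " + leaksAfterPreemption(false));
        System.out.println("leak with fix:    " + leaksAfterPreemption(true));
    }
}
```

Under these assumptions, the container stays in the toy NM context when the 
RM silently drops the status, and is removed one heartbeat later when the RM 
queues it for removal instead.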



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
