[ https://issues.apache.org/jira/browse/YARN-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe updated YARN-5197:
-----------------------------
Attachment: YARN-5197.003.patch
Thanks for the review, Rohith! I updated the patch to add the GUARANTEED check
in findLostContainers.
> RM leaks containers if running container disappears from node update
> --------------------------------------------------------------------
>
> Key: YARN-5197
> URL: https://issues.apache.org/jira/browse/YARN-5197
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.7.2, 2.6.4
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: YARN-5197.001.patch, YARN-5197.002.patch,
> YARN-5197.003.patch
>
>
> Once a node reports a container running in a status update, the corresponding
> RMNodeImpl will track the container in its launchedContainers map. If the
> node somehow misses sending the completed container status to the RM and the
> container simply disappears from subsequent heartbeats, the container will
> leak in launchedContainers forever and the container completion event will
> not be sent to the scheduler.
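
For readers following along, the leak described above and one way to detect it can be sketched as follows. This is a minimal illustration, not the code in the attached patches: the class LostContainerSketch, its processHeartbeat method, and the use of plain strings as container ids are all hypothetical, and it leaves out the execution-type (GUARANTEED) handling mentioned in the comment. The idea is that any container still tracked as launched, but neither reported running nor reported completed in the latest heartbeat, is treated as lost so that a completion can be surfaced for it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the per-node tracking done by the RM.
// Container ids are plain strings here purely for illustration.
class LostContainerSketch {
    // Containers the node has reported as running at some point.
    private final Set<String> launchedContainers = new HashSet<>();

    // Processes one heartbeat and returns the ids considered lost:
    // tracked as launched, yet absent from both the running and the
    // completed lists of this heartbeat.
    Set<String> processHeartbeat(Set<String> running, Set<String> completed) {
        launchedContainers.addAll(running);
        // Properly reported completions are simply dropped from tracking.
        launchedContainers.removeAll(completed);

        // Whatever is still tracked but no longer reported running vanished
        // without a completion status. Without this step it would stay in
        // launchedContainers forever.
        Set<String> lost = new HashSet<>(launchedContainers);
        lost.removeAll(running);
        launchedContainers.removeAll(lost);
        return lost;
    }

    // Returns a copy of the ids currently tracked as launched.
    Set<String> tracked() {
        return new HashSet<>(launchedContainers);
    }

    public static void main(String[] args) {
        LostContainerSketch node = new LostContainerSketch();
        Set<String> none = new HashSet<>();
        node.processHeartbeat(new HashSet<>(Arrays.asList("c1", "c2")), none);
        // c2 disappears from the next heartbeat with no completion status.
        Set<String> lost = node.processHeartbeat(new HashSet<>(Arrays.asList("c1")), none);
        System.out.println("lost=" + lost + " tracked=" + node.tracked());
    }
}
```

In a real resource manager the lost ids would presumably be turned into completed-container events for the scheduler rather than just returned; that wiring is omitted here.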