[
https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohith Sharma K S updated YARN-4862:
------------------------------------
Attachment: YARN-4862-004.patch
Updating the patch to handle a completed-container leak. The scenario: whenever
the RM does not track a container, RMNodeImpl still adds its containerId to the
completedContainer list. Since the container is not tracked by the RM, the RM
just ignores it and the entry is never removed. This causes a leak in
completedContainer.
I have updated the patch to fix the leak by triggering an event to RMNodeImpl.
This is basically the same issue as YARN-5279, but I would prefer to fix it in
this JIRA itself rather than committing it separately.
As part of the latest patch attached, I have also folded in the patch from
YARN-5279. Regarding the review comments on YARN-5279: I have not created a
different event class and name as suggested there. I have reused the same event
type FINISHED_CONTAINERS_PULLED_BY_AM and its class
RMNodeFinishedContainersPulledByAMEvent, because both events are the same to
RMNodeImpl. Maybe I can rename the existing event type
FINISHED_CONTAINERS_PULLED_BY_AM to CONTAINERS_TO_BE_REMOVED_FROM_NM. Thoughts?
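To make the scenario concrete, here is a minimal, self-contained sketch of the leak and of the fix described above. It is not the code from the patch: the classes NodeSketch and SchedulerSketch, their methods, and the notifyNodeForUntracked flag are hypothetical stand-ins, and container IDs are plain strings.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the leak; not the actual RMNodeImpl code.
class CompletedContainerLeakSketch {

    /** Stand-in for the per-node bookkeeping (RMNodeImpl in this thread). */
    static class NodeSketch {
        final Set<String> completedContainers = new HashSet<>();

        /** The node reports a completed container, so the ID is remembered. */
        void onContainerCompleted(String containerId) {
            completedContainers.add(containerId);
        }

        /** Stand-in for handling a FINISHED_CONTAINERS_PULLED_BY_AM style event. */
        void onContainersToBeRemoved(Set<String> containerIds) {
            completedContainers.removeAll(containerIds);
        }
    }

    /** Stand-in for the RM side that decides whether a container is tracked. */
    static class SchedulerSketch {
        final Set<String> trackedContainers = new HashSet<>();
        final boolean notifyNodeForUntracked; // false = old behaviour (assumed)

        SchedulerSketch(boolean notifyNodeForUntracked) {
            this.notifyNodeForUntracked = notifyNodeForUntracked;
        }

        void handleCompleted(NodeSketch node, String containerId) {
            if (!trackedContainers.remove(containerId)) {
                // Untracked container: previously just ignored, which left
                // the ID in the node's completedContainers set forever.
                if (notifyNodeForUntracked) {
                    node.onContainersToBeRemoved(Collections.singleton(containerId));
                }
                return;
            }
            // Tracked container: the normal path (the AM pulling the finished
            // container later) would clean up; omitted in this sketch.
        }
    }
}
```

With notifyNodeForUntracked set to false the untracked ID stays in completedContainers; with it set to true the node is told to drop it, which is the effect the reused event is meant to have.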
> Handle duplicate completed containers in RMNodeImpl
> ---------------------------------------------------
>
> Key: YARN-4862
> URL: https://issues.apache.org/jira/browse/YARN-4862
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4862.patch, 0002-YARN-4862.patch,
> 0003-YARN-4862.patch, YARN-4862-004.patch
>
>
> As per
> [comment|https://issues.apache.org/jira/browse/YARN-4852?focusedCommentId=15209689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15209689]
> from [~sharadag], there should be a safeguard against duplicate container
> statuses in RMNodeImpl before creating UpdatedContainerInfo.
> Otherwise, in a heavily loaded cluster where event processing gradually slows
> down, if any duplicate containers are sent to the RM (possibly due to a bug in
> the NM), there is a significant impact because RMNodeImpl always creates
> UpdatedContainerInfo for duplicate containers. This results in increased heap
> memory usage and causes problems like YARN-4852.
> This is an optimization for issues like YARN-4852.
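The safeguard asked for in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions: DuplicateCompletedGuardSketch and filterNewlyCompleted are hypothetical names, container IDs are plain strings, and the real RMNodeImpl works with container status objects rather than strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a duplicate guard; not the actual RMNodeImpl code.
class DuplicateCompletedGuardSketch {
    // IDs of containers already reported as completed by this node.
    private final Set<String> completedContainers = new HashSet<>();

    /**
     * Returns only the IDs that have not been seen before, in report order.
     * Only these would go on to produce an UpdatedContainerInfo-like object.
     */
    List<String> filterNewlyCompleted(List<String> reportedCompletedIds) {
        List<String> fresh = new ArrayList<>();
        for (String id : reportedCompletedIds) {
            // Set.add returns false when the ID is already present,
            // so duplicates are dropped here.
            if (completedContainers.add(id)) {
                fresh.add(id);
            }
        }
        return fresh;
    }
}
```

If a node reports c1 and c2 in one heartbeat and then c1, c2 and c3 in the next, the second call returns only c3, so no extra objects are allocated for the repeated statuses.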