[ https://issues.apache.org/jira/browse/YARN-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645136#comment-14645136 ]
sandflee commented on YARN-3987:
--------------------------------

Yes, we set getKeepContainersAcrossApplicationAttempts to true. Thanks for your review.

> am container complete msg ack to NM once RM receive it
> ------------------------------------------------------
>
>                 Key: YARN-3987
>                 URL: https://issues.apache.org/jira/browse/YARN-3987
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: sandflee
>            Assignee: sandflee
>         Attachments: YARN-3987.001.patch
>
>
> In our cluster we set max-am-attempts to a very large number, and
> unfortunately our AM crashed right after launch, leaving too many completed
> containers (AM containers) in the NM. A completed container is removed from
> the NM and the NMStateStore only after its completion has been passed to the
> AM. If the AM cannot be launched, the completed AM containers are never
> cleaned up and may eat up NM heap memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
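
The behaviour described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions: the class and method names below (ToyNodeManager, ToyResourceManager, simulate) are hypothetical and are not YARN classes, and the heartbeat exchange is heavily simplified. It shows how completed AM containers pile up in the NM when they are released only after an AM has pulled them, and how acknowledging them as soon as the RM receives the completion keeps the NM empty.

```python
class ToyResourceManager:
    """Hypothetical RM stand-in: decides which completed containers the NM may drop."""

    def __init__(self, ack_am_containers_on_receipt):
        self.ack_am_containers_on_receipt = ack_am_containers_on_receipt
        self.am_containers = set()   # ids of containers that ran an AM
        self.pulled_by_am = set()    # completions an AM has already fetched

    def register_am_container(self, container_id):
        self.am_containers.add(container_id)

    def node_heartbeat(self, reported_completed):
        """Return the ids the NM is allowed to remove."""
        acked = []
        for cid in reported_completed:
            if cid in self.pulled_by_am:
                acked.append(cid)
            elif self.ack_am_containers_on_receipt and cid in self.am_containers:
                # Behaviour requested in the issue title: ack as soon as RM receives it.
                acked.append(cid)
        return acked


class ToyNodeManager:
    """Hypothetical NM stand-in: keeps completed containers until they are acked."""

    def __init__(self):
        self.completed = {}  # container_id -> exit status

    def container_finished(self, container_id, exit_status):
        self.completed[container_id] = exit_status

    def heartbeat(self, rm):
        for cid in rm.node_heartbeat(list(self.completed)):
            self.completed.pop(cid, None)


def simulate(attempts, ack_am_containers_on_receipt):
    """Crash the AM `attempts` times; return how many completed containers the NM still holds."""
    rm = ToyResourceManager(ack_am_containers_on_receipt)
    nm = ToyNodeManager()
    for attempt in range(attempts):
        cid = f"am_container_{attempt}"
        rm.register_am_container(cid)
        nm.container_finished(cid, exit_status=1)  # AM crashes right after launch
        nm.heartbeat(rm)                           # no AM is alive to pull the completion
    return len(nm.completed)


if __name__ == "__main__":
    print(simulate(1000, ack_am_containers_on_receipt=False))
    print(simulate(1000, ack_am_containers_on_receipt=True))
```

In this model the retained count grows linearly with the number of failed attempts when the acknowledgement depends on the AM, which matches the heap growth reported above for a very large max-am-attempts.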