[ https://issues.apache.org/jira/browse/YARN-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883323#comment-15883323 ]
Jian He commented on YARN-6153:
-------------------------------

[~kyungwan nam], thanks for updating the patch. It looks good to me overall.

I found several places in RMAppAttemptImpl that retrieve the attempt's RMApp like this:
{code}
appAttempt.rmContext.getRMApps().get(
    appAttempt.getAppAttemptId().getApplicationId())
{code}
I think we can change the RMAppAttemptImpl constructor to take the RMApp as a parameter, so that we no longer need the map lookup to trace back to its RMApp. Would you like to make that change?

> keepContainer does not work when AM retry window is set
> -------------------------------------------------------
>
> Key: YARN-6153
> URL: https://issues.apache.org/jira/browse/YARN-6153
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.7.1
> Reporter: kyungwan nam
> Attachments: YARN-6153.001.patch, YARN-6153.002.patch,
> YARN-6153.003.patch, YARN-6153.004.patch, YARN-6153.005.patch
>
>
> yarn.resourcemanager.am.max-attempts is configured to 2 in my cluster.
> I submitted a YARN application (a Slider app) with keepContainers=true and
> attemptFailuresValidityInterval=300000.
> It worked properly when the AM failed the first time:
> all containers launched by the previous AM were resynced with the new AM (attempt2)
> without being killed.
> After 10 minutes, I expected the AM failure count to have been reset by
> attemptFailuresValidityInterval (5 minutes).
> However, all containers were killed when the AM failed a second time. (The new AM,
> attempt3, was launched properly.)
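
The constructor change suggested in the comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `App`, `Context`, `AttemptWithLookup`, and `AttemptWithReference` are hypothetical, simplified stand-ins for RMApp, RMContext, and RMAppAttemptImpl, not the actual YARN classes or their signatures. It contrasts looking the application up in a map on every use with holding a direct reference passed in at construction time.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class AttemptSketch {

    // Hypothetical stand-in for RMApp.
    static final class App {
        final String appId;
        App(String appId) { this.appId = appId; }
    }

    // Hypothetical stand-in for RMContext: owns the appId -> app map.
    static final class Context {
        final Map<String, App> apps = new ConcurrentHashMap<>();
    }

    // Current style: the attempt traces back to its app through the map.
    static final class AttemptWithLookup {
        private final Context context;
        private final String appId;

        AttemptWithLookup(Context context, String appId) {
            this.context = context;
            this.appId = appId;
        }

        // Returns null if the app is no longer in the map.
        App app() { return context.apps.get(appId); }
    }

    // Suggested style: the app is passed to the constructor once.
    static final class AttemptWithReference {
        private final App app;

        AttemptWithReference(App app) {
            this.app = Objects.requireNonNull(app, "app");
        }

        App app() { return app; }
    }

    public static void main(String[] args) {
        Context context = new Context();
        App app = new App("application_0001");
        context.apps.put(app.appId, app);

        AttemptWithLookup lookup = new AttemptWithLookup(context, app.appId);
        AttemptWithReference direct = new AttemptWithReference(app);

        // Both styles resolve to the same app while it is registered.
        System.out.println(lookup.app() == direct.app());

        // After the map entry is removed, only the direct reference survives.
        context.apps.remove(app.appId);
        System.out.println(lookup.app());
        System.out.println(direct.app().appId);
    }
}
```

In this sketch the direct reference removes both the repeated lookup and the possible null result when the map entry is gone. Whether that trade-off suits the real RMAppAttemptImpl depends on how the attempt and application lifecycles interact in the ResourceManager.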