[ https://issues.apache.org/jira/browse/YARN-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chang Li updated YARN-4334:
---------------------------
    Attachment: YARN-4334.3.patch

> Ability to avoid ResourceManager recovery if state store is "too old"
> ---------------------------------------------------------------------
>
>                 Key: YARN-4334
>                 URL: https://issues.apache.org/jira/browse/YARN-4334
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Jason Lowe
>            Assignee: Chang Li
>         Attachments: YARN-4334.2.patch, YARN-4334.3.patch, YARN-4334.patch,
> YARN-4334.wip.2.patch, YARN-4334.wip.3.patch, YARN-4334.wip.4.patch,
> YARN-4334.wip.patch
>
>
> There are times when a ResourceManager has been down long enough that
> ApplicationMasters and potentially external client-side monitoring mechanisms
> have given up completely. If the ResourceManager starts back up and tries to
> recover we can get into situations where the RM launches new application
> attempts for the AMs that gave up, but then the client _also_ launches
> another instance of the app because it assumed everything was dead.
> It would be nice if the RM could be optionally configured to avoid trying to
> recover if the state store was "too old." The RM would come up without any
> applications recovered, but we would avoid a double-submission situation.
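For illustration only, the "too old" decision described in the issue could be sketched roughly as below. This is not taken from any of the attached patches: the class name StateStoreAgeCheck, the method shouldRecover, and the idea of comparing a last-update timestamp against a configured maximum age are all assumptions made for this example. How the RM would actually obtain the state store's last-update time, and what the configuration key would be called, is left open here.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical helper, not part of YARN: decides whether the RM should
 * attempt recovery based on how stale the state store appears to be.
 */
public class StateStoreAgeCheck {

    /**
     * Returns true if recovery should be attempted.
     *
     * @param lastUpdated assumed timestamp of the last state store update
     * @param now         current time
     * @param maxAge      assumed configured limit; zero or negative disables
     *                    the check, so recovery is always attempted
     */
    public static boolean shouldRecover(Instant lastUpdated, Instant now, Duration maxAge) {
        if (maxAge.isZero() || maxAge.isNegative()) {
            // Check disabled: behave like an RM without this option.
            return true;
        }
        Duration age = Duration.between(lastUpdated, now);
        // Recover only while the store is no older than the limit.
        return age.compareTo(maxAge) <= 0;
    }
}
```

Under these assumptions, a store last updated 10 minutes ago with a one-hour limit would be recovered, while one last updated six hours ago would be skipped and the RM would come up with no applications recovered. Treating a non-positive limit as "disabled" keeps the behaviour optional, as the issue requests.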