[ 
https://issues.apache.org/jira/browse/YARN-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4334:
---------------------------
    Attachment: YARN-4334.3.patch

> Ability to avoid ResourceManager recovery if state store is "too old"
> ---------------------------------------------------------------------
>
>                 Key: YARN-4334
>                 URL: https://issues.apache.org/jira/browse/YARN-4334
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Jason Lowe
>            Assignee: Chang Li
>         Attachments: YARN-4334.2.patch, YARN-4334.3.patch, YARN-4334.patch, 
> YARN-4334.wip.2.patch, YARN-4334.wip.3.patch, YARN-4334.wip.4.patch, 
> YARN-4334.wip.patch
>
>
> There are times when a ResourceManager has been down long enough that 
> ApplicationMasters and potentially external client-side monitoring mechanisms 
> have given up completely.  If the ResourceManager starts back up and tries to 
> recover we can get into situations where the RM launches new application 
> attempts for the AMs that gave up, but then the client _also_ launches 
> another instance of the app because it assumed everything was dead.
> It would be nice if the RM could be optionally configured to avoid trying to 
> recover if the state store was "too old."  The RM would come up without any 
> applications recovered, but we would avoid a double-submission situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to