[ https://issues.apache.org/jira/browse/YARN-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anubhav Dhoot reassigned YARN-1373:
-----------------------------------
Assignee: Anubhav Dhoot
> Transition RMApp and RMAppAttempt state to RUNNING after restart for
> recovered running apps
> -------------------------------------------------------------------------------------------
>
> Key: YARN-1373
> URL: https://issues.apache.org/jira/browse/YARN-1373
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Bikas Saha
> Assignee: Anubhav Dhoot
>
> Currently the RM moves recovered app attempts to a terminal recovered
> state and starts a new attempt. Instead, it will have to transition the last
> attempt to a running state so that it can proceed as normal once the
> running attempt has resynced with the ApplicationMasterService (YARN-1365 and
> YARN-1366). If the RM had started the application container before dying, then
> the AM would be up and trying to contact the RM. Alternatively, the RM may have
> died before launching the container. In this case, the RM should wait for the AM
> liveness period and then issue a kill container for the stored master container.
> It should transition this attempt to some RECOVER_ERROR state and proceed to
> start a new attempt.
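
The recovery flow described above could be sketched roughly as follows. This is only an illustration of the proposed behaviour, not the RM's actual code: the class RecoveredAttemptSketch, its methods, and the explicit clock values are all hypothetical, and RECOVER_ERROR is the placeholder state name used in the description. The sketch assumes the recovered attempt is put in RUNNING right away and is moved to RECOVER_ERROR only if the AM has not resynced by the end of the liveness period.

```java
// Hypothetical sketch of the proposed recovery behaviour; none of these
// names come from the YARN code base.
final class RecoveredAttemptSketch {
    enum State { RUNNING, RECOVER_ERROR }

    private final long recoveredAtMs;     // when the restarted RM recovered the attempt
    private final long livenessPeriodMs;  // how long to wait for the AM to make contact
    private State state = State.RUNNING;  // recovered running attempts start in RUNNING
    private boolean amResynced = false;
    private boolean killIssued = false;

    RecoveredAttemptSketch(long recoveredAtMs, long livenessPeriodMs) {
        this.recoveredAtMs = recoveredAtMs;
        this.livenessPeriodMs = livenessPeriodMs;
    }

    /** The AM was alive and has resynced with the RM; the attempt proceeds as normal. */
    void onAmResync() {
        if (state == State.RUNNING) {
            amResynced = true;
        }
    }

    /**
     * Periodic liveness check. Returns true exactly once, at the point where the
     * attempt is given up and a new attempt should be started.
     */
    boolean onLivenessCheck(long nowMs) {
        boolean expired = nowMs - recoveredAtMs >= livenessPeriodMs;
        if (state == State.RUNNING && !amResynced && expired) {
            killIssued = true;            // stands in for killing the stored master container
            state = State.RECOVER_ERROR;
            return true;
        }
        return false;
    }

    State getState() { return state; }
    boolean isKillIssued() { return killIssued; }
}
```

In this sketch, an attempt whose AM calls onAmResync() before the liveness period ends stays in RUNNING indefinitely, while one that never hears from its AM is killed and reported for replacement on the first check after the period has elapsed.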
--
This message was sent by Atlassian JIRA
(v6.2#6252)