[ https://issues.apache.org/jira/browse/YARN-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511501#comment-13511501 ]
Bikas Saha commented on YARN-128:
---------------------------------
Yes, we need to. Many things, such as failure tracking of AM attempts, job
history, and log and debug information, are tied to attempts, so we cannot
forget them.
Also, restarting everything is just the first step. We want to move towards a
work-preserving restart (see the doc on the jira), and the current approach
lays the groundwork for it.
> Resurrect RM Restart
> ---------------------
>
> Key: YARN-128
> URL: https://issues.apache.org/jira/browse/YARN-128
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.0.0-alpha
> Reporter: Arun C Murthy
> Assignee: Bikas Saha
> Attachments: MR-4343.1.patch, restart-12-11-zkstore.patch,
> restart-fs-store-11-17.patch, restart-zk-store-11-17.patch,
> RM-recovery-initial-thoughts.txt, RMRestartPhase1.pdf,
> YARN-128.full-code.3.patch, YARN-128.full-code-4.patch,
> YARN-128.full-code.5.patch, YARN-128.new-code-added.3.patch,
> YARN-128.new-code-added-4.patch, YARN-128.old-code-removed.3.patch,
> YARN-128.old-code-removed.4.patch, YARN-128.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM
> refactor.