[ 
https://issues.apache.org/jira/browse/YARN-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511501#comment-13511501
 ] 

Bikas Saha commented on YARN-128:
---------------------------------

Yes, we need to. Many things, such as failure tracking of AM attempts, job 
history, and log and debug information, are tied to attempts, so we cannot 
forget them.
Also, restarting everything is just the first step. We want to move towards a 
work-preserving restart (see the doc on the jira), and the current approach 
lays the groundwork for it.
                
> Resurrect RM Restart 
> ---------------------
>
>                 Key: YARN-128
>                 URL: https://issues.apache.org/jira/browse/YARN-128
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.0.0-alpha
>            Reporter: Arun C Murthy
>            Assignee: Bikas Saha
>         Attachments: MR-4343.1.patch, restart-12-11-zkstore.patch, 
> restart-fs-store-11-17.patch, restart-zk-store-11-17.patch, 
> RM-recovery-initial-thoughts.txt, RMRestartPhase1.pdf, 
> YARN-128.full-code.3.patch, YARN-128.full-code-4.patch, 
> YARN-128.full-code.5.patch, YARN-128.new-code-added.3.patch, 
> YARN-128.new-code-added-4.patch, YARN-128.old-code-removed.3.patch, 
> YARN-128.old-code-removed.4.patch, YARN-128.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM 
> refactor.

