[ https://issues.apache.org/jira/browse/YARN-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505615#comment-13505615 ]
Arinto Murdopo commented on YARN-128:
-------------------------------------
Tested YARN-128.full-code.5.patch using the ZooKeeper store, and the result is
positive: the ResourceManager was resurrected properly after we killed it.
Experiment overview:
- ZK settings: 1 ZK-Server consisting of 3 different nodes
- HDFS ran in a single-node setting. YARN and HDFS were executed on the same node.
- Executed the bbp and pi examples from the generated Hadoop distribution (we built
and packaged the trunk and patch code)
- Killed the ResourceManager process while bbp or pi was executing (using the Linux
kill command) and started a new RM 3 seconds after we killed it.
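The kill-and-restart step above can be sketched roughly as follows. This is only an illustration, not the script we actually ran: the helper name kill_and_restart, the choice of SIGTERM (the default signal of the Linux kill command), and the restart command shown in the usage comment are all assumptions.

```python
import os
import signal
import subprocess
import time


def kill_and_restart(pid, restart_cmd, delay_s=3.0,
                     kill=os.kill, sleep=time.sleep, run=subprocess.run):
    """Kill a running process, wait delay_s seconds, then start a new one.

    kill, sleep and run are injectable so the sequence can be dry-run
    without touching a real ResourceManager.
    """
    # Assumption: plain `kill <pid>` was used, which sends SIGTERM by default.
    kill(pid, signal.SIGTERM)
    # The experiment waited 3 seconds before starting the new RM.
    sleep(delay_s)
    # Assumption: restart_cmd is whatever command starts the RM on the test node.
    run(restart_cmd, check=True)


# Hypothetical usage (the PID and command are placeholders, not from the test run):
# kill_and_restart(12345, ["yarn-daemon.sh", "start", "resourcemanager"])
```

Passing fake kill, sleep and run callables makes it possible to check the order of the steps before pointing the helper at a live cluster.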
> Resurrect RM Restart
> ---------------------
>
> Key: YARN-128
> URL: https://issues.apache.org/jira/browse/YARN-128
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.0.0-alpha
> Reporter: Arun C Murthy
> Assignee: Bikas Saha
> Attachments: MR-4343.1.patch, restart-12-11-zkstore.patch,
> restart-fs-store-11-17.patch, restart-zk-store-11-17.patch,
> RM-recovery-initial-thoughts.txt, RMRestartPhase1.pdf,
> YARN-128.full-code.3.patch, YARN-128.full-code-4.patch,
> YARN-128.full-code.5.patch, YARN-128.new-code-added.3.patch,
> YARN-128.new-code-added-4.patch, YARN-128.old-code-removed.3.patch,
> YARN-128.old-code-removed.4.patch, YARN-128.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM
> refactor.