[
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985148#comment-13985148
]
Jian He commented on YARN-556:
------------------------------
Hi Anubhav,
I looked at the prototype patch. Regarding the approach, it’s better to have a
scheduler-agnostic recovery mechanism with no or minimal scheduler-specific
changes, instead of implementing recovery separately for each scheduler.
YARN-1368 can be renamed to accommodate the necessary common changes for all
schedulers. Also, adding the cluster timestamp to the container ID doesn’t seem
right, and it would also break compatibility.
> RM Restart phase 2 - Work preserving restart
> --------------------------------------------
>
> Key: YARN-556
> URL: https://issues.apache.org/jira/browse/YARN-556
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: resourcemanager
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: Work Preserving RM Restart.pdf,
> WorkPreservingRestartPrototype.001.patch
>
>
> YARN-128 covered storing the state needed for the RM to recover critical
> information. This umbrella jira will track changes needed to recover the
> running state of the cluster so that work can be preserved across RM restarts.
--
This message was sent by Atlassian JIRA
(v6.2#6252)