[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985148#comment-13985148 ]
Jian He commented on YARN-556:
------------------------------

Hi Anubhav,

Looked at the prototype patch. Regarding the approach, it’s better to have a scheduler-agnostic recovery mechanism with no or minimal scheduler-specific changes, instead of implementing recovery separately for each scheduler. YARN-1368 can be renamed to accommodate the necessary common changes for all schedulers.

Also, adding the cluster timestamp to the container Id doesn’t seem right, and that’ll also break compatibility.

> RM Restart phase 2 - Work preserving restart
> --------------------------------------------
>
>                 Key: YARN-556
>                 URL: https://issues.apache.org/jira/browse/YARN-556
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: Work Preserving RM Restart.pdf, WorkPreservingRestartPrototype.001.patch
>
>
> YARN-128 covered storing the state needed for the RM to recover critical information. This umbrella jira will track changes needed to recover the running state of the cluster so that work can be preserved across RM restarts.

--
This message was sent by Atlassian JIRA
(v6.2#6252)