[
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun Suresh updated YARN-5620:
------------------------------
Attachment: YARN-5620.001.patch
Attaching an initial patch based on some offline ideas from [~jianhe],
[~vinodkv], etc.
I haven't included the API changes in this patch. I have just added
{{upgradeContainer}} and {{commitUpgrade}} methods to
{{ContainerManagerImpl}} to test the end-to-end flow via test cases.
The patch assumes the following:
* The container is restarted only after ALL the required resources are
localized.
* If the relaunch of the container with the new bits fails, the Container will
be rolled back.
* Rollback involves reverting to the old launch Context and restarting.
* It is up to the AM to call {{commitUpgrade}} once the container has
completed the upgrade, so that if the Container fails afterwards, it is not
rolled back. This is required because, if the container fails for some reason
after the upgrade, there is no way to tell whether the failure was caused by
the upgrade or by something else.
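To make the assumptions above concrete, here is a rough, standalone Java sketch of the upgrade / rollback / commit flow. It is not code from the patch and does not use any NodeManager classes: {{UpgradeSketch}}, {{LaunchContext}}, {{Launcher}}, {{Container}}, {{onFailure}} and the event strings are all hypothetical names made up for illustration. The only things borrowed from the description are the ideas that relaunch waits for localization, that a failed relaunch reverts to the old launch context and restarts, and that a commit discards the old context so later failures are not rolled back.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the flow described above; NOT NodeManager code. */
public class UpgradeSketch {

    /** Hypothetical stand-in for a launch context: only a label for the bits. */
    public static final class LaunchContext {
        public final String bits;
        public LaunchContext(String bits) { this.bits = bits; }
    }

    /** Hypothetical launcher; returns false when the process fails to start. */
    public interface Launcher {
        boolean launch(LaunchContext ctx);
    }

    public static final class Container {
        private LaunchContext current;
        private LaunchContext previous; // kept only while an upgrade is uncommitted
        private final Launcher launcher;
        public final List<String> events = new ArrayList<>();

        public Container(LaunchContext initial, Launcher launcher) {
            this.current = initial;
            this.launcher = launcher;
        }

        /** Relaunch with the new bits, but only once everything is localized. */
        public void upgrade(LaunchContext next, boolean allResourcesLocalized) {
            if (!allResourcesLocalized) {
                events.add("WAITING_FOR_LOCALIZATION");
                return;
            }
            previous = current;
            current = next;
            events.add("RELAUNCH " + next.bits);
            if (!launcher.launch(next)) {
                rollback(); // relaunch with the new bits failed
            }
        }

        /** Called by the AM: drop the old context so no rollback can happen. */
        public void commitUpgrade() {
            previous = null;
            events.add("COMMIT");
        }

        /** A failure after relaunch: roll back only if the upgrade is uncommitted. */
        public void onFailure() {
            if (previous != null) {
                rollback();
            } else {
                events.add("FAILED");
            }
        }

        /** Revert to the old launch context and restart. */
        private void rollback() {
            current = previous;
            previous = null;
            events.add("ROLLBACK " + current.bits);
            launcher.launch(current);
        }

        public String currentBits() { return current.bits; }
    }

    public static void main(String[] args) {
        // Launcher that refuses to start the "v2" bits, forcing a rollback.
        Launcher rejectsV2 = ctx -> !ctx.bits.equals("v2");
        Container c = new Container(new LaunchContext("v1"), rejectsV2);
        c.upgrade(new LaunchContext("v2"), true);
        System.out.println(c.events + " -> running " + c.currentBits());
    }
}
```

In this toy model the old context is the only rollback state, so committing amounts to forgetting it. That is one simple way to express why the AM has to signal the commit explicitly: without it, the container cannot tell an upgrade-induced failure from any other failure.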
> Core changes in NodeManager to support upgrade and rollback of Containers
> -------------------------------------------------------------------------
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to
> support upgrading a running container with a new {{ContainerLaunchContext}},
> as well as the ability to roll back the upgrade if the container is not able
> to restart using the new launch context.