[
https://issues.apache.org/jira/browse/YARN-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376998#comment-16376998
]
Shane Kumpf commented on YARN-7973:
-----------------------------------
I think we have a couple of options:
# Restore the previous behavior. Remove the container prior to relaunch and
launch a new container with the same name.
# Use {{docker start}} to try to start the existing Docker container.
IMO, #2 is the more appropriate fix given the intent of {{ContainerRelaunch}}.
This has the added benefit of leaving the root filesystem in the container
intact, which would enable the application to recover its data during relaunch.
I've started on a patch to handle this and will take ownership.
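
To make the two options concrete, here is a rough sketch of the Docker CLI
calls each one implies. This is not the patch itself: {{relaunch_commands}},
{{container_name}}, {{image}}, and {{reuse_existing}} are hypothetical names
used only for illustration, and the real {{docker run}} invocation built by
YARN carries many more arguments than shown.

```python
from typing import List


def relaunch_commands(container_name: str, image: str,
                      reuse_existing: bool) -> List[List[str]]:
    """Return the Docker CLI invocations for a relaunch.

    Hypothetical helper for illustration only; not part of YARN.
    """
    if reuse_existing:
        # Option 2: start the existing (stopped) container again.
        # Its root filesystem is left as the previous attempt wrote it.
        return [["docker", "start", container_name]]

    # Option 1: remove the old container first, then create a new one
    # with the same name. Anything written to the old root filesystem is lost.
    return [
        ["docker", "rm", container_name],
        ["docker", "run", "-d", "--name", container_name, image],
    ]
```

With {{reuse_existing=True}} the name conflict never arises, because no new
container is created under the reused container ID.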
> Support ContainerRelaunch for Docker containers
> -----------------------------------------------
>
> Key: YARN-7973
> URL: https://issues.apache.org/jira/browse/YARN-7973
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Shane Kumpf
> Priority: Major
>
> Prior to YARN-5366, {{container-executor}} would remove the Docker container
> when it exited. The removal is now handled by the
> {{DockerLinuxContainerRuntime}}. {{ContainerRelaunch}} is intended to reuse
> the workdir from the previous attempt, and does not call {{cleanupContainer}}
> prior to {{launchContainer}}. The container ID is reused as well. As a
> result, the previous Docker container still exists, resulting in an error
> from Docker indicating that a container by that name already exists.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)