[
https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254598#comment-16254598
]
Eric Yang commented on YARN-5366:
---------------------------------
{{--rm}} will remove the container when docker is restarted. If a system admin
has to upgrade docker and this accidentally deletes an end user's application,
the consequences would be severe. There are gaps between the YARN mode of
operation and the docker mode of operation. Let's see if we can support the
additional states in a YARN application. This can help guide us in mapping
them correctly to docker commands.
# Application submitted - Metadata about the existence of the application is
persisted.
# Application in queue - Application is waiting for available resources.
# Application launched - Container is initialized and started.
# Application stop - Container is stopped.
# Application flex - Container start or container stop is invoked.
# Application destroy - Containers are removed.
The key difference between docker and YARN is that YARN applications don't
have long-term accumulated state, whereas a docker container is likely to be
reused until it is decommissioned. For now, we persist a yarnfile in HDFS to
represent the state and configuration of the application by using slider code.
Application flex and destroy are new operations that were introduced to mimic
docker's stateful container interactions. Can we use the new flex and destroy
operations to trigger docker commands to perform clean up? Currently the
answer is no, because the YARN container ID is hardwired to the Docker
container name. We are forcing the docker container to work more like a YARN
container, whose life is short. It will disappear as soon as the job is
completed, failed or killed.
If we change the docker container name from the YARN container ID to
application name + YARN container ID, this will allow us to reuse the docker
container without clean up. This enables us to suspend an application and
resume it later. The application destroy command can invoke {{docker rm -f}}
to clean up the occupied resources.
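A minimal sketch of the proposed naming scheme, for illustration only. The
helper names, the underscore separator, and the name validation pattern are
assumptions and are not taken from any patch on this issue; the allowed
character set should be checked against the docker version in use.

```python
import re

# Assumption: docker container names must match this pattern.
# Verify against the docker version actually deployed.
_DOCKER_NAME = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_.-]+$")


def docker_container_name(app_name, yarn_container_id, sep="_"):
    """Build a docker container name from application name + YARN container ID.

    The separator is a hypothetical choice; the comment above only says
    "application name + YARN container ID".
    """
    name = f"{app_name}{sep}{yarn_container_id}"
    if not _DOCKER_NAME.match(name):
        raise ValueError(f"not a usable docker container name: {name!r}")
    return name


def destroy_command(app_name, yarn_container_id):
    """Return the argv for the clean up step of application destroy."""
    name = docker_container_name(app_name, yarn_container_id)
    return ["docker", "rm", "-f", name]
```

With a name derived this way, stop and start can address the same docker
container across a suspend and resume, and only destroy removes it.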
If we agree on mapping the gaps, we can try the following:
Container initialization:
{{docker pull}}
Application start/flex -> container start:
{{docker run}} or {{docker rename}} + {{docker start}} + attach. Run docker in
the foreground, and only monitor the liveness of the child process.
Application stop -> container stop:
{{docker stop}}
Application destroy -> container cleanup:
{{docker rm -f}}
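The mapping above could be sketched roughly as follows. This only builds the
docker command lines and does not execute them. The function name, the
operation strings, and the {{exists}} / {{old_name}} parameters are
hypothetical, and treating flex the same as start follows the mapping above
rather than any existing YARN code.

```python
def docker_commands(operation, name, image=None, exists=False, old_name=None):
    """Map a YARN application operation to a list of docker argv lists.

    operation: one of "init", "start", "flex", "stop", "destroy"
               (hypothetical labels for the steps in the mapping).
    exists:    True if a docker container for this application is already
               present and can be reused instead of created.
    old_name:  previous docker container name, if it must be renamed first.
    """
    if operation == "init":
        return [["docker", "pull", image]]
    if operation in ("start", "flex"):
        if not exists:
            # No -d and no --rm: run in the foreground and keep the
            # container around after it exits.
            return [["docker", "run", "--name", name, image]]
        commands = []
        if old_name and old_name != name:
            commands.append(["docker", "rename", old_name, name])
        # Attach so the caller can monitor the child process liveness.
        commands.append(["docker", "start", "--attach", name])
        return commands
    if operation == "stop":
        return [["docker", "stop", name]]
    if operation == "destroy":
        return [["docker", "rm", "-f", name]]
    raise ValueError(f"unknown operation: {operation!r}")
```

For example, {{docker_commands("start", "app_c2", exists=True, old_name="app_c1")}}
yields a rename followed by an attached start, while the same call with
{{exists=False}} and an image yields a single foreground {{docker run}}.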
One downside of mapping YARN to behave more like docker is that the docker
container temp space may run out, because too many suspended applications
hold on to temp space.
> Improve handling of the Docker container life cycle
> ---------------------------------------------------
>
> Key: YARN-5366
> URL: https://issues.apache.org/jira/browse/YARN-5366
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Shane Kumpf
> Assignee: Shane Kumpf
> Labels: oct16-medium
> Attachments: YARN-5366.001.patch, YARN-5366.002.patch,
> YARN-5366.003.patch, YARN-5366.004.patch, YARN-5366.005.patch,
> YARN-5366.006.patch
>
>
> There are several paths that need to be improved with regard to the Docker
> container lifecycle when running Docker containers on YARN.
> 1) Provide the ability to keep a container on the NodeManager for a set
> period of time for debugging purposes.
> 2) Support sending signals to the process in the container to allow for
> triggering stack traces, heap dumps, etc.
> 3) Support for Docker's live restore, which means moving away from the use of
> {{docker wait}}. (YARN-5818)
> 4) Improve the resiliency of liveliness checks (kill -0) by adding retries.
> 5) Improve the resiliency of container removal by adding retries.
> 6) Only attempt to stop, kill, and remove containers if the current container
> state allows for it.
> 7) Better handling of short lived containers when the container is stopped
> before the PID can be retrieved. (YARN-6305)
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)