[
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241924#comment-15241924
]
Shane Kumpf commented on YARN-4759:
-----------------------------------
After considering the options for ensuring a graceful stop of processes that
require special signal handling, I believe this shouldn't be left up to
signalContainer or even YARN. Users whose containers have special signal
handling needs should understand how docker manages signaling on docker stop.
The stop signal can be specified via docker run (--stop-signal) or the
Dockerfile (STOPSIGNAL), and users should do so if they require it.
Given the above, the approach I've taken is as follows:
1) For container liveness checks using the null signal, run kill -0 on the
container's PID 1 from the host. We already get the appropriate PID via docker
inspect in container-executor. This is the same as how
DefaultLinuxContainerRuntime handles liveness checks.
2) For any other signal, call docker stop on the docker container.
If a network container is requested, docker stop will be called on it as well.
I'm working on a patch that does the above.
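To make the dispatch in 1) and 2) concrete, here is a rough sketch in Python.
It is only an illustration of the decision logic, not the patch itself: the
function name plan_signal_commands and its parameters are hypothetical, and
it only builds the command lines rather than running them.

```python
from typing import List, Optional


def plan_signal_commands(
    sig: int,
    host_pid: int,
    container: str,
    network_container: Optional[str] = None,
) -> List[List[str]]:
    """Return the commands that would be run for a given signal.

    Hypothetical helper: host_pid is assumed to be the host-side PID of the
    container's PID 1, as already obtained via docker inspect.
    """
    if sig == 0:
        # Null signal: liveness check only, done from the host with kill -0.
        return [["kill", "-0", str(host_pid)]]

    # Any other signal: let docker stop deliver the container's stop signal.
    commands = [["docker", "stop", container]]
    if network_container is not None:
        # A requested network container is stopped as well.
        commands.append(["docker", "stop", network_container])
    return commands
```

For example, plan_signal_commands(0, 1234, "c1") yields a single kill -0
command, while any non-zero signal yields docker stop for the container and,
if one was requested, for the network container.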
> Revisit signalContainer() for docker containers
> -----------------------------------------------
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Sidharta Seethana
> Assignee: Shane Kumpf
>
> The current signal handling (in the DockerContainerRuntime) needs to be
> revisited for docker containers. For example, container reacquisition on NM
> restart might not work, depending on which user the process in the container
> runs as.