[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370847#comment-15370847
 ] 

Shane Kumpf commented on YARN-4759:
-----------------------------------

I've started working on this again and have a patch ready based on the logic 
above.

While the patch properly reacquires containers on NM restart, exceptions 
occur when attempting to "docker stop" the container, because 
container-executor#launch_docker_container_as_user removes the container once 
it completes (docker rm container_id). Removal of the container should be 
configurable so that users can debug a container that fails to launch or to 
produce the desired outcome, but changing the function signature has 
consequences elsewhere that need to be considered. I am currently researching 
the options to find the least impactful one.
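
A rough sketch of what configurable removal could look like, written in 
Python for brevity rather than the C used by container-executor. The function 
name and the keep_for_debugging flag are hypothetical and are not part of the 
patch or of YARN; only the "docker rm" command itself comes from the 
description above.

```python
from typing import List


def cleanup_commands(container_id: str,
                     keep_for_debugging: bool = False) -> List[List[str]]:
    """Return the docker commands to run after the container completes.

    keep_for_debugging is a hypothetical setting: when True, the container
    is left in place so that it can still be inspected (or stopped) later.
    """
    commands: List[List[str]] = []
    if not keep_for_debugging:
        # Current behaviour as described above: remove the container
        # as soon as it has completed.
        commands.append(["docker", "rm", container_id])
    return commands


if __name__ == "__main__":
    print(cleanup_commands("container_id"))
    print(cleanup_commands("container_id", keep_for_debugging=True))
```

With the flag left at its default the behaviour is unchanged, which is one 
way to limit the impact on existing callers.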

> Revisit signalContainer() for docker containers
> -----------------------------------------------
>
>                 Key: YARN-4759
>                 URL: https://issues.apache.org/jira/browse/YARN-4759
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Shane Kumpf
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

