[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341427#comment-14341427
 ] 

Chun Chen commented on YARN-3080:
---------------------------------

[~ashahab], I think we can fix this simply by writing the pid of the session 
script's bash process to the pidFile instead. Since docker run blocks until the 
container exits, the session script's bash process exits as soon as the 
container does. For signalContainer, we can use 
docker kill --signal="SIGNAL" containerId instead.
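
The idea above could be sketched roughly as follows. This is only an illustration, not the contents of any attached patch: the helper names write_session_pid and signal_container, the DOCKER_BIN variable, and the PID_FILE and CONTAINER_ID placeholders are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, DOCKER_BIN, PID_FILE and CONTAINER_ID are
# hypothetical and do not come from the YARN-3080 patches.

# Record the pid of this session script (not the container's pid).
# Write to a temp file first, then rename, so readers never see a partial file.
write_session_pid() {
  local pid_file="$1"
  echo "$$" > "${pid_file}.tmp"
  mv -f "${pid_file}.tmp" "${pid_file}"
}

# Deliver a signal to the container through the Docker CLI rather than
# by signalling a host pid directly.
signal_container() {
  local signal="$1" container_id="$2"
  "${DOCKER_BIN:-/usr/bin/docker}" kill --signal="${signal}" "${container_id}"
}

# Intended order inside the session script (left commented out so that
# loading this sketch has no side effects):
#   write_session_pid "$PID_FILE"
#   /usr/bin/docker run --rm --name "$CONTAINER_ID" <image> <command>
# docker run blocks here, so this script's pid stays alive for as long as
# the container runs and disappears once the container exits.
```

With this ordering the pidFile no longer depends on docker inspect being called against a container that has not been started yet.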

> The DockerContainerExecutor could not write the right pid to container pidFile
> ------------------------------------------------------------------------------
>
>                 Key: YARN-3080
>                 URL: https://issues.apache.org/jira/browse/YARN-3080
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Beckham007
>            Assignee: Abin Shahab
>         Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
> YARN-3080.patch
>
>
> The generated docker_container_executor_session.sh looks like this:
> {quote}
> #!/usr/bin/env bash
> echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_000002` > \
>   /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp
> /bin/mv -f \
>   /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp \
>   /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid
> /usr/bin/docker run --rm --name container_1421723685222_0008_01_000002 \
>   -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 \
>   -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin \
>   -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 \
>   -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_000002 \
>   --memory=32M --cpu-shares=1024 \
>   -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002 \
>   -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002 \
>   -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 \
>   bash "/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002/launch_container.sh"
> {quote}
> The DockerContainerExecutor runs docker inspect before docker run, so docker 
> inspect cannot return the right pid for the container. As a result, 
> signalContainer() and NM restart fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
