Yesha Vora created YARN-8231:
--------------------------------

             Summary: Dshell application fails when one of the docker container gets killed
                 Key: YARN-8231
                 URL: https://issues.apache.org/jira/browse/YARN-8231
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-native-services
            Reporter: Yesha Vora


1) Launch the distributed shell (dshell) application:
{code}
yarn jar hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command "sleep 300" \
  -num_containers 2 \
  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos/httpd-24-centos7:latest \
  -keep_containers_across_application_attempts \
  -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell-*.jar
{code}
2) Kill container_1524681858728_0012_01_000002

Expected behavior:
The application should start a new container instance and finish successfully.

Actual behavior:
The application failed as soon as the container was killed.
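
To make the gap between expected and actual behavior concrete, here is a small Python model of the container accounting. It is only a sketch and is not taken from the distributedshell ApplicationMaster source: `Tally`, `run`, and the `retry_failed` flag are hypothetical names, and the model assumes a relaunched container exits with status 0.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Counters loosely mirroring the AM's final 'Diagnostics' line."""
    total: int
    completed: int = 0
    failed: int = 0
    relaunched: int = 0

    @property
    def succeeded(self) -> bool:
        # The run counts as successful only if every container
        # completed and none of them was recorded as failed.
        return self.completed == self.total and self.failed == 0


def run(total, exit_statuses, retry_failed):
    """Feed container exit statuses through a simplified completion handler."""
    tally = Tally(total=total)
    pending = list(exit_statuses)
    while pending and tally.completed < total:
        status = pending.pop(0)
        if status == 0:
            tally.completed += 1
        elif retry_failed:
            # Hypothetical policy (the expected behavior in this report):
            # request a replacement container, assumed here to exit with 0.
            tally.relaunched += 1
            pending.append(0)
        else:
            # Behavior matching the log: the killed container is counted
            # as both completed and failed, so the run cannot succeed.
            tally.completed += 1
            tally.failed += 1
    return tally


observed = run(total=2, exit_statuses=[137, 0], retry_failed=False)
expected = run(total=2, exit_statuses=[137, 0], retry_failed=True)
print(observed, observed.succeeded)
print(expected, expected.succeeded)
```

With `retry_failed=False` the counters end at completed=2 and failed=1, the same values the AM log reports. With `retry_failed=True` the killed container is replaced and the run finishes with no failures.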

{code:title=AM log}
18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: appattempt_1524681858728_0012_000001 got container status for containerID=container_1524681858728_0012_01_000002, state=COMPLETE, exitStatus=137, diagnostics=[2018-04-27 23:05:09.310]Container killed on request. Exit code is 137
[2018-04-27 23:05:09.331]Container exited with a non-zero exit code 137.
[2018-04-27 23:05:09.332]Killed by external signal

18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: appattempt_1524681858728_0012_000001 got container status for containerID=container_1524681858728_0012_01_000003, state=COMPLETE, exitStatus=0, diagnostics=
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1524681858728_0012_01_000003
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application completed. Stopping running containers
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Diagnostics., total=2, completed=2, allocated=2, failed=1
18/04/27 23:08:46 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
{code}
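
The exit status 137 in the log follows the common shell convention of 128 plus the signal number, so it corresponds to signal 9 (SIGKILL), which agrees with the "Killed by external signal" diagnostic. Below is a small Python sketch for pulling the container ID and exit status out of an unwrapped AM log line and decoding them. The helper names and the regular expression are illustrative assumptions based only on the log format shown above; they are not part of any YARN tooling.

```python
import re
import signal

# Matches the "got container status" lines shown in the AM log above.
STATUS_RE = re.compile(
    r"containerID=(?P<cid>container_\w+), "
    r"state=(?P<state>\w+), "
    r"exitStatus=(?P<exit>-?\d+)"
)


def parse_container_status(line):
    """Return (container_id, state, exit_status), or None if no match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return m["cid"], m["state"], int(m["exit"])


def describe_exit(status):
    """Interpret an exit status using the 128 + signal-number convention."""
    if status == 0:
        return "success"
    if status > 128:
        try:
            return "killed by " + signal.Signals(status - 128).name
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by signal {status - 128}"
    return f"exit code {status}"


line = (
    "18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: "
    "appattempt_1524681858728_0012_000001 got container status for "
    "containerID=container_1524681858728_0012_01_000002, state=COMPLETE, "
    "exitStatus=137, diagnostics=[2018-04-27 23:05:09.310]Container killed "
    "on request. Exit code is 137"
)
cid, state, status = parse_container_status(line)
print(cid, state, describe_exit(status))
```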


