[ 
https://issues.apache.org/jira/browse/YARN-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472777#comment-16472777
 ] 

Jason Lowe commented on YARN-8274:
----------------------------------

bq.  It would be nice if the code was refactored to add docker_binary in 
construct_docker_command to avoid duplicated add_to_args for docker_binary for 
all get_docker_*_command, but the priority is to get a good stable state for 
release.

I was thinking the exact same thing as I was writing the patch.  I went for the 
simple approach to keep the patch small and easy to review, since it's a bugfix. 
I filed YARN-8284 to track the refactoring.
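
As a rough illustration of the refactoring discussed above, the following is a minimal sketch and not the actual container-executor code. The struct args layout, MAX_ARGS, the function signatures, and the two sub-commands shown are simplified assumptions made for this example; only the names add_to_args, construct_docker_command, and get_docker_*_command come from the discussion. The idea is that the docker binary is added once in construct_docker_command, so the individual builders cannot forget it or repeat it.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 16

/* Hypothetical, simplified argument list (an assumption for this sketch). */
struct args {
  const char *data[MAX_ARGS];
  int length;
};

/* Append one argument; returns 0 on success, -1 if the list is full. */
int add_to_args(struct args *args, const char *value) {
  assert(args != NULL && value != NULL);
  if (args->length >= MAX_ARGS) {
    return -1;
  }
  args->data[args->length++] = value;
  return 0;
}

/* Per-command builders add only their own sub-command and operands. */
int get_docker_start_command(const char *container, struct args *args) {
  if (add_to_args(args, "start") != 0) {
    return -1;
  }
  return add_to_args(args, container);
}

int get_docker_rm_command(const char *container, struct args *args) {
  if (add_to_args(args, "rm") != 0) {
    return -1;
  }
  return add_to_args(args, container);
}

/* Single entry point: the docker binary is added here, exactly once,
 * before dispatching to the builder for the requested sub-command. */
int construct_docker_command(const char *docker_binary, const char *command,
                             const char *container, struct args *args) {
  args->length = 0;
  if (add_to_args(args, docker_binary) != 0) {
    return -1;
  }
  if (strcmp(command, "start") == 0) {
    return get_docker_start_command(container, args);
  }
  if (strcmp(command, "rm") == 0) {
    return get_docker_rm_command(container, args);
  }
  return -1; /* unknown sub-command */
}
```

In this sketch, calling construct_docker_command with a hypothetical binary path such as "/usr/bin/docker", the command "start", and a container id fills the list with the binary, then "start", then the id, in that order.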

> Docker command error during container relaunch
> ----------------------------------------------
>
>                 Key: YARN-8274
>                 URL: https://issues.apache.org/jira/browse/YARN-8274
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Billie Rinaldi
>            Assignee: Jason Lowe
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8274.001.patch, YARN-8274.002.patch
>
>
> I initiated a container relaunch with a "sleep 60; exit 1" launch command and 
> saw a "not a docker command" error on relaunch. I haven't figured out why this 
> is happening, but it seems to have been introduced recently in 
> trunk/branch-3.1. cc [~shaneku...@gmail.com] [~ebadger]
> {noformat}
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Relaunch container failed
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.relaunchContainer(DockerLinuxContainerRuntime.java:954)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.relaunchContainer(DelegatingLinuxContainerRuntime.java:150)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.relaunchContainer(LinuxContainerExecutor.java:486)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.relaunchContainer(ContainerLaunch.java:504)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:111)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:47)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1525897486447_0003_01_000002
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 7
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception message: Relaunch container failed
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Shell error output: docker: 'container_1525897486447_0003_01_000002' is not a docker command.
> {noformat}


