[ https://issues.apache.org/jira/browse/YARN-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472213#comment-16472213 ]
Jason Lowe commented on YARN-8274:
----------------------------------
I think I know what the issue is; sorry I missed it in my review of YARN-8207.
I think the problem is with this code:
{code}
ret = add_to_args(args, "--pid='host'");
{code}
Now that we aren't calling popen, the command is no longer parsed by a shell,
so the quotes are passed literally to docker. Docker won't strip those quotes,
so 'host' (quotes included) just looks like a bad value for the --pid argument.
I'll put up a patch shortly.
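
To illustrate the difference, here is a minimal sketch (not the container-executor code). It uses POSIX wordexp() as a stand-in for the word splitting and quote removal a shell performs on a popen command line, and compares that with the literal string that reaches the program when the argument is placed directly in an argv array. The helper names shell_view and option_value are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <string.h>
#include <wordexp.h>

/*
 * Copy into `out` the single word a POSIX shell would hand to the program
 * for `arg` (quote removal applied). Returns 0 on success, -1 on failure.
 * Hypothetical helper: wordexp() stands in for the shell behind popen().
 */
static int shell_view(const char *arg, char *out, size_t outlen) {
  wordexp_t we;
  int rc = -1;

  assert(arg != NULL && out != NULL && outlen > 0);

  /* WRDE_NOCMD: refuse command substitution, we only want quote removal. */
  if (wordexp(arg, &we, WRDE_NOCMD) != 0) {
    return -1;
  }
  if (we.we_wordc == 1 && strlen(we.we_wordv[0]) < outlen) {
    strcpy(out, we.we_wordv[0]);
    rc = 0;
  }
  wordfree(&we);
  return rc;
}

/*
 * Return the value part of a "--key=value" argument exactly as the
 * receiving program would see it, or NULL if there is no '='.
 * Hypothetical helper for this example only.
 */
static const char *option_value(const char *arg) {
  const char *eq = strchr(arg, '=');
  return eq != NULL ? eq + 1 : NULL;
}
```

With these helpers, shell_view applied to --pid='host' yields --pid=host, whose value is host. Without a shell, option_value applied to the same string returns 'host' with the quotes still attached, which is what docker receives when the string goes straight into argv. One plausible fix, stated here as an assumption rather than the actual patch, is to pass "--pid=host" without the quotes.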
> Docker command error during container relaunch
> ----------------------------------------------
>
> Key: YARN-8274
> URL: https://issues.apache.org/jira/browse/YARN-8274
> Project: Hadoop YARN
> Issue Type: Task
> Reporter: Billie Rinaldi
> Priority: Critical
>
> I initiated container relaunch with a "sleep 60; exit 1" launch command and
> saw a "not a docker command" error on relaunch. Haven't figured out why this
> is happening, but it seems like it has been introduced recently to
> trunk/branch-3.1. cc [[email protected]] [~ebadger]
> {noformat}
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Relaunch container failed
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.relaunchContainer(DockerLinuxContainerRuntime.java:954)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.relaunchContainer(DelegatingLinuxContainerRuntime.java:150)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.relaunchContainer(LinuxContainerExecutor.java:486)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.relaunchContainer(ContainerLaunch.java:504)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:111)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:47)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1525897486447_0003_01_000002
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 7
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception message: Relaunch container failed
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Shell error output: docker: 'container_1525897486447_0003_01_000002' is not a docker command.
> {noformat}