Jaeboo Jeong created YARN-6495:
----------------------------------

             Summary: check docker container's exit code when writing to cgroup task files
                 Key: YARN-6495
                 URL: https://issues.apache.org/jira/browse/YARN-6495
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
            Reporter: Jaeboo Jeong


If I execute a simple command such as date in a Docker container, the 
application fails to complete successfully.

For example:
{code}
$ yarn jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
    -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
    -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker \
    -shell_command "date" \
    -jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar \
    -num_containers 1 -timeout 3600000

…
17/04/12 00:16:40 INFO distributedshell.Client: Application did finished unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring loop
17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to complete successfully
{code}

The error log looks like this:
{code}
...
Failed to write pid to file /cgroup_parent/cpu/hadoop-yarn/container_xxxx/tasks - No such process
...
{code}

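When writing the pid to the cgroup tasks file, container-executor doesn't 
check the Docker container's status.
If the container finishes very quickly, its process has already exited by the 
time the pid is written, so the write fails with "No such process". That 
failure is not a real problem and should not fail the application.
So container-executor needs to check the Docker container's exit code when 
writing the pid to the cgroup tasks file.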
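
The proposed check could be sketched roughly as follows. This is only an 
illustration of the control flow, not the container-executor code (which is 
written in C). The function names attach_to_cgroup, write_pid and 
docker_exit_code are hypothetical, and it assumes the docker CLI is on the 
PATH and that the failed write surfaces as ESRCH ("No such process").

```python
import errno
import subprocess
from typing import Callable, Optional


def docker_exit_code(container_name: str) -> Optional[int]:
    """Return the exit code of a stopped container, or None if it is
    still running or cannot be inspected (hypothetical helper)."""
    result = subprocess.run(
        ["docker", "inspect", "--format",
         "{{.State.Running}} {{.State.ExitCode}}", container_name],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None
    running, exit_code = result.stdout.split()
    return None if running == "true" else int(exit_code)


def write_pid(tasks_path: str, pid: int) -> None:
    """Write the pid to the cgroup tasks file."""
    with open(tasks_path, "w") as tasks:
        tasks.write(str(pid))


def attach_to_cgroup(
    tasks_path: str,
    pid: int,
    container_name: str,
    write: Callable[[str, int], None] = write_pid,
    exit_code_of: Callable[[str], Optional[int]] = docker_exit_code,
) -> Optional[int]:
    """Return None if the pid was attached, or the container's exit code
    if the process had already finished before the write."""
    try:
        write(tasks_path, pid)
        return None
    except OSError as err:
        # Only "No such process" is a candidate for the benign case.
        if err.errno != errno.ESRCH:
            raise
        code = exit_code_of(container_name)
        if code is None:
            # Container is still running (or unknown): a genuine failure.
            raise
        return code
```

With this shape, a container that ran date and exited with code 0 before the 
pid could be written would be reported with exit code 0 instead of as a 
failed write.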



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
