Jaeboo Jeong created YARN-6495:
----------------------------------
Summary: check docker container's exit code when writing to cgroup
task files
Key: YARN-6495
URL: https://issues.apache.org/jira/browse/YARN-6495
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Reporter: Jaeboo Jeong
If I run a simple command such as date in a Docker container, the application
fails to complete successfully.
For example:
{code}
$ yarn jar
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
-shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env
YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
-num_containers 1 -timeout 3600000
…
17/04/12 00:16:40 INFO distributedshell.Client: Application did finished
unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring
loop
17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to complete
successfully
{code}
The error log looks like this:
{code}
...
Failed to write pid to file /cgroup_parent/cpu/hadoop-yarn/container_xxxx/tasks
- No such process
...
{code}
When writing the pid to the cgroup tasks file, container-executor doesn't check
the Docker container's status.
If the container finishes very quickly, its process is already gone and the pid
can't be written to the cgroup tasks file. That by itself is not a problem.
So container-executor needs to check the Docker container's exit code when
writing the pid to the cgroup tasks file, rather than failing outright.
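
The proposed behavior can be sketched roughly as follows. This is only an
illustration of the decision logic, written in Python for brevity; the real
container-executor is written in C, and the names attach_pid_to_cgroup,
write_pid and get_exit_code are hypothetical, not existing functions. The
sketch assumes that writing the pid of an already exited process fails with
ESRCH ("No such process"), as in the log above.

```python
import errno
from typing import Callable, Optional


def attach_pid_to_cgroup(
    write_pid: Callable[[], None],
    get_exit_code: Callable[[], Optional[int]],
) -> Optional[int]:
    """Try to add the container's pid to the cgroup tasks file.

    write_pid (hypothetical) performs the write and raises OSError on failure.
    get_exit_code (hypothetical) returns the container's exit code, or None
    if the container has not exited or the code is unavailable.

    Returns None when the pid was written (container still running), or the
    container's exit code when the write failed only because the container
    had already finished. Any other failure is re-raised.
    """
    try:
        write_pid()
        return None
    except OSError as err:
        if err.errno != errno.ESRCH:
            # Unrelated failure (permissions, missing cgroup, ...): real error.
            raise
        exit_code = get_exit_code()
        if exit_code is None:
            # The process is gone but no exit code is known: still an error.
            raise
        # The container finished before the write; report its own result.
        return exit_code
```

With this shape, a container that ran date and exited with 0 before the write
would be reported as successful instead of failing the application. One way
the exit code might be obtained is
docker inspect --format '{{.State.ExitCode}}' <container>, though how
container-executor should do this is left open here.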