[ 
https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619730#comment-16619730
 ] 

Hudson commented on YARN-8648:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14997 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14997/])
YARN-8648. Container cgroups are leaked when using docker. Contributed (jlowe: 
rev 2df0a8dcb3dfde15d216481cc1296d97d2cb5d43)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/TestDockerCommandExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerRmCommand.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/TestDockerRmCommand.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/ResourceHandlerModule.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> Container cgroups are leaked when using docker
> ----------------------------------------------
>
>                 Key: YARN-8648
>                 URL: https://issues.apache.org/jira/browse/YARN-8648
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>              Labels: Docker
>             Fix For: 3.2.0, 3.1.2
>
>         Attachments: YARN-8648.001.patch, YARN-8648.002.patch, 
> YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, 
> YARN-8648.006.patch
>
>
> When you run with docker and enable cgroups for cpu, docker creates cgroups 
> for all resources on the system, not just for cpu.  For instance, if the 
> {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, 
> the nodemanager will create a cgroup for each container under 
> {{/sys/fs/cgroup/cpu/hadoop-yarn}}.  In the docker case, we pass this path 
> via the {{--cgroup-parent}} command line argument.   Docker then creates a 
> cgroup for the docker container under that, for instance: 
> {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}.
> When the container exits, docker cleans up the {{docker_container_id}} 
> cgroup, and the nodemanager cleans up the {{container_id}} cgroup,   All is 
> good under {{/sys/fs/cgroup/hadoop-yarn}}.
> The problem is that docker also creates that same hierarchy under every 
> resource under {{/sys/fs/cgroup}}.  On the rhel7 system I am using, these 
> are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, 
> perf_event, and systemd.    So for instance, docker creates 
> {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but 
> it only cleans up the leaf cgroup {{docker_container_id}}.  Nobody cleans up 
> the {{container_id}} cgroups for these other resources.  On one of our busy 
> clusters, we found > 100,000 of these leaked cgroups.
> I found this in our 2.8-based version of hadoop, but I have been able to 
> repro with current hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to