[
https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586060#comment-16586060
]
Jim Brennan commented on YARN-8648:
-----------------------------------
Thanks [~eyang]! My main concern about the minimal fix is the security aspect,
since we will need to add an option to container-executor to tell it to delete
all cgroups with a particular name as root (since docker will create them as
root).
I think this is mitigated if we use the "cgroup" section of the
container-exectutor.cfg to constrain it. This is currently used to enable
updating params, but I think it could be used for this as well. It already
defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g,
hadoop-yarn). We could either add another config parameter to define the list
of hierarchies to clean up (e.g, cpuset, freezer, hugetlb, etc...), or we can
parse /proc/mounts to determine the full list. I think it's safer to add the
config parameter.
I will start working on this version unless there are objections?
> Container cgroups are leaked when using docker
> ----------------------------------------------
>
> Key: YARN-8648
> URL: https://issues.apache.org/jira/browse/YARN-8648
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jim Brennan
> Assignee: Jim Brennan
> Priority: Major
> Labels: Docker
>
> When you run with docker and enable cgroups for cpu, docker creates cgroups
> for all resources on the system, not just for cpu. For instance, if the
> {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}},
> the nodemanager will create a cgroup for each container under
> {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path
> via the {{--cgroup-parent}} command line argument. Docker then creates a
> cgroup for the docker container under that, for instance:
> {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}.
> When the container exits, docker cleans up the {{docker_container_id}}
> cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is
> good under {{/sys/fs/cgroup/hadoop-yarn}}.
> The problem is that docker also creates that same hierarchy under every
> resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these
> are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio,
> perf_event, and systemd. So for instance, docker creates
> {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but
> it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up
> the {{container_id}} cgroups for these other resources. On one of our busy
> clusters, we found > 100,000 of these leaked cgroups.
> I found this in our 2.8-based version of hadoop, but I have been able to
> repro with current hadoop.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]