[ https://issues.apache.org/jira/browse/YARN-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662192#comment-14662192 ]
Anubhav Dhoot commented on YARN-4030:
-------------------------------------
We can add an option that makes the NM read its own cgroup subpath from
/proc/self/cgroup. In the case described above, this would yield a subpath
such as /docker/<dockerid1>/nmcgroup, which we would use in combination with
the controller path that the NM already reads.
Specifically, the cGroupPrefix in CGroupsHandlerImpl and
CgroupsLCEResourcesHandler would incorporate this subpath.
Since containers can outlive the NM itself, we should probably make the
cGroupPrefix a sibling of the NM's cgroup. In this case the cGroupPrefix
would be /docker/<dockerid1>/hadoop-yarn, which gives an effective cgroup
path for containers of
/sys/fs/cgroup/cpu/docker/<dockerid1>/hadoop-yarn/container1 and so on.
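To illustrate the path derivation described above, here is a minimal sketch in
Python (not the actual NodeManager code, which is Java). It assumes the
cgroup v1 line format of /proc/self/cgroup (hierarchy-ID:controllers:path).
The function names and the docker id "abc123" are hypothetical and exist only
for this example.

```python
import posixpath


def parse_cgroup_subpath(proc_cgroup_text, controller):
    """Return this process's cgroup subpath for `controller`, or None.

    Expects cgroup v1 content of /proc/self/cgroup, where each line is
    hierarchy-ID:controller-list:cgroup-path.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed or empty lines
        if controller in parts[1].split(","):
            return parts[2]
    return None


def sibling_prefix(nm_subpath, hierarchy="hadoop-yarn"):
    """Build a prefix that is a sibling of the NM's own cgroup."""
    parent = posixpath.dirname(nm_subpath)
    return posixpath.join(parent, hierarchy.strip("/"))


def container_cgroup_path(controller_path, prefix, container_id):
    """Join the controller mount path, the prefix, and the container name."""
    return posixpath.join(controller_path, prefix.lstrip("/"), container_id)


# Hypothetical content of /proc/self/cgroup for an NM inside a docker container.
sample = (
    "4:cpu,cpuacct:/docker/abc123/nmcgroup\n"
    "3:memory:/docker/abc123/nmcgroup\n"
)

nm_subpath = parse_cgroup_subpath(sample, "cpu")
prefix = sibling_prefix(nm_subpath)
print(container_cgroup_path("/sys/fs/cgroup/cpu", prefix, "container1"))
# prints /sys/fs/cgroup/cpu/docker/abc123/hadoop-yarn/container1
```

Taking the parent directory of the NM's subpath is what makes the prefix a
sibling of the NM's cgroup instead of a child of it.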
> Make Nodemanager cgroup usage for container easier to use when its running
> inside a cgroup
> -------------------------------------------------------------------------------------------
>
> Key: YARN-4030
> URL: https://issues.apache.org/jira/browse/YARN-4030
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Anubhav Dhoot
> Assignee: Anubhav Dhoot
>
> Today the nodemanager uses the cgroup prefix pointed to by
> yarn.nodemanager.linux-container-executor.cgroups.hierarchy (default value
> /hadoop-yarn) directly under the controller path, for example
> /sys/fs/cgroup/cpu/hadoop-yarn.
> If nodemanagers run inside docker containers on a host, each would
> typically be isolated in its own cgroup under the controller path, for example
> /sys/fs/cgroup/cpu/docker/<dockerid1>/nmcgroup for NM1 and
> /sys/fs/cgroup/cpu/docker/<dockerid2>/nmcgroup for NM2.
> In this case the correct behavior would be to use the docker cgroup paths:
> /sys/fs/cgroup/cpu/docker/<dockerid1>/hadoop-yarn for NM1 and
> /sys/fs/cgroup/cpu/docker/<dockerid2>/hadoop-yarn for NM2.
> But the default behavior makes both NMs try to use
> /sys/fs/cgroup/cpu/hadoop-yarn, which is incorrect and would usually fail
> because of how permissions are set up.