[ https://issues.apache.org/jira/browse/YARN-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848237#comment-17848237 ]
ASF GitHub Bot commented on YARN-11687: --------------------------------------- brumi1024 commented on code in PR #6835: URL: https://github.com/apache/hadoop/pull/6835#discussion_r1608437832 ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsV2ResourceCalculator.java: ########## @@ -0,0 +1,124 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.stream.Stream; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.lang3.StringUtils; + +/** + * A CGroupV2 file-system based Resource calculator without the process tree features. + * + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. + * + * Limitation: CGroups does not have the ability to measure virtual memory usage. + * This includes memory reserved but not used. + * CGroups measures used memory as sa sum of physical memory and swap usage. Review Comment: ```suggestion * Cgroup measures used memory as a sum of the physical memory and swap usage. ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsResourceCalculator.java: ########## @@ -18,338 +18,134 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; -import org.apache.hadoop.classification.VisibleForTesting; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; -import org.apache.hadoop.util.CpuTimeTracker; -import org.apache.hadoop.util.Shell; -import org.apache.hadoop.util.SysInfoLinux; -import org.apache.hadoop.yarn.exceptions.YarnException; -import org.apache.hadoop.yarn.util.Clock; -import org.apache.hadoop.yarn.util.ResourceCalculatorProcessTree; -import org.apache.hadoop.yarn.util.SystemClock; - -import java.io.BufferedReader; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.io.InputStreamReader; -import java.math.BigInteger; -import java.nio.charset.StandardCharsets; -import java.util.function.Function; -import java.util.regex.Matcher; -import java.util.regex.Pattern; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** - * A cgroups file-system based Resource calculator without the process tree - * features. + * A CGroupV1 file-system based Resource calculator without the process tree features. * - * CGroups has its limitations. It can only be enabled, if both CPU and memory - * cgroups are enabled with yarn.nodemanager.resource.cpu.enabled and - * yarn.nodemanager.resource.memory.enabled respectively. This means that - * memory limits are enforced by default. You can turn this off and keep - * memory reporting only with yarn.nodemanager.resource.memory.enforced. + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. Review Comment: This description is a bit complicated, can you please rephrase it? If I understand it correctly the usage of ResourceCalculatorProcessTree can be enforced with the mapreduce.job.process-tree.class property, but the MR task JVM process doesn't have an initialised ResourceHandlerModule, hence issues will occur. What would be the symptom, and shouldn't we log a warning about it somewhere when we run into this? ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsV2ResourceCalculator.java: ########## @@ -0,0 +1,124 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.stream.Stream; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.lang3.StringUtils; + +/** + * A CGroupV2 file-system based Resource calculator without the process tree features. Review Comment: ```suggestion * A Cgroup version 2 file-system based Resource calculator without the process tree features. ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsV2ResourceCalculator.java: ########## @@ -0,0 +1,124 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.stream.Stream; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.lang3.StringUtils; + +/** + * A CGroupV2 file-system based Resource calculator without the process tree features. + * + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. Review Comment: Same here as above. ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsResourceCalculator.java: ########## @@ -18,338 +18,134 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; -import org.apache.hadoop.classification.VisibleForTesting; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; -import org.apache.hadoop.util.CpuTimeTracker; -import org.apache.hadoop.util.Shell; -import org.apache.hadoop.util.SysInfoLinux; -import org.apache.hadoop.yarn.exceptions.YarnException; -import org.apache.hadoop.yarn.util.Clock; -import org.apache.hadoop.yarn.util.ResourceCalculatorProcessTree; -import org.apache.hadoop.yarn.util.SystemClock; - -import java.io.BufferedReader; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.io.InputStreamReader; -import java.math.BigInteger; -import java.nio.charset.StandardCharsets; -import java.util.function.Function; -import java.util.regex.Matcher; -import java.util.regex.Pattern; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** - * A cgroups file-system based Resource calculator without the process tree - * features. + * A CGroupV1 file-system based Resource calculator without the process tree features. * - * CGroups has its limitations. It can only be enabled, if both CPU and memory - * cgroups are enabled with yarn.nodemanager.resource.cpu.enabled and - * yarn.nodemanager.resource.memory.enabled respectively. This means that - * memory limits are enforced by default. You can turn this off and keep - * memory reporting only with yarn.nodemanager.resource.memory.enforced. + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. * - * Another limitation is virtual memory measurement. CGroups does not have the - * ability to measure virtual memory usage. This includes memory reserved but - * not used. CGroups measures used memory as sa sum of - * physical memory and swap usage. This will be returned in the virtual - * memory counters. + * Limitation: CGroups does not have the ability to measure virtual memory usage. + * This includes memory reserved but not used. + * CGroups measures used memory as sa sum of physical memory and swap usage. Review Comment: ```suggestion * Cgroup measures used memory as a sum of the physical memory and swap usage. ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsV2ResourceCalculator.java: ########## @@ -0,0 +1,124 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.stream.Stream; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.lang3.StringUtils; + +/** + * A CGroupV2 file-system based Resource calculator without the process tree features. + * + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. + * + * Limitation: CGroups does not have the ability to measure virtual memory usage. Review Comment: ```suggestion * Limitation: Cgroup does not have the ability to measure virtual memory usage. ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsResourceCalculator.java: ########## @@ -18,338 +18,134 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; -import org.apache.hadoop.classification.VisibleForTesting; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; -import org.apache.hadoop.util.CpuTimeTracker; -import org.apache.hadoop.util.Shell; -import org.apache.hadoop.util.SysInfoLinux; -import org.apache.hadoop.yarn.exceptions.YarnException; -import org.apache.hadoop.yarn.util.Clock; -import org.apache.hadoop.yarn.util.ResourceCalculatorProcessTree; -import org.apache.hadoop.yarn.util.SystemClock; - -import java.io.BufferedReader; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.io.InputStreamReader; -import java.math.BigInteger; -import java.nio.charset.StandardCharsets; -import java.util.function.Function; -import java.util.regex.Matcher; -import java.util.regex.Pattern; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** - * A cgroups file-system based Resource calculator without the process tree - * features. + * A CGroupV1 file-system based Resource calculator without the process tree features. Review Comment: ```suggestion * A Cgroup version 1 file-system based Resource calculator without the process tree features. ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsV2ResourceCalculator.java: ########## @@ -0,0 +1,124 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.stream.Stream; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.lang3.StringUtils; + +/** + * A CGroupV2 file-system based Resource calculator without the process tree features. + * + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. + * + * Limitation: CGroups does not have the ability to measure virtual memory usage. + * This includes memory reserved but not used. + * CGroups measures used memory as sa sum of physical memory and swap usage. + * This will be returned to the virtual memory counters. + * If the real virtual memory is required please use the legacy procfs based + * resource calculator or CombinedResourceCalculator. + */ +public class CGroupsV2ResourceCalculator extends AbstractCGroupsResourceCalculator { + private static final Logger LOG = LoggerFactory.getLogger(CGroupsV2ResourceCalculator.class); + + /** + * <a href="https://docs.kernel.org/admin-guide/cgroup-v2.html#cpu-interface-files">Documentation</a> + * + * ... + * cpu.stat + * A read-only flat-keyed file. This file exists whether the controller is enabled or not. + * It always reports the following three stats: + * - usage_usec + * - user_usec + * - system_usec + * ... + * + */ + private static final String CPU_STAT = "cpu.stat#usage_usec"; + + /** + * <a href="https://docs.kernel.org/admin-guide/cgroup-v2.html#memory-interface-files">Documentation</a> + * + * ... + * memory.stat + * A read-only flat-keyed file which exists on non-root cgroups. + * This breaks down the cgroup’s memory footprint into different types of memory, type-specific details, and other information on the state and past events of the memory management system. + * All memory amounts are in bytes. + * ... + * anon + * Amount of memory used in anonymous mappings such as brk(), sbrk(), and mmap(MAP_ANONYMOUS) + * ... + * + */ + private static final String MEM_STAT = "memory.stat#anon"; + + /** + * <a href="https://docs.kernel.org/admin-guide/cgroup-v2.html#memory-interface-files">Documentation</a> + * + * ... + * memory.swap.current + * A read-only single value file which exists on non-root cgroups. + * The total amount of swap currently being used by the cgroup and its descendants. + * ... + * + */ + private static final String MEMSW_STAT = "memory.swap.current"; + + public CGroupsV2ResourceCalculator(String pid) { + super( + pid, + Collections.singletonList(CPU_STAT), + MEM_STAT, + MEMSW_STAT + ); + } + + @Override + protected List<Path> getCgroupFilesToLoadInStats() { Review Comment: Nit: CGroup is not the official terminology (should be cgroup), but the rest of the code uses CGroup ```suggestion protected List<Path> getCGroupFilesToLoadInStats() { ``` ########## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsResourceCalculator.java: ########## @@ -18,338 +18,134 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources; -import org.apache.hadoop.classification.VisibleForTesting; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; -import org.apache.hadoop.util.CpuTimeTracker; -import org.apache.hadoop.util.Shell; -import org.apache.hadoop.util.SysInfoLinux; -import org.apache.hadoop.yarn.exceptions.YarnException; -import org.apache.hadoop.yarn.util.Clock; -import org.apache.hadoop.yarn.util.ResourceCalculatorProcessTree; -import org.apache.hadoop.yarn.util.SystemClock; - -import java.io.BufferedReader; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.io.InputStreamReader; -import java.math.BigInteger; -import java.nio.charset.StandardCharsets; -import java.util.function.Function; -import java.util.regex.Matcher; -import java.util.regex.Pattern; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** - * A cgroups file-system based Resource calculator without the process tree - * features. + * A CGroupV1 file-system based Resource calculator without the process tree features. * - * CGroups has its limitations. It can only be enabled, if both CPU and memory - * cgroups are enabled with yarn.nodemanager.resource.cpu.enabled and - * yarn.nodemanager.resource.memory.enabled respectively. This means that - * memory limits are enforced by default. You can turn this off and keep - * memory reporting only with yarn.nodemanager.resource.memory.enforced. + * Warning!!! + * ResourceCalculatorProcessTree can be used with mapreduce.job.process-tree.class property. + * However, those instances runs in the mapreduce task, and can not access to the + * ResourceHandlerModule, what is only initialised in the NodeManager process not in the container. + * So this implementation will not work with the mapreduce.job.process-tree.class property. * - * Another limitation is virtual memory measurement. CGroups does not have the - * ability to measure virtual memory usage. This includes memory reserved but - * not used. CGroups measures used memory as sa sum of - * physical memory and swap usage. This will be returned in the virtual - * memory counters. + * Limitation: CGroups does not have the ability to measure virtual memory usage. Review Comment: ```suggestion * Limitation: Cgroup does not have the ability to measure virtual memory usage. ``` > Update CGroupsResourceCalculator to track usages using cgroupv2 > --------------------------------------------------------------- > > Key: YARN-11687 > URL: https://issues.apache.org/jira/browse/YARN-11687 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Benjamin Teke > Assignee: Bence Kosztolnik > Priority: Major > Labels: pull-request-available > > [CGroupsResourceCalculator|https://github.com/apache/hadoop/blob/f609460bda0c2bd87dd3580158e549e2f34f14d5/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsResourceCalculator.java] > should also be updated to handle the cgroup v2 changes. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org