Benjamin Teke created YARN-11813:
------------------------------------

Summary: YARN incorrectly falls back to cgroup v1 when cgroup v2 has v1 named subhierarchies
Key: YARN-11813
URL: https://issues.apache.org/jira/browse/YARN-11813
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Benjamin Teke
Assignee: Benjamin Teke

YARN-11743 introduced a fallback behaviour: if a controller is not mounted in v1, YARN tries to use it with v2. This is handled by the following init step:

{code:java}
private static void initializeCGroupHandlers(Configuration conf,
    CGroupsHandler.CGroupController controller) throws ResourceHandlerException {
  // The v1 handler is always initialized first.
  initializeCGroupV1Handler(conf);
  // v2 is only used when the controller is not considered mounted in v1.
  if (cgroupsV2Enabled && !isMountedInCGroupsV1(controller)) {
    initializeCGroupV2Handler(conf);
  }
}
{code}

This breaks when preconfigured mount paths are used (yarn.nodemanager.linux-container-executor.cgroups.mount-path). With a preconfigured mount path, /etc/mtab is no longer checked, so if the cgroup v2 hierarchy contains a subhierarchy with the same name as a v1 controller (e.g. cpu, memory, devices), YARN assumes the controller is mounted in v1 without checking the contents of the folder. It then tries to update the v1 controller files on application launch, which causes application failures.

The cause is the !isMountedInCGroupsV1(controller) check, combined with the fact that the v1 handler is initialized first and v2 is essentially used as a fallback. To overcome this, the order should be reversed so that v1 becomes the fallback handler.
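
The reversed order proposed above could look roughly like the sketch below. It is not the YARN patch: the class CGroupHandlerOrderSketch, its Version enum, and both chooseVersion methods are hypothetical names made up for illustration. It also assumes that a v2 hierarchy can be recognised by the cgroup.controllers file at the mount path, which is how the kernel exposes the controllers available in cgroup v2. The name-based directory check is kept, but only as the v1 fallback.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch only; none of these names exist in YARN. */
final class CGroupHandlerOrderSketch {

  enum Version { V1, V2, NONE }

  /** Parses the whitespace-separated contents of a cgroup v2 cgroup.controllers file. */
  static Set<String> parseControllers(String content) {
    String trimmed = content.trim();
    if (trimmed.isEmpty()) {
      return Collections.emptySet();
    }
    return new LinkedHashSet<>(Arrays.asList(trimmed.split("\\s+")));
  }

  /** Tries v2 first; the name-based v1 check is consulted only as a fallback. */
  static Version chooseVersion(Set<String> v2Controllers, boolean v1DirExists,
      String controller, boolean v2Enabled) {
    if (v2Enabled && v2Controllers.contains(controller)) {
      return Version.V2;
    }
    return v1DirExists ? Version.V1 : Version.NONE;
  }

  /** Filesystem wrapper: inspects an (assumed) preconfigured mount path. */
  static Version chooseVersion(Path mountPath, String controller, boolean v2Enabled) {
    Path controllersFile = mountPath.resolve("cgroup.controllers");
    Set<String> v2Controllers = Collections.emptySet();
    try {
      if (Files.isRegularFile(controllersFile)) {
        v2Controllers = parseControllers(
            new String(Files.readAllBytes(controllersFile), StandardCharsets.UTF_8));
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Same naive check the issue describes: a directory named after the controller.
    boolean v1DirExists = Files.isDirectory(mountPath.resolve(controller));
    return chooseVersion(v2Controllers, v1DirExists, controller, v2Enabled);
  }
}
```

With this ordering, a v2 mount that contains a subhierarchy named cpu resolves to V2 as long as cpu is listed in cgroup.controllers, and the directory name is only trusted when v2 is disabled or does not offer the controller. A controller that v2 does not list (devices, for example) would still reach the v1 fallback in this sketch, so a real fix would need a stricter v1 check as well.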