Benjamin Teke created YARN-11813:
------------------------------------

             Summary: YARN incorrectly falls back to cgroup v1 when cgroup v2 
has v1 named subhierarchies
                 Key: YARN-11813
                 URL: https://issues.apache.org/jira/browse/YARN-11813
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Benjamin Teke
            Assignee: Benjamin Teke


YARN-11743 introduced a fallback behaviour: if a controller is not mounted in
v1, YARN tries to use it with v2. This is handled by the following init step:

{code:java}
  private static void initializeCGroupHandlers(Configuration conf,
      CGroupsHandler.CGroupController controller)
      throws ResourceHandlerException {
    initializeCGroupV1Handler(conf);
    if (cgroupsV2Enabled && !isMountedInCGroupsV1(controller)) {
      initializeCGroupV2Handler(conf);
    }
  }
{code}

There is an issue with this when preconfigured mount paths are used
(yarn.nodemanager.linux-container-executor.cgroups.mount-path): with
preconfigured mount paths the /etc/mtab file is no longer checked. Hence, if
there is a subhierarchy with the same name as a v1 controller (e.g. cpu,
memory, devices), YARN will think the controller is mounted in v1 (without
checking the contents of the folder) and will try to update the v1 controller
files on application launch, causing application failures.
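
To illustrate what "checking the contents of the folder" could look like, here
is a minimal sketch. It relies on the fact that cgroup v2 exposes a
cgroup.controllers file in every cgroup, while a v1 controller mount has no
such file. The class and method names are hypothetical and are not part of
YARN; this is not necessarily how the issue is fixed.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class CGroupDirProbe {
  /**
   * Returns true if the given directory looks like a cgroup v2 cgroup.
   * Every cgroup in a v2 hierarchy contains a cgroup.controllers file;
   * a v1 controller directory (e.g. a real v1 "cpu" mount) does not.
   */
  public static boolean looksLikeCGroupV2(Path dir) {
    return Files.isRegularFile(dir.resolve("cgroup.controllers"));
  }
}
```

With a check like this, a directory such as <mount-path>/cpu that is really a
v2 subhierarchy would not be mistaken for a v1 cpu controller based on its name
alone.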

The reason for this is the !isMountedInCGroupsV1(controller) check and the fact
that the v1 handler is initialized first, so v2 is essentially used as a
fallback. To overcome this, the order should be reversed: v1 should be the
fallback handler.
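
The reversed order can be sketched as a small self-contained model. This is
only one reading of the proposal, not the actual patch: the class, the enum and
the availableInV2 predicate are hypothetical stand-ins, and how YARN would
decide that a controller is available in v2 is left open here.

```java
import java.util.function.Predicate;

public class HandlerOrderSketch {
  public enum Version { V1, V2 }

  /**
   * Chooses the handler version for one controller, preferring v2.
   * v1 is selected only when v2 is disabled or does not offer the
   * controller, so v1 acts as the fallback instead of v2.
   */
  public static Version chooseVersion(String controller,
      boolean cgroupsV2Enabled, Predicate<String> availableInV2) {
    if (cgroupsV2Enabled && availableInV2.test(controller)) {
      return Version.V2;
    }
    return Version.V1; // fallback handler
  }
}
```

Because v2 is consulted first, a directory that merely shares its name with a
v1 controller no longer decides which handler is used.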




