[
https://issues.apache.org/jira/browse/YARN-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762542#comment-16762542
]
Szilard Nemeth commented on YARN-9098:
--------------------------------------
Hi [~Jim_Brennan]!
Thanks for pointing out this tricky "bug"!
However, I think my code works exactly as before the change.
1.
https://github.com/apache/hadoop/blob/194f0b49fb8ae3b876dde82f04f906f0ab0f5bdf/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandlerImpl.java#L267
The javadoc even says: "@return the first mount path that has the requested
subsystem"
2.
https://github.com/apache/hadoop/blob/194f0b49fb8ae3b876dde82f04f906f0ab0f5bdf/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java#L445
This also returns one path per controller (the first found).
The old code behaved the same as my patch, as it only selected the first path
found for each controller, even if it was a not existing path.
Anyways, as the code is prepared to handle multiple paths for the same
controller (this is very unlikely to happen in reality, but I don't want to
alter the behaviour), I adjusted the
CGroupsControllerPaths#findControllerPathInMountConfig so that it returns the
first path that is readable.
So for example, if we have one unreadable (not existing) and one readable
(existing) path for one controller (e.g. cpu) then this method will always
return the readable path instead of returning null sometimes.
I carefully checked the test failures you referred to and I think this change
will fix all of them. I also tried to run all the testcases 50 times in
Intellij, so I'm pretty confident that the change actually fixes the test
failures :)
What do you think of the updated patch?
> Separate mtab file reader code and cgroups file system hierarchy parser code
> from CGroupsHandlerImpl and ResourceHandlerModule
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-9098
> URL: https://issues.apache.org/jira/browse/YARN-9098
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-9098.002.patch, YARN-9098.003.patch,
> YARN-9098.004.patch, YARN-9098.005.patch
>
>
> Separate mtab file reader code and cgroups file system hierarchy parser code
> from CGroupsHandlerImpl and ResourceHandlerModule
> CGroupsHandlerImpl has a method parseMtab that parses an mtab file and stores
> cgroups data.
> CGroupsLCEResourcesHandler also has a method with the same name, with
> identical code.
> The parser code should be extracted from these places and be added in a new
> class as this is a separate responsibility.
> As the output of the file parser is a Map<String, Set<String>>, it's better
> to encapsulate it in a domain object, named 'CGroupsMountConfig' for instance.
> ResourceHandlerModule has a method named parseConfiguredCGroupPath, that is
> responsible for producing the same results (Map<String, Set<String>>) to
> store cgroups data, it does not operate on mtab file, but looking at the
> filesystem for cgroup settings. As the output is the same, CGroupsMountConfig
> should be used here, too.
> Again, this could should not be part of ResourceHandlerModule as it is a
> different responsibility.
> One more thing which is strongly related to the methods above is
> CGroupsHandlerImpl.initializeFromMountConfig: This method processes the
> result of a parsed mtab file or a parsed cgroups filesystem data and stores
> file system paths for all available controllers. This method invokes
> findControllerPathInMountConfig, which is a duplicated in CGroupsHandlerImpl
> and CGroupsLCEResourcesHandler, so it should be moved to a single place. To
> store filesystem path and controller mappings, a new domain object could be
> introduced.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]