[ https://issues.apache.org/jira/browse/YARN-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323135#comment-16323135 ]
Haibo Chen commented on YARN-7064:
----------------------------------

Thanks [~miklos.szeg...@cloudera.com] for the update! A few more comments on the new patch:

1) cgroupsLogged and cgroupsErrorLogged in ContainersMonitorImpl are no longer used, so they can be removed.

2) CombinedResourceCalculator.initialize() should probably just call `cgroup.initialize()` and `procfs.initialize()` for easier maintenance. Can we call cgroup.getProcessTreeDump() in CombinedResourceCalculator.getProcessTreeDump() instead of returning null?

3) In CGroupsResourceCalculator, how about we give more information in initialize() when CGroupsResourceCalculator is not available, to tell users what is required, like `CGroupsResourceCalculator is only available on Linux when cgroup memory and cpu is turned on`? In updateProcessTree() and getMemorySize(), I think not catching the YarnException would be more appropriate. If the exception is not caught in those two methods, it will eventually be caught and logged in ContainersMonitorImpl, which makes the error message easier to understand. Swallowing the exception in updateProcessTree() and getMemorySize() will cause a stale (for CPU usage) or wrong (for memory) number to be reported to ContainersMonitor, which is harder to debug.

I will try the patch in a cluster in the meantime.

> Use cgroup to get container resource utilization
> ------------------------------------------------
>
>                 Key: YARN-7064
>                 URL: https://issues.apache.org/jira/browse/YARN-7064
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Miklos Szegedi
>            Assignee: Miklos Szegedi
>         Attachments: YARN-7064.000.patch, YARN-7064.001.patch,
> YARN-7064.002.patch, YARN-7064.003.patch, YARN-7064.004.patch,
> YARN-7064.005.patch, YARN-7064.007.patch, YARN-7064.008.patch,
> YARN-7064.009.patch, YARN-7064.010.patch
>
>
> This is an addendum to YARN-6668. What happens is that that jira always wants
> to rebase patches against YARN-1011 instead of trunk.
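
To make the concern in comment 3 concrete, here is a minimal sketch of the difference between swallowing and propagating a read failure. Every name in it (SwallowVsPropagate, MetricsReadException, FlakyReader, monitorOnce) is a hypothetical stand-in chosen for illustration; none of it is code from the patch or from YARN, and the real classes have different signatures.

```java
/** Illustration only: hypothetical stand-ins, not the YARN classes from the patch. */
class SwallowVsPropagate {

    /** Stand-in for a checked exception such as YarnException. */
    static class MetricsReadException extends Exception {
        MetricsReadException(String message) { super(message); }
    }

    /** Stand-in for reading a memory figure from a cgroup file. */
    interface MemoryReader {
        long read() throws MetricsReadException;
    }

    /** Succeeds on the first call with 100, then fails on every later call. */
    static class FlakyReader implements MemoryReader {
        private int calls = 0;
        @Override
        public long read() throws MetricsReadException {
            if (calls++ == 0) {
                return 100L;
            }
            throw new MetricsReadException("cgroup read failed");
        }
    }

    /** Swallows the failure, so the caller keeps seeing the last good value. */
    static class SwallowingCalculator {
        private final MemoryReader reader;
        private long lastMemory = 0L;
        SwallowingCalculator(MemoryReader reader) { this.reader = reader; }
        long getMemorySize() {
            try {
                lastMemory = reader.read();
            } catch (MetricsReadException e) {
                // Swallowed: the caller cannot tell that the value is stale.
            }
            return lastMemory;
        }
    }

    /** Lets the failure propagate to whoever drives the monitoring loop. */
    static class PropagatingCalculator {
        private final MemoryReader reader;
        PropagatingCalculator(MemoryReader reader) { this.reader = reader; }
        long getMemorySize() throws MetricsReadException {
            return reader.read();
        }
    }

    /** Stand-in for one monitor iteration that catches and reports the error. */
    static String monitorOnce(PropagatingCalculator calc) {
        try {
            return "memory=" + calc.getMemorySize();
        } catch (MetricsReadException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        SwallowingCalculator swallowing = new SwallowingCalculator(new FlakyReader());
        System.out.println(swallowing.getMemorySize()); // first read succeeds
        System.out.println(swallowing.getMemorySize()); // read fails, stale value returned

        PropagatingCalculator propagating = new PropagatingCalculator(new FlakyReader());
        System.out.println(monitorOnce(propagating)); // first read succeeds
        System.out.println(monitorOnce(propagating)); // failure surfaces in the monitor
    }
}
```

In the swallowing variant the second call silently returns the previous value, while in the propagating variant the caller sees the failure and can log it in one place.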