[ 
https://issues.apache.org/jira/browse/YARN-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323135#comment-16323135
 ] 

Haibo Chen commented on YARN-7064:
----------------------------------

Thanks [~miklos.szeg...@cloudera.com] for the update! A few more comments on 
the new patch:

1) cgroupsLogged and cgroupsErrorLogged in ContainersMonitorImpl are no longer 
used, so they can be removed.
2) CombinedResourceCalculator.initialize() should probably just call 
`cgroup.initialize()` and `procfs.initialize()` for easier maintenance. Can we 
call cgroup.getProcessTreeDump() in 
CombinedResourceCalculator.getProcessTreeDump() instead of returning null?
3) In CGroupsResourceCalculator, how about we give more information in 
initialize() when CGroupsResourceCalculator is not available, to tell the user 
what is required, like `CGroupsResourceCalculator is only available on Linux 
when cgroup memory and cpu are turned on`? In updateProcessTree() and 
getMemorySize(), I think not catching the YarnException would be more 
appropriate. The exception, if not caught there, will eventually be caught and 
logged in ContainersMonitorImpl, which makes the error message easier to 
understand. Swallowing the exception in updateProcessTree() and getMemorySize() 
will lead to a stale (for cpu usage) or wrong (for memory) number being 
reported to ContainersMonitor, which is harder to debug.
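
To illustrate 2) and 3), here is a rough, self-contained sketch. Every type in 
it (TreeCalculator, CalculatorException, CombinedCalculator, StubCalculator, 
MonitorSketch) is a simplified stand-in made up for illustration, not a class 
or signature from the patch. Which source backs getMemorySize() in the combined 
calculator is an arbitrary choice here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checked exception standing in for YarnException. */
class CalculatorException extends Exception {
  CalculatorException(String message) { super(message); }
}

/** Hypothetical, trimmed-down stand-in for the process tree calculator API. */
interface TreeCalculator {
  void initialize() throws CalculatorException;
  void updateProcessTree() throws CalculatorException;
  long getMemorySize() throws CalculatorException;
  String getProcessTreeDump();
}

/** Combined calculator that only delegates and never swallows failures. */
class CombinedCalculator implements TreeCalculator {
  private final TreeCalculator cgroup;
  private final TreeCalculator procfs;

  CombinedCalculator(TreeCalculator cgroup, TreeCalculator procfs) {
    this.cgroup = cgroup;
    this.procfs = procfs;
  }

  @Override public void initialize() throws CalculatorException {
    cgroup.initialize();
    procfs.initialize();
  }

  @Override public void updateProcessTree() throws CalculatorException {
    // No try/catch: a failure reaches the caller instead of leaving old values.
    cgroup.updateProcessTree();
    procfs.updateProcessTree();
  }

  @Override public long getMemorySize() throws CalculatorException {
    return cgroup.getMemorySize(); // arbitrary choice of source for this sketch
  }

  @Override public String getProcessTreeDump() {
    return cgroup.getProcessTreeDump(); // delegate instead of returning null
  }
}

/** Test double: reports a fixed memory size, optionally failing on update. */
class StubCalculator implements TreeCalculator {
  private final long memory;
  private final String failure; // null means healthy
  int initCalls = 0;

  StubCalculator(long memory, String failure) {
    this.memory = memory;
    this.failure = failure;
  }

  @Override public void initialize() { initCalls++; }

  @Override public void updateProcessTree() throws CalculatorException {
    if (failure != null) {
      throw new CalculatorException(failure);
    }
  }

  @Override public long getMemorySize() { return memory; }

  @Override public String getProcessTreeDump() { return "dump(" + memory + ")"; }
}

/** Monitor stand-in: the single place where failures are caught and logged. */
class MonitorSketch {
  final List<String> log = new ArrayList<>();

  /** Returns the memory reading, or -1 if the calculator failed. */
  long poll(TreeCalculator tree) {
    try {
      tree.updateProcessTree();
      return tree.getMemorySize();
    } catch (CalculatorException e) {
      log.add("Failed to read container usage: " + e.getMessage());
      return -1;
    }
  }
}
```

With this shape, a failing cgroup read shows up as one log line in the monitor 
that carries the original message, and no stale or wrong number is reported.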

I will try the patch in a cluster in the meantime.

> Use cgroup to get container resource utilization
> ------------------------------------------------
>
>                 Key: YARN-7064
>                 URL: https://issues.apache.org/jira/browse/YARN-7064
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Miklos Szegedi
>            Assignee: Miklos Szegedi
>         Attachments: YARN-7064.000.patch, YARN-7064.001.patch, 
> YARN-7064.002.patch, YARN-7064.003.patch, YARN-7064.004.patch, 
> YARN-7064.005.patch, YARN-7064.007.patch, YARN-7064.008.patch, 
> YARN-7064.009.patch, YARN-7064.010.patch
>
>
> This is an addendum to YARN-6668. What happens is that that jira always wants 
> to rebase patches against YARN-1011 instead of trunk.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
