[
https://issues.apache.org/jira/browse/YARN-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025985#comment-14025985
]
Carlo Curino commented on YARN-2141:
------------------------------------
This is the product of conversations with [~chris.douglas], [~hitesh],
[~sriramsrao]. For difference reasons we were all looking into having the YARN
infrastructure collect per-container resource consumption information, and
making those available to infrastructure and user code. The use case we were
looking into was to have the scheduler operate in a resource-aware fashion,
lowering pressure on nodes that are struggling. Hitesh was suggesting the
creation of a "top" like interface that allows admins/users to spot resource
hogging jobs.
The best place to effect this, I think, is the ContainerMonitor within the
NodeManager, as this process already loops over all running containers and
walks /proc to gather memory utilization information. I think extending it to
also gather CPU and IO information should be trivial.
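As a concrete sketch of the "extend the /proc walk" idea: on Linux, per-process
IO counters live in /proc/<pid>/io as simple "key: value" lines. The parser
below is illustrative only (class and method names are mine, not from any
patch); it takes the file contents as a String so the logic is self-contained,
whereas a real ContainerMonitor extension would read "/proc/" + pid + "/io" on
each monitoring iteration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: parse the Linux /proc/<pid>/io "key: value" format
// into a map of counters. read_bytes/write_bytes are the counters actually
// charged against storage IO; rchar/wchar also count cache and pipe traffic.
public class ProcIoParser {

  public static Map<String, Long> parse(String procIoContents) {
    Map<String, Long> counters = new HashMap<>();
    for (String line : procIoContents.split("\n")) {
      String[] kv = line.split(":\\s*");
      if (kv.length == 2) {
        counters.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
      }
    }
    return counters;
  }

  public static void main(String[] args) {
    String sample =
        "rchar: 3456\n"
      + "wchar: 1234\n"
      + "read_bytes: 4096\n"
      + "write_bytes: 8192\n";
    Map<String, Long> io = parse(sample);
    System.out.println("read=" + io.get("read_bytes")
        + " write=" + io.get("write_bytes"));
  }
}
```

Per-container aggregation would then sum these counters over the container's
process tree, the same way the existing memory accounting does.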
I had a version that did this hackishly, using ProcessTreeInfo (possibly
extended) and logging straight to HDFS on every iteration of the
ContainerMonitor. The same information was also added to an extra field of the
NodeManager-ResourceManager heartbeat to propagate it to the scheduler for
live use.
{code:title=ContainerMonitorImpl.java|borderStyle=solid}
private void logState(PrintWriter out, ContainerId cid, ProcessTreeInfo pti) {
  if (out != null) {
    if (cid != null && pti != null) {
      // ct is defined in the surrounding monitoring loop (not shown here).
      String s = System.currentTimeMillis() + ","
          + context.getNodeId().getHost() + ","
          + cid.getApplicationAttemptId() + "," + cid.getId() + "," + ct
          + "," + pti.getPID() + "," + pti.getVmemLimit() + ","
          + pti.getPmemLimit();
      if (pti.getProcessTree() != null) {
        s = s + "," + pti.getProcessTree().getCumulativeVmem() + ","
            + pti.getProcessTree().getCumulativeRssmem(1) + ","
            + pti.getProcessTree().getCumulativeCpuTime() + ","
            + pti.getProcessTree().getIORead() + ","
            + pti.getProcessTree().getIOWrite();
      }
      s = s + "\n";
      out.print(s);
    }
  }
}
{code}
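For the heartbeat side, the payload would be a small per-container utilization
record. The class below is purely hypothetical (YARN-2141 fixes no API; every
name here is mine) and only illustrates the kind of fields that would ride the
NodeManager-ResourceManager heartbeat:

```java
// Hypothetical per-container utilization record for the NM-RM heartbeat.
// Field names and units are illustrative, not from any YARN API.
public class ContainerUtilization {
  private final long vmemBytes;     // cumulative virtual memory
  private final long rssBytes;      // cumulative resident set size
  private final long cpuTimeMs;     // cumulative CPU time
  private final long ioReadBytes;   // cumulative bytes read
  private final long ioWriteBytes;  // cumulative bytes written

  public ContainerUtilization(long vmemBytes, long rssBytes, long cpuTimeMs,
                              long ioReadBytes, long ioWriteBytes) {
    this.vmemBytes = vmemBytes;
    this.rssBytes = rssBytes;
    this.cpuTimeMs = cpuTimeMs;
    this.ioReadBytes = ioReadBytes;
    this.ioWriteBytes = ioWriteBytes;
  }

  public long getVmemBytes()    { return vmemBytes; }
  public long getRssBytes()     { return rssBytes; }
  public long getCpuTimeMs()    { return cpuTimeMs; }
  public long getIoReadBytes()  { return ioReadBytes; }
  public long getIoWriteBytes() { return ioWriteBytes; }
}
```

In a real patch this would be a protobuf-backed record so it can travel in the
heartbeat, but the set of fields is the interesting part.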
The advantage of dumping this information to HDFS is that it is likely easy to
process using Hive or the like. This would allow building "models" of task
behavior, which has enormous value for understanding our workloads, driving
infrastructure optimizations, and informing users about their tasks' behavior
(e.g., skew in computation/IO, etc.).
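To make the "easy to process" claim concrete, here is a sketch of reading one
CSV row emitted by the logState() snippet above back into named fields, the
kind of mapping a Hive table definition or an offline analysis job would
perform. Column names are my own labels inferred from the snippet (the meaning
of the "ct" column is not specified there, so it is kept as-is); the last five
utilization columns are optional because logState() appends them only when a
process tree is available.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical row parser for the per-container CSV log. Column order mirrors
// the logState() snippet; names are illustrative labels, not a fixed schema.
public class UtilizationRowParser {

  private static final String[] NAMES = {
    "timestamp", "host", "appAttemptId", "containerId", "ct",
    "pid", "vmemLimit", "pmemLimit",
    // Optional tail, present only when the process tree was available:
    "vmem", "rss", "cpuTime", "ioRead", "ioWrite"
  };

  public static Map<String, String> parse(String row) {
    String[] cols = row.trim().split(",");
    Map<String, String> fields = new LinkedHashMap<>();
    for (int i = 0; i < cols.length && i < NAMES.length; i++) {
      fields.put(NAMES[i], cols[i]);
    }
    return fields;
  }

  public static void main(String[] args) {
    String row = "1402000000000,nm-host-1,appattempt_1_0001_000001,7,"
        + "1401999990000,4242,2147483648,1073741824,"
        + "1500000000,900000000,12345,65536,131072";
    System.out.println(parse(row).get("cpuTime"));
  }
}
```

A Hive external table over the HDFS directory with the same column list would
give ad-hoc SQL access to the same data.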
Hitesh was proposing a much more elegant version built on the Timeline
mechanisms. I don't think there is great value in the code I played with, but
the concept (especially in the version proposed by Hitesh) is, I think, rather
compelling, and would likely enable cool work at both the infrastructure and
user level.
Thoughts? Volunteers?
> Resource consumption logging
> ----------------------------
>
> Key: YARN-2141
> URL: https://issues.apache.org/jira/browse/YARN-2141
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Carlo Curino
> Priority: Minor
>
> Collecting per-container and per-node resource consumption statistics in a
> fairly granular manner, and making them available to both infrastructure code
> (e.g., schedulers) and users (e.g., AMs, or users directly via webapps), can
> facilitate several kinds of performance work.