Carlo Curino commented on YARN-2141:

This is the product of conversations with [~chris.douglas], [~hitesh], 
[~sriramsrao]. For different reasons we were all looking into having the YARN 
infrastructure collect per-container resource consumption information and 
make it available to infrastructure and user code. The use case we were 
looking into was to have the scheduler operate in a resource-aware fashion, 
lowering pressure on nodes that are struggling. Hitesh was suggesting the 
creation of a "top"-like interface that allows admins/users to spot 
resource-hogging jobs.

I think the best place to implement this is the ContainerMonitor within the 
NodeManager, as it is already looping over all running containers 
and walking /proc to gather memory utilization information. I think extending 
this to also gather CPU and IO information should be trivial. 
I had a version doing this hackishly by using the ProcessTreeInfo (possibly 
extended) and logging straight to HDFS on every iteration of the 
ContainerMonitor. The same information was also added to an extra field of the 
NodeManager-ResourceManager heartbeat to propagate it to the
scheduler for live use. 
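
To make the "walking /proc" part concrete, here is a minimal sketch of how CPU and IO counters could be read for a single process. It is not the code I had: the class and method names (ProcParsers, parseIo, parseCpuTicks) are made up for illustration, and it only assumes the standard Linux layout of /proc/<pid>/io ("key: value" lines) and /proc/<pid>/stat (utime and stime as fields 14 and 15). The methods take the file contents as strings, so reading the files and summing over the process tree is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of YARN. */
final class ProcParsers {

  /** Parses the "key: value" lines found in /proc/<pid>/io. */
  static Map<String, Long> parseIo(String content) {
    Map<String, Long> counters = new HashMap<>();
    for (String line : content.split("\n")) {
      int colon = line.indexOf(':');
      if (colon < 0) {
        continue; // not a "key: value" line
      }
      try {
        counters.put(line.substring(0, colon).trim(),
            Long.parseLong(line.substring(colon + 1).trim()));
      } catch (NumberFormatException e) {
        // skip lines whose value is not a plain integer
      }
    }
    return counters;
  }

  /** Returns utime + stime (in clock ticks) from the content of /proc/<pid>/stat. */
  static long parseCpuTicks(String statLine) {
    // Field 2 (comm) is wrapped in parentheses and may contain spaces,
    // so split only what follows the last ')'.
    int close = statLine.lastIndexOf(')');
    String[] f = statLine.substring(close + 1).trim().split("\\s+");
    // f[0] is field 3 (state), so field n is f[n - 3]:
    // utime (field 14) is f[11], stime (field 15) is f[12].
    return Long.parseLong(f[11]) + Long.parseLong(f[12]);
  }
}
```

The values are cumulative counters, so a monitor would diff two consecutive samples to get a rate per interval.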

{code:title=ContainerMonitorImpl.java|borderStyle=solid}
    // Appends one CSV row with a container's limits and current usage.
    private void logState(PrintWriter out, ContainerId cid,
        ProcessTreeInfo pti) {
      if (out != null) {
        if (cid != null && pti != null) {
          // "ct" is defined elsewhere in the class (not shown in this excerpt).
          String s = System.currentTimeMillis() + ","
              + context.getNodeId().getHost() + ","
              + cid.getApplicationAttemptId() + "," + cid.getId() + "," + ct
              + "," + pti.getPID() + "," + pti.getVmemLimit() + ","
              + pti.getPmemLimit();

          if (pti.getProcessTree() != null) {
            // getIORead()/getIOWrite() are presumably part of the extension
            // mentioned above.
            s = s + "," + pti.getProcessTree().getCumulativeVmem() + ","
                + pti.getProcessTree().getCumulativeRssmem(1) + ","
                + pti.getProcessTree().getCumulativeCpuTime() + ","
                + pti.getProcessTree().getIORead() + ","
                + pti.getProcessTree().getIOWrite();
          }
          s = s + "\n";
          // Assumed completion: the excerpt is cut off after the line above.
          out.print(s);
        }
      }
    }
{code}

The advantage of dumping this information to HDFS is that it is likely easy to 
process using Hive or the like. This would allow us to build "models" of 
task behavior, which is very valuable for understanding our workloads, driving 
infrastructure optimizations, and informing users about their task behavior 
(e.g., skew in computations/IO, etc.). 
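
As a rough illustration of the kind of processing meant here (in practice this would more likely be a Hive query), the sketch below computes the peak RSS per container from the logged rows. The class name ConsumptionLogSummary and the column positions are assumptions: the positions simply follow the order in which logState() concatenates its fields, and the sketch assumes no field contains a comma.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of YARN. */
final class ConsumptionLogSummary {
  // Column positions assumed from the concatenation order in logState():
  // 0 timestamp, 1 host, 2 app attempt, 3 container id, 4 ct, 5 pid,
  // 6 vmem limit, 7 pmem limit, 8 vmem, 9 rss, 10 cpu time, 11 io read,
  // 12 io write.
  static final int APP_ATTEMPT = 2;
  static final int CONTAINER_ID = 3;
  static final int RSS = 9;

  /** Maps "appAttempt/containerId" to the largest RSS value seen. */
  static Map<String, Long> peakRssByContainer(List<String> lines) {
    Map<String, Long> peak = new HashMap<>();
    for (String line : lines) {
      String[] f = line.trim().split(",");
      if (f.length <= RSS) {
        continue; // row logged without the process-tree columns
      }
      long rss;
      try {
        rss = Long.parseLong(f[RSS]);
      } catch (NumberFormatException e) {
        continue; // malformed row
      }
      peak.merge(f[APP_ATTEMPT] + "/" + f[CONTAINER_ID], rss, Math::max);
    }
    return peak;
  }
}
```

Sorting the resulting map by value would give a crude version of the "top"-like view of resource-hogging containers.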

Hitesh was proposing a much more elegant version using the Timeline mechanisms.  
I don't think there is great value in the code I played with, but the concept 
(especially in the version proposed by Hitesh) I think is rather compelling,
and would likely enable cool work at the infrastructure and user level. 

Thoughts? Volunteers?

> Resource consumption logging
> ----------------------------
>                 Key: YARN-2141
>                 URL: https://issues.apache.org/jira/browse/YARN-2141
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Carlo Curino
>            Priority: Minor
> Collecting per-container and per-node resource consumption statistics in a 
> fairly granular manner, and making them available to both infrastructure code 
> (e.g., schedulers) and users (e.g., AMs or directly users via webapps), can 
> facilitate several kinds of performance work. 
