[ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246736#comment-15246736
 ] 

Sangjin Lee commented on YARN-3816:
-----------------------------------

Thanks for updating the patch [~gtCarrera9]!

It appears that the unit test failure is caused by the patch. Could you please 
resolve it? The checkstyle violations and javadoc errors are also related to 
the patch.

The latest patch looks good for the most part (minus the issues mentioned 
above). The only thing that still gives me pause is the name of the 
aggregated metrics on the application. The current patch will produce metrics 
with names such as "MEMORY_YARN_CONTAINER". As I mentioned in a previous 
comment, I understand the rationale behind it (deduping). However, I wonder if 
that is the best decision going forward. An alternative would be to limit the 
entity-to-app aggregation to YARN containers and drop the entity type. One of 
the reasons I think that might be acceptable is that per-framework metrics 
can be handled by AMs outside the context of this generic aggregation (see 
[above|https://issues.apache.org/jira/browse/YARN-3816?focusedCommentId=15238407&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15238407]).
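
To make the two naming options concrete, here is a rough sketch. It is not 
code from the patch: the class, the Metric holder, and the FRAMEWORK_TASK 
entity type are hypothetical, and summation is only assumed as the 
aggregation operation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; none of these names come from the YARN-3816 patch.
public class AppAggregationSketch {

    // Hypothetical holder for one metric value reported by one entity.
    static final class Metric {
        final String entityType;
        final String name;
        final long value;

        Metric(String entityType, String name, long value) {
            this.entityType = entityType;
            this.name = name;
            this.value = value;
        }
    }

    // Option A (roughly what the current patch produces): aggregate metrics
    // from every entity type and append the entity type to the metric name
    // so that same-named metrics from different types do not collide.
    static Map<String, Long> aggregateWithTypeSuffix(List<Metric> metrics) {
        Map<String, Long> appLevel = new LinkedHashMap<>();
        for (Metric m : metrics) {
            // Sum is an assumed aggregation operation.
            appLevel.merge(m.name + "_" + m.entityType, m.value, Long::sum);
        }
        return appLevel;
    }

    // Option B (the alternative): aggregate only YARN container entities
    // and keep the original metric name, since no deduping is needed.
    static Map<String, Long> aggregateContainersOnly(List<Metric> metrics) {
        Map<String, Long> appLevel = new LinkedHashMap<>();
        for (Metric m : metrics) {
            if ("YARN_CONTAINER".equals(m.entityType)) {
                appLevel.merge(m.name, m.value, Long::sum);
            }
        }
        return appLevel;
    }
}
```

With option A, two containers reporting MEMORY and one framework entity 
reporting MEMORY yield two application-level metrics (MEMORY_YARN_CONTAINER 
and MEMORY_FRAMEWORK_TASK). With option B the same input yields a single 
MEMORY metric covering the containers only, and the framework metric would 
be left to the AM.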

Would there be a compelling case where an entity-to-app aggregation needs to be 
done for entities other than YARN containers?

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: yarn-2928-1st-milestone
>         Attachments: Application Level Aggregation of Timeline Data.pdf, 
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, 
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, 
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, 
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, 
> YARN-3816-YARN-2928-v5.patch, YARN-3816-YARN-2928-v6.patch, 
> YARN-3816-YARN-2928-v7.patch, YARN-3816-feature-YARN-2928.v4.1.patch, 
> YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including 
> resource (CPU, memory) consumption across all containers, number of 
> containers launched/completed/failed, etc. We need this for apps while they 
> are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be 
> aggregated to show details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient when 
> based on application-level aggregations rather than raw entity-level data, 
> as far fewer rows need to be scanned (after filtering out non-aggregated 
> entities such as events and configurations).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
