[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612591#comment-14612591
 ] 

Junping Du commented on YARN-3815:
----------------------------------

Thanks [~sjlee0] for the nice writeup of the discussions.
It looks good to me for the most part. Some comments on app-level aggregation:

bq. Framework‐specific metrics will be sent to the per‐app collector aggregated 
by the AM itself.
We may want to provide two ways here:
- For legacy applications like MR, the AM already aggregates these 
counters itself.
- For new applications built against YARN after timeline service v2, the AM 
can delegate aggregation to the YARN timeline service instead of doing it 
itself. Our data model and aggregation mechanism should ensure that the YARN 
timeline service can aggregate these framework-specific metrics without their 
being predefined.
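
To illustrate the second option, here is a minimal sketch of aggregation that 
does not depend on predefined metric names. It assumes each container reports 
a plain name-to-value map and that summation is the right aggregation for 
every metric; a real design would likely need per-metric operations (max, 
average, and so on). The function name and the sample metric names are 
hypothetical and not part of any YARN API.

```python
from collections import defaultdict


def aggregate_metrics(container_reports):
    """Sum metrics by name across per-container reports.

    No metric names are declared up front: whatever names show up in the
    reports are aggregated. Summation is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for report in container_reports:
        for name, value in report.items():
            totals[name] += value
    return dict(totals)


# Hypothetical reports from three containers of one application.
reports = [
    {"MAP_INPUT_RECORDS": 1000, "SPILLED_RECORDS": 20},
    {"MAP_INPUT_RECORDS": 1500},
    {"CUSTOM_ROWS_JOINED": 7},  # a framework-specific metric nobody predefined
]
app_totals = aggregate_metrics(reports)
```

The point of the sketch is only that the aggregator discovers metric names 
from the data, so a new framework can emit its own metrics without a schema 
change.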

bq. time average & max: the average multiplied by the elapsed time of the 
application represents the total resource usage over time.
This approach sounds very clever. In addition, if we need the resource 
consumption for any time window (t1, t2), we can simply compute 
Avg(t2) * t2 - Avg(t1) * t1, where Avg(t) is the time average up to elapsed 
time t. This is much better than aggregating the values at each point in time 
when the query runs.
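
A small worked sketch of the window calculation above, assuming Avg(t) is the 
time-weighted average since application start (t = 0) and that the metric is 
piecewise constant between samples. The function names and sample values are 
illustrative only and not part of the timeline service.

```python
def time_average(samples, t):
    """Time-weighted average of a piecewise-constant metric over [0, t].

    samples: list of (start_time, value) pairs sorted by start_time, with the
    first sample at time 0; each value holds until the next sample starts.
    """
    if t <= 0:
        return 0.0
    total = 0.0
    for i, (start, value) in enumerate(samples):
        if start >= t:
            break
        end = samples[i + 1][0] if i + 1 < len(samples) else t
        total += value * (min(end, t) - start)
    return total / t


def usage_in_window(avg_t1, t1, avg_t2, t2):
    """Resource usage within (t1, t2] from two stored running averages."""
    return avg_t2 * t2 - avg_t1 * t1


# Hypothetical memory usage in MB: 100 for the first 10 s, then 300.
samples = [(0, 100.0), (10, 300.0)]
avg_10 = time_average(samples, 10)   # 100.0
avg_20 = time_average(samples, 20)   # (100*10 + 300*10) / 20 = 200.0
window_usage = usage_in_window(avg_10, 10, avg_20, 20)  # 3000.0 MB-seconds
```

Only the two stored averages and their timestamps are needed at query time; 
the raw samples do not have to be rescanned.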


> [Aggregation] Application/Flow/User/Queue Level Aggregations
> ------------------------------------------------------------
>
>                 Key: YARN-3815
>                 URL: https://issues.apache.org/jira/browse/YARN-3815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf, aggregation-design-discussion.pdf, 
> hbase-schema-proposal-for-aggregation.pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is the query for stats can happen on:
> - Application level, expect return: an application with aggregated stats
> - Flow level, expect return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expect return: aggregated stats for applications submitted by 
> user
> - Queue level, expect return: aggregated stats for applications within the 
> Queue
> Application states is the basic building block for all other level 
> aggregations. We can provide Flow/User/Queue level aggregated statistics info 
> based on application states (a dedicated table for application states is 
> needed which is missing from previous design documents like HBase/Phoenix 
> schema design). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
