Sangjin Lee commented on YARN-3815:

We may consider providing two ways here:
- For legacy applications like MR, the AM has already done the aggregation on 
these counters itself.
- For new applications built against YARN after timeline service v2, the AM can 
delegate aggregation to the YARN timeline service instead of doing it itself. 
Our data model and aggregation mechanism should ensure that the YARN timeline 
service can aggregate these framework-specific metrics without their being 
predefined.

I think it's a little more complicated than that. If a new YARN application 
wants to delegate aggregation to the YARN timeline service, it still needs to 
do at least the following:
- add the framework-specific metrics to the YARN container
- do *not* add any of those metrics to the YARN application

The framework-specific metrics set on the containers would still be transmitted 
by the AM (not by the node managers). Then, the YARN timeline service could 
look at *any* container metrics and apply the uniform aggregation rules.
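
As a rough illustration of that flow, here is a minimal sketch. It is not the timeline service API: the `ContainerEntity` class, the `aggregate_app_metrics` function, the metric names, and the choice of a plain sum as the "uniform rule" are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContainerEntity:
    """Hypothetical stand-in for a container entity written by the AM."""
    container_id: str
    metrics: dict = field(default_factory=dict)


def aggregate_app_metrics(containers):
    """Sum every metric found on any container.

    No list of metric names is predefined: whatever the AM attached to
    the containers is picked up and rolled into the app-level totals.
    """
    totals = defaultdict(int)
    for container in containers:
        for name, value in container.metrics.items():
            totals[name] += value
    return dict(totals)


# Hypothetical framework-specific metrics set on two containers.
containers = [
    ContainerEntity("container_1", {"MAP_INPUT_RECORDS": 100, "CPU_MS": 40}),
    ContainerEntity("container_2", {"MAP_INPUT_RECORDS": 250, "CPU_MS": 60}),
]
app_metrics = aggregate_app_metrics(containers)
print(app_metrics)
```

The point of the sketch is that the aggregator never needs to know the metric names in advance, which is why the AM must put the metrics on the containers and leave the application entity alone.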

Hopefully YARN apps can add metric values to container entities (there should 
be a natural mapping from unit of work to containers); otherwise this approach 
won't work for them.

I think it is pretty natural and straightforward for AMs to aggregate and 
retain values at the app level, but even if they set them at the container 
level, it could work.

On the other hand, if your app wants to own aggregation, then it should not set 
the metrics on the containers; otherwise the aggregation would be done twice, 
once by the AM and once by the timeline service.

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> ------------------------------------------------------------
>                 Key: YARN-3815
>                 URL: https://issues.apache.org/jira/browse/YARN-3815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf, aggregation-design-discussion.pdf, 
> hbase-schema-proposal-for-aggregation.pdf
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is the query for stats can happen on:
> - Application level, expect return: an application with aggregated stats
> - Flow level, expect return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expect return: aggregated stats for applications submitted by 
> user
> - Queue level, expect return: aggregated stats for applications within the 
> Queue
> Application states is the basic building block for all other level 
> aggregations. We can provide Flow/User/Queue level aggregated statistics info 
> based on application states (a dedicated table for application states is 
> needed which is missing from previous design documents like HBase/Phoenix 
> schema design). 
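
To make the "building block" idea in the description concrete, here is a minimal sketch of rolling application-level stats up to the flow, user, or queue level. The row layout, the `roll_up` function, the sample values, and the use of a plain sum are assumptions for illustration only; they do not reflect the proposed HBase/Phoenix schema.

```python
from collections import defaultdict

# Hypothetical application-level rows; field names are illustrative.
apps = [
    {"app": "app_1", "flow": "etl", "user": "alice", "queue": "prod",
     "metrics": {"CPU_MS": 100}},
    {"app": "app_2", "flow": "etl", "user": "bob", "queue": "prod",
     "metrics": {"CPU_MS": 50}},
    {"app": "app_3", "flow": "report", "user": "alice", "queue": "dev",
     "metrics": {"CPU_MS": 25}},
]


def roll_up(apps, level):
    """Group application stats by `level` ("flow", "user", or "queue") and sum them."""
    totals = defaultdict(lambda: defaultdict(int))
    for app in apps:
        for name, value in app["metrics"].items():
            totals[app[level]][name] += value
    # Convert nested defaultdicts to plain dicts for a stable result.
    return {key: dict(metrics) for key, metrics in totals.items()}


by_flow = roll_up(apps, "flow")
by_user = roll_up(apps, "user")
by_queue = roll_up(apps, "queue")
print(by_flow, by_user, by_queue)
```

Every higher level is derived from the same application-level rows, which is why a dedicated table for those rows matters.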
