[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596129#comment-14596129
 ] 

Junping Du commented on YARN-3815:
----------------------------------

Thanks [~sjlee0] and [~jrottinghuis] for the review and the detailed comments. 
[~jrottinghuis]'s comments are pretty long, so I could only reply to part of 
them today; I will finish the remaining parts tomorrow. :)

bq. For framework-specific metrics, I would say this falls on the individual 
frameworks. The framework AM usually already aggregates them in memory 
(consider MR job counters for example). So for them it is straightforward to 
write them out directly onto the YARN app entities. Furthermore, it is 
problematic to add them to the sub-app YARN entities and ask YARN to aggregate 
them to the application. Framework’s sub-app entities may not even align with 
YARN’s sub-app entities. For example, in case of MR, there is a reasonable 
one-to-one mapping between a mapper/reducer task attempt and a container, but 
for other applications that may not be true. Forcing all frameworks to hang 
values at containers may not be practical. I think it’s far easier for 
frameworks to write aggregated values to the YARN app entities.
The AM currently leverages YARN's AppTimelineCollector to forward entities to 
the backend storage, so making the AM talk directly to the backend storage is 
not considered safe. It is also not necessary, because the real difficulty 
here is aggregating framework-specific metrics at the other levels (flow, user 
and queue): that goes beyond the life cycle of the framework, so YARN has to 
take care of it. Instead of asking frameworks to handle their specific metrics 
themselves, I would like to propose treating these metrics as "anonymous": the 
framework would pass both the metric name and the value to YARN's collector, 
and the collector could aggregate them and store them as dynamic columns 
(under a framework_specific_metrics column family) in the app states table. 
Other (flow, user, etc.) level aggregations on framework metrics could then 
happen based on this.
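To illustrate the "anonymous" idea, here is a minimal in-memory sketch of what 
the collector-side accumulation could look like before the totals are written 
out as dynamic columns. Class and method names are made up for illustration, 
not from the YARN code base; the metric names are hypothetical examples.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: the collector does not need to understand framework-specific
// metric names; it just sums values keyed by metric name. Each entry in the
// snapshot would map to one dynamic column (qualifier = metric name) under
// the framework_specific_metrics column family.
public class AnonymousMetricAggregator {
    private final Map<String, Long> totals = new HashMap<>();

    // Called for each entity the AM forwards through the app-level collector.
    public void accumulate(String metricName, long value) {
        totals.merge(metricName, value, Long::sum);
    }

    // The aggregated values to be stored in the app states table.
    public Map<String, Long> snapshotForAppStatesTable() {
        return new HashMap<>(totals);
    }

    public static void main(String[] args) {
        AnonymousMetricAggregator agg = new AnonymousMetricAggregator();
        agg.accumulate("MAP_INPUT_RECORDS", 100);  // e.g. an MR job counter
        agg.accumulate("MAP_INPUT_RECORDS", 50);
        agg.accumulate("CUSTOM_FRAMEWORK_METRIC", 7);
        System.out.println(agg.snapshotForAppStatesTable());
    }
}
```

Because the collector only sees opaque (name, value) pairs, flow/user/queue 
level aggregation can reuse exactly the same summation without knowing 
anything about the framework.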

bq. app-to-flow online aggregation. This is more or less live aggregated 
metrics at the flow level. This will still be based on the native HBase schema.
About flow online aggregation, I am not quite sure about the requirement yet. 
Do we really want real-time flow aggregated data, or would some fine-grained 
time interval (like 15 seconds) be good enough? If the goal is to show a nice 
metrics chart for a flow, an interval should be fine. Even for real time, we 
don't have to aggregate everything from the raw entity table; in particular, 
we don't have to count the metrics of finished apps again. Right?
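The point about not re-counting finished apps can be sketched like this: the 
flow-level aggregate keeps a frozen total for completed applications and, at 
each interval, only re-reads the metrics of still-running apps. The class and 
method names below are illustrative, not from the YARN code base.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of incremental flow aggregation: a finished app's contribution is
// folded into a frozen total exactly once; each aggregation interval then
// only has to sum the latest values of running apps.
public class FlowAggregate {
    private long finishedTotal = 0;                 // counted once, forever
    private final Map<String, Long> runningApps = new HashMap<>();

    // Latest metric value reported for a still-running app (overwrite, not add).
    public void updateRunning(String appId, long latestValue) {
        runningApps.put(appId, latestValue);
    }

    // Freeze the app's final value into the finished total.
    public void markFinished(String appId) {
        Long v = runningApps.remove(appId);
        if (v != null) {
            finishedTotal += v;
        }
    }

    // The value published at each aggregation interval (e.g. every 15s).
    public long currentTotal() {
        long running = 0;
        for (long v : runningApps.values()) {
            running += v;
        }
        return finishedTotal + running;
    }
}
```

With this shape, the per-interval cost is proportional to the number of 
running apps in the flow, not to the flow's whole history.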

bq. (3) time-based flow aggregation: This is different than the online 
aggregation in the sense that it is aggregated along the time boundary (e.g. 
“daily”, “weekly”, etc.). This can be based on the Phoenix schema. This can be 
populated in an offline fashion (e.g. running a mapreduce job).
Any special reason not to handle it in the same way as above, i.e. as an HBase 
coprocessor? It just sounds like a coarser-grained time interval, doesn't it?
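Seen that way, the "daily"/"weekly" aggregation differs from the online path 
only in bucket size: each metric timestamp is truncated to its time boundary 
and values are summed per bucket. A tiny sketch of the bucketing step (helper 
name is hypothetical):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Sketch: map a metric timestamp to the start of its UTC day, so that all
// values in the same day fold into one aggregation row. A weekly or hourly
// boundary would only change the truncation unit.
public class TimeBucket {
    // Returns the epoch millis of the start of the UTC day containing tsMillis.
    public static long dailyBucket(long tsMillis) {
        return Instant.ofEpochMilli(tsMillis)
                .atOffset(ZoneOffset.UTC)
                .truncatedTo(ChronoUnit.DAYS)
                .toInstant()
                .toEpochMilli();
    }
}
```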

bq. This is another “offline” aggregation type. Also, I believe we’re talking 
about only time-based aggregation. In other words, we would aggregate values 
for users only with a well-defined time window. There won’t be a “real-time” 
aggregation of values, similar to the flow aggregation.
I would also call for a fine-grained time interval (close to real time) here, 
because the aggregated resource metrics per user could be used for billing 
Hadoop usage in a shared environment (whether a private or public cloud), so 
users need more detailed visibility into resource consumption, especially 
during random peak times.

bq. Very much agree with separation into 2 categories "online" versus 
"periodic". I think this will be natural split between the native HBase tables 
for the former and the Phoenix approach for the latter to each emphasize their 
relative strengths.
I would question the necessity of "online" again if it means "real time" 
rather than a fine-grained time interval. Actually, as a building block, every 
container metric (CPU, memory, etc.) is generated at a time interval rather 
than in real time. As a result, we can never know the exact snapshot of the 
whole system at a precise moment; we can only try to get closer.


> [Aggregation] Application/Flow/User/Queue Level Aggregations
> ------------------------------------------------------------
>
>                 Key: YARN-3815
>                 URL: https://issues.apache.org/jira/browse/YARN-3815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is the query for stats can happen on:
> - Application level, expect return: an application with aggregated stats
> - Flow level, expect return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expect return: aggregated stats for applications submitted by 
> user
> - Queue level, expect return: aggregated stats for applications within the 
> Queue
> Application state is the basic building block for all other level 
> aggregations. We can provide Flow/User/Queue level aggregated statistics 
> based on application states (a dedicated table for application states is 
> needed, which is missing from previous design documents like the 
> HBase/Phoenix schema design). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
