[
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716497#comment-14716497
]
Varun Saxena commented on YARN-3816:
------------------------------------
Thanks [~djp] for the replies.
bq. This will be part of failed over JIRAs
Ok.
bq. I would prefer to use TreeMap because it keeps the keys (timestamps) sorted.
The aggregateTo() algorithm assumes metrics are sorted by timestamp.
Hmm... Both getValues and getValuesJAXB return the same map, but I didn't notice
the difference in their return types. So I would have to typecast the return
value of getValues to use methods specific to TreeMap. In that case, I guess
it's fine to use getValuesJAXB.
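
To illustrate the point about return types, here is a minimal, self-contained sketch. The Metric class below is a hypothetical stand-in, not the real TimelineMetric; it only assumes, as described above, that both getters hand back the same TreeMap-backed object and differ solely in their declared return type.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetricValuesSketch {

  // Hypothetical stand-in for TimelineMetric; only the two getters
  // discussed in this comment are modelled.
  public static class Metric {
    private final TreeMap<Long, Number> values = new TreeMap<>();

    public void addValue(long timestamp, Number value) {
      values.put(timestamp, value);
    }

    // Declared as Map: callers cannot use TreeMap-specific methods directly.
    public Map<Long, Number> getValues() {
      return values;
    }

    // Declared as TreeMap: same object, sorted-map methods are available.
    public TreeMap<Long, Number> getValuesJAXB() {
      return values;
    }
  }

  // Latest timestamp through the Map-typed getter: a downcast is required.
  public static long latestViaCast(Metric m) {
    return ((TreeMap<Long, Number>) m.getValues()).lastKey();
  }

  // Latest timestamp through the TreeMap-typed getter: no cast needed.
  public static long latestDirect(Metric m) {
    return m.getValuesJAXB().lastKey();
  }

  public static void main(String[] args) {
    Metric m = new Metric();
    // Insert out of order; the TreeMap keeps keys sorted by timestamp.
    m.addValue(300L, 7);
    m.addValue(100L, 5);
    m.addValue(200L, 6);
    System.out.println(m.getValuesJAXB().keySet());
    System.out.println(latestViaCast(m) == latestDirect(m));
  }
}
```

The cast in latestViaCast only works as long as the backing map really is a TreeMap, which is why using the TreeMap-typed getter looks like the safer choice here.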
bq. aggregateTo is not as straightforward and generically useful as the methods
in TimelineMetricCalculator, so let's hold off on exposing it in a utility class
for now. Making it static sounds good though.
Ok.
I had one more question which you missed.
The TimelineMetric#toAggregate flag is meant to indicate whether a metric needs
to be aggregated. Are we also planning to use it to indicate that a metric is
itself an aggregated metric? If yes, we should probably set this flag for each
metric processed in TimelineCollector#appendAggregatedMetricsToEntities.
As Li said above, will we be differentiating aggregated metrics from
non-aggregated ones?
> [Aggregation] App-level Aggregation for YARN system metrics
> -----------------------------------------------------------
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: Application Level Aggregation of Timeline Data.pdf,
> YARN-3816-YARN-2928-v1.patch, YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including:
> resource (CPU, Memory) consumption across all containers, number of
> containers launched/completed/failed, etc. We need this for apps while they
> are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be
> aggregated to show details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient when
> based on application-level aggregations rather than raw entity-level data, as
> far fewer rows need to be scanned (after filtering out non-aggregated
> entities such as events, configurations, etc.).