[
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712210#comment-14712210
]
Varun Saxena commented on YARN-3816:
------------------------------------
[~djp], thanks for the patch. A few comments and questions.
# This pertains to what we are doing in YARN-4053. I see that we will be using
a column qualifier postfix/suffix to identify whether a metric is an aggregated
one or not. In your case, this would mean an OR filter of the form metric=0 OR
metric0=1 while applying metric filters on the reader side. We were thinking of
using a similar scheme to identify a metric as long or double. If we use the
same scheme for long or double, we may end up with 4 ORs for a single metric.
Maybe we can use cell tags for aggregation, or not support mixed data types. cc
[~jrottinghuis].
# IIUC, the TimelineMetric#toAggregate flag would indicate whether a metric is
to be aggregated or not. Maybe in TimelineCollector#aggregateMetrics, we should
do aggregation only if the flag is enabled.
# In TimelineCollector#appendAggregatedMetricsToEntities, is there any reason we
are creating separate TimelineEntity objects for each metric? Maybe create a
single entity containing a set of metrics.
# 3 new maps have been introduced in TimelineCollector and these are used as
the base to calculate the aggregated value. What happens if the daemon crashes?
# In TimelineMetricCalculator some functions have duplicate if conditions for
long.
# In TimelineMetricCalculator#sum, to avoid negative values due to overflow, we
could change conditions like the one below
{code}
if (n1 instanceof Integer){
  return new Integer(n1.intValue() + n2.intValue());
}
{code}
to something like the following?
{code}
if (n1 instanceof Integer){
  // Add as long so the overflow check itself cannot overflow.
  long sum = n1.longValue() + n2.longValue();
  if (sum > Integer.MAX_VALUE || sum < Integer.MIN_VALUE) {
    return new Long(sum);
  } else {
    return new Integer((int) sum);
  }
}
{code}
We need not support up to BigInteger or BigDecimal but, as you said above, we
can throw an exception for unsupported types.
# In TimelineMetric#aggregateTo, maybe use getValues instead of getValuesJAXB?
# Also, I was wondering if TimelineMetric#aggregateTo should be moved to some
util class. TimelineMetric is part of the object model and exposed to the
client. And IIUC aggregateTo won't be called by the client.
# What is EntityColumnPrefix#AGGREGATED_METRICS meant for?
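
To make the overflow suggestion for TimelineMetricCalculator#sum concrete, here is a minimal standalone sketch. The class name MetricSum and its helper methods are hypothetical and are not taken from the patch. It assumes only Integer, Long, Float and Double need to be supported, promotes an Integer sum to Long when it would not fit in an int, and throws for anything else.

```java
// Hypothetical helper for illustration only; not the patch's TimelineMetricCalculator.
final class MetricSum {

  private MetricSum() {
  }

  private static boolean isIntegral(Number n) {
    return n instanceof Integer || n instanceof Long;
  }

  private static boolean isFloating(Number n) {
    return n instanceof Float || n instanceof Double;
  }

  /**
   * Sums two metric values, widening the result type when needed.
   * Assumption: only Integer, Long, Float and Double are supported.
   */
  static Number sum(Number n1, Number n2) {
    if (n1 instanceof Integer && n2 instanceof Integer) {
      // Two ints always fit in a long, so this addition cannot overflow.
      long total = n1.longValue() + n2.longValue();
      if (total > Integer.MAX_VALUE || total < Integer.MIN_VALUE) {
        return Long.valueOf(total);
      }
      return Integer.valueOf((int) total);
    }
    if (isIntegral(n1) && isIntegral(n2)) {
      // Math.addExact throws ArithmeticException instead of wrapping around.
      return Long.valueOf(Math.addExact(n1.longValue(), n2.longValue()));
    }
    if ((isIntegral(n1) || isFloating(n1)) && (isIntegral(n2) || isFloating(n2))) {
      // At least one operand is floating point here, so return a Double.
      return Double.valueOf(n1.doubleValue() + n2.doubleValue());
    }
    throw new IllegalArgumentException("Unsupported metric value types: "
        + n1.getClass().getName() + ", " + n2.getClass().getName());
  }
}
```

With this sketch, MetricSum.sum(Integer.MAX_VALUE, 1) returns a Long rather than a negative Integer, and a Long overflow surfaces as an ArithmeticException instead of a silently wrong value. Whether to fail or saturate on Long overflow is a design choice the patch would have to make.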
> [Aggregation] App-level Aggregation for YARN system metrics
> -----------------------------------------------------------
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: Application Level Aggregation of Timeline Data.pdf,
> YARN-3816-YARN-2928-v1.patch, YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application level aggregation of Timeline data:
> - To present end user aggregated states for each application, include:
> resource (CPU, Memory) consumption across all containers, number of
> containers launched/completed/failed, etc. We need this for apps while they
> are running as well as when they are done.
> - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be
> aggregated to show details of states in framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient when
> based on application-level aggregations rather than raw entity-level data, as
> far fewer rows need to be scanned (after filtering out non-aggregated
> entities such as events, configurations, etc.).