[
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231626#comment-15231626
]
Sangjin Lee commented on YARN-3816:
-----------------------------------
Onto code-level comments...
First, there seem to be checkstyle violations and javadoc errors. Could you
please fix them?
(RealTimeAggregationOperation.java)
- As mentioned in the above comment, this really appears to be about
"accumulation". We should rename things here to "accumulation".
- l.36: We don’t need to update {{state}} for MAX? Could you explain how
{{state}} is supposed to be used?
- I don’t think I understand {{SUM.exec()}}. Maybe some comment in the code (or
a JIRA comment) could be helpful.
- l.116: There is no need for a separate interface ({{Operation}}). The
{{exec()}} method can simply belong in {{RealTimeAggregationOperation}} itself.
(TimelineMetric.java)
- l.105: This is an unrelated issue with this patch, but I’m not sure what’s
going on with the else clause in l.104-106 in the {{setValues()}} method. Could
you look at it and fix it if it is not right?
- l.183: we should use {{StringBuilder}} (unsynchronized) over {{StringBuffer}}
(synchronized)
- l.191: I would say use “get” instead of “retrieve” for these method names...
- l.192: nit: since this is an enum, {{==}} is sufficient (no need for
{{equals()}}); the same for l.206 and 220
- l.196: It should be {{firstKey()}} because it’s reverse sorted, right? We’re
looking for the latest timestamp.
- l.205: the name “key” is bit obscure. What we mean is the timestamp for the
value. Should we rename this to {{getSingleDataTimestamp()}}?
(TimelineMetricCalculator.java)
- l.38: typo: “Number be compared” -> “Number to be compared”. The same with
l.71
- l.41: nit: need a space before the opening brace
- l.76: same as above
- l.68: We stated that we will support only longs as the metric value type for
now (and maybe double later). In any case, I think it’s safe to say we need not
support ints. Should we simplify this by casting ints to longs if we see them?
- l.109: do we need to check for both being null?
- l.145: I think we should check to ensure time > 0. Also, it might be easier
if we specify time as {{long}} instead of {{Long}}.
- l.151: wouldn’t it be easier if we called {{sum()}} to handle the summation
part instead of implementing the summing logic here again?
- l.194: nit: space before the brace
(TimelineCollector.java)
- l.59-69: nit: let’s group all statics at the beginning and place instance
members after them
- the executor should be shut down properly in {{serviceStop()}}, or it will
leave those threads hanging around
- l.129: nit: we don’t need to specify {{TimelineCollector}} in calling the
static methods (in several places here)
- l.218: nit: let’s surround it with {{LOG.isDebugEnabled()}}
- l.237-241: This is bit of an anti-pattern for using a {{ConcurrentHashMap}}.
The issue is if multiple threads find that {{aggrRow}} is null and try to put
their copies to the {{aggregateTable}} map, there is a race. As a result, you
may start operating on an instance that will not be stored in the map
eventually. You should use the {{putIfAbsent()}} method to make sure multiple
threads always agree on the stored instance after the operation.
- l.247: nit: let’s use ==
- l.258: nit: let’s use ==
(TimelineReaderWebServices.java)
- Are the imports needed? There are no other code changes in this file?
> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Junping Du
> Assignee: Li Lu
> Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf,
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch,
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch,
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch,
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch,
> YARN-3816-YARN-2928-v5.patch, YARN-3816-feature-YARN-2928.v4.1.patch,
> YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application level aggregation of Timeline data:
> - To present end user aggregated states for each application, include:
> resource (CPU, Memory) consumption across all containers, number of
> containers launched/completed/failed, etc. We need this for apps while they
> are running as well as when they are done.
> - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be
> aggregated to show details of states in framework level.
> - Other level (Flow/User/Queue) aggregation can be more efficient to be based
> on Application-level aggregations rather than raw entity-level data as much
> less raws need to scan (with filter out non-aggregated entities, like:
> events, configurations, etc.).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)