[
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246736#comment-15246736
]
Sangjin Lee commented on YARN-3816:
-----------------------------------
Thanks for updating the patch [~gtCarrera9]!
It appears that the unit test failure is caused by the patch. Could you please
resolve it? The checkstyle violations and javadoc errors also appear to be
related to the patch.
The latest patch looks good for the most part (minus the issues mentioned
above). The only thing that still gives me pause is the name of the
aggregated metrics on the application. The current patch will produce metrics
with names such as "MEMORY_YARN_CONTAINER". As I mentioned in a previous
comment, I understand the rationale behind it (deduping). However, I wonder if
that is the best decision going forward. An alternative would be to limit the
entity-to-app aggregation to YARN containers and drop the entity type. One of
the reasons I think that might be acceptable is that per-framework metrics
can be handled by AMs outside the context of this generic aggregation (see
[above|https://issues.apache.org/jira/browse/YARN-3816?focusedCommentId=15238407&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15238407]).
Would there be a compelling case where an entity-to-app aggregation needs to be
done for entities other than YARN containers?
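
To make the two naming options concrete, here is a minimal sketch. It is not
code from the patch: the Entity class, the method names, the "FRAMEWORK_TASK"
entity type, and the use of a plain sum as the aggregation operation are all
assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AggregationNamingSketch {

  /** Hypothetical stand-in for a timeline entity: a type plus metric values. */
  static final class Entity {
    final String type;
    final Map<String, Long> metrics;

    Entity(String type, Map<String, Long> metrics) {
      this.type = type;
      this.metrics = metrics;
    }
  }

  /**
   * Option A (assumed to mirror the current naming): aggregate entities of
   * every type and append the entity type to the metric name so that
   * same-named metrics from different entity types do not collide.
   */
  static Map<String, Long> aggregateWithTypeSuffix(List<Entity> entities) {
    Map<String, Long> app = new LinkedHashMap<>();
    for (Entity e : entities) {
      for (Map.Entry<String, Long> m : e.metrics.entrySet()) {
        app.merge(m.getKey() + "_" + e.type, m.getValue(), Long::sum);
      }
    }
    return app;
  }

  /**
   * Option B (the alternative raised above): aggregate YARN containers only,
   * which lets the app-level metric keep its original name.
   */
  static Map<String, Long> aggregateContainersOnly(List<Entity> entities) {
    Map<String, Long> app = new LinkedHashMap<>();
    for (Entity e : entities) {
      if (!"YARN_CONTAINER".equals(e.type)) {
        continue; // other entity types are left to the AM / framework
      }
      for (Map.Entry<String, Long> m : e.metrics.entrySet()) {
        app.merge(m.getKey(), m.getValue(), Long::sum);
      }
    }
    return app;
  }

  /** Two containers plus one hypothetical framework-level entity. */
  static List<Entity> sampleEntities() {
    List<Entity> entities = new ArrayList<>();
    entities.add(new Entity("YARN_CONTAINER",
        Collections.singletonMap("MEMORY", 1024L)));
    entities.add(new Entity("YARN_CONTAINER",
        Collections.singletonMap("MEMORY", 2048L)));
    entities.add(new Entity("FRAMEWORK_TASK",
        Collections.singletonMap("MEMORY", 512L)));
    return entities;
  }

  public static void main(String[] args) {
    List<Entity> entities = sampleEntities();
    System.out.println("Option A: " + aggregateWithTypeSuffix(entities));
    System.out.println("Option B: " + aggregateContainersOnly(entities));
  }
}
```

With option A the application ends up with "MEMORY_YARN_CONTAINER" and
"MEMORY_FRAMEWORK_TASK"; with option B it ends up with a single "MEMORY"
metric that covers containers only.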
> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Junping Du
> Assignee: Li Lu
> Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf,
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch,
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch,
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch,
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch,
> YARN-3816-YARN-2928-v5.patch, YARN-3816-YARN-2928-v6.patch,
> YARN-3816-YARN-2928-v7.patch, YARN-3816-feature-YARN-2928.v4.1.patch,
> YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application level aggregation of Timeline data:
> - To present aggregated state for each application to the end user, including:
> resource (CPU, Memory) consumption across all containers, number of
> containers launched/completed/failed, etc. We need this for apps while they
> are running as well as when they are done.
> - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be
> aggregated to show detailed state at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be done more efficiently
> from application-level aggregations than from raw entity-level data, since
> far fewer rows need to be scanned (non-aggregated entities such as events
> and configurations are filtered out).