[ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553384#comment-14553384 ]
Karthik Kambatla commented on YARN-2942:
----------------------------------------
Thanks everyone for the discussion. Clearly, there is a trade-off to make
between (1) a single aggregation across nodes for an application, with a
slightly higher chance of losing a container's logs if a node goes down,
and (2) a two-step aggregation that places more load on HDFS. In weighing
this trade-off, we should consider the state of HDFS today and possible
improvements in the future. If HDFS were to support concurrent append,
option 1 would seem like the better approach.
> Aggregated Log Files should be combined
> ---------------------------------------
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
> Issue Type: New Feature
> Affects Versions: 2.6.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: CombinedAggregatedLogsProposal_v3.pdf,
> CombinedAggregatedLogsProposal_v6.pdf, CombinedAggregatedLogsProposal_v7.pdf,
> CompactedAggregatedLogsProposal_v1.pdf,
> CompactedAggregatedLogsProposal_v2.pdf,
> ConcatableAggregatedLogsProposal_v4.pdf,
> ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch,
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch,
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in
> HDFS and subsequently view them in the YARN web UIs from a central place.
> Currently, there is a separate log file for each Node Manager. This can be a
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start
> accumulating many (possibly small) files per YARN application. The current
> “solution” for this problem is to configure YARN (actually the JHS) to
> automatically delete these files after some amount of time.
> We should improve this by compacting the per-node aggregated log files into
> one log file per application.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)