[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540970#comment-14540970
 ] 

Vinod Kumar Vavilapalli commented on YARN-2942:
-----------------------------------------------

bq. Having the RM coordinate the aggregation is similar to my design with ZK, 
but instead of a ZK lock, the RM orchestrates things. I like the idea of 
getting rid of the original aggregation and having the NMs all write to HDFS 
once, in the combined file directly.
Though this is great to have in theory, I'd like to point out that the 
implementation is going to be fraught with (1) many fault-tolerance conditions 
and (2) potentially very long delays in aggregation due to the costs of 
coordination and fault recovery. As I was [originally 
saying|https://issues.apache.org/jira/browse/YARN-2942?focusedCommentId=14326912&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14326912],
 we really need all of this functionality in the file system.

Overall, today's log aggregation sits fairly close to the edge of the platform 
(you can imagine putting in a different aggregation mechanism by replacing the 
module present in the NM); we need to think twice before hard-wiring the notion 
of concurrent log-append right into the platform. The ZK solution was less 
intrusive, as it still sat on the edge, though with the downside of adding 
external dependencies.

> Aggregated Log Files should be combined
> ---------------------------------------
>
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CombinedAggregatedLogsProposal_v3.pdf, 
> CombinedAggregatedLogsProposal_v6.pdf, CombinedAggregatedLogsProposal_v7.pdf, 
> CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, 
> ConcatableAggregatedLogsProposal_v4.pdf, 
> ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
