Robert Kanter commented on YARN-2942:

The delays in log aggregation should be fine as long as the logs are available 
directly from the NMs in the meantime.

{quote}Like I was originally saying, we really need all of this functionality 
in the file-system.{quote}
I agree with you that this would be great and make things super easy for us.  
However, I don't see them adding the functionality that we need any time soon.  
Given that, I think we need to come up with our own solution with what we have 
currently available.

{quote}Overall, today's log-aggregation is fairly on the edge...we need to 
think twice before hard-wiring the notion of concurrent log-append right into 
the platform. The ZK solution was less intrusive as it was still on the edge 
with the downside of adding external dependencies.{quote}
I think that the v7 design could be easily replaced as well.  Most of it 
would live in an RM service, which could be turned off or replaced.  However, 
you are correct that the design based on what Jason said would be more invasive 
and not really replaceable.  That said, I don't think anyone has wanted or 
tried to do that.

> Aggregated Log Files should be combined
> ---------------------------------------
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CombinedAggregatedLogsProposal_v3.pdf, 
> CombinedAggregatedLogsProposal_v6.pdf, CombinedAggregatedLogsProposal_v7.pdf, 
> CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, 
> ConcatableAggregatedLogsProposal_v4.pdf, 
> ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.

This message was sent by Atlassian JIRA
