Robert Kanter updated YARN-2942:
    Attachment: YARN-2942.003.patch

The YARN-2942.003.patch fixes some minor problems I found when dealing with 
logs for long-running applications:
- The JHS would correctly display the logs, but would also show a message 
that they couldn't be found.
- The NM wasn't trying to compact the logs of long-running applications (which 
is expected), but it was dumping an ugly error message to its log about it.  
It now checks that the "normal" aggregated log file exists before trying to 
read it, which prevents the error.  I also made it so that it won't even try 
to get the lock if its aggregated file is not there, so the lock is never 
taken needlessly.
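
To illustrate the ordering described in the second item, here is a minimal 
sketch of "check that the file exists, then lock, then read".  It is not the 
code from the patch: the class name, the tryCompact method, and the use of a 
local path with an in-process ReentrantLock (standing in for the aggregated 
log file in HDFS and whatever lock the NM actually uses) are all assumptions 
made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names come from YARN-2942.
public class CompactionGuardSketch {

    /**
     * Returns true if the aggregated log was read under the lock, or false
     * if the file was absent and the attempt was skipped without locking.
     */
    static boolean tryCompact(Path aggregatedLog, Lock lock) throws IOException {
        // Check existence first: a long-running application may not have a
        // "normal" aggregated log file, and that is not an error.
        if (!Files.exists(aggregatedLog)) {
            return false; // skip quietly; the lock is never requested
        }
        lock.lock();
        try {
            // Stand-in for the real work: a real implementation would append
            // these bytes to the per-application compacted file.
            Files.readAllBytes(aggregatedLog);
            return true;
        } finally {
            lock.unlock(); // always release, even if the read fails
        }
    }

    public static void main(String[] args) throws IOException {
        Lock lock = new ReentrantLock();

        // Case 1: no aggregated file, so the attempt is skipped.
        Path missing = Paths.get("no-such-aggregated-log-" + System.nanoTime());
        System.out.println("missing file compacted: " + tryCompact(missing, lock));

        // Case 2: the file exists, so it is read while holding the lock.
        Path present = Files.createTempFile("aggregated-log", ".tmp");
        try {
            Files.write(present, "container log line".getBytes());
            System.out.println("present file compacted: " + tryCompact(present, lock));
        } finally {
            Files.deleteIfExists(present);
        }
    }
}
```

Doing the existence check before acquiring the lock means the missing-file 
case costs one metadata lookup and produces no error output.  A file that 
disappears between the check and the read would still surface as an 
IOException, which the sketch leaves to the caller.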

> Aggregated Log Files should be compacted
> ----------------------------------------
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.

This message was sent by Atlassian JIRA