Robert Kanter commented on YARN-2942:

Yes, it does a blocking wait.  I think this will end up being in a separate 
thread anyway because it's being done after uploading the logs to HDFS.  
However, I think making it a separate service is a good idea regardless.  As 
you said, this handles NM restart and allows us to add more flexibility later.
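
To illustrate the separate-service idea, here is a minimal sketch, assuming a 
single worker thread that owns the blocking step so the caller never waits on 
it.  The class and method names are hypothetical and are not taken from the 
patch; the sleep stands in for the real wait, and NM restart recovery is not 
shown.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only; not the actual YARN-2942 implementation.
public class ConcatServiceSketch {
    // One worker thread, so queued tasks run one at a time off the caller's thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Queue the (possibly blocking) work for an application and return immediately.
    public Future<String> submit(String appId) {
        return worker.submit(() -> {
            // Placeholder for the blocking wait and the append to the combined file.
            Thread.sleep(50);
            return appId;
        });
    }

    // Stop accepting new work and wait briefly for queued tasks to finish.
    public void stop() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

A caller would invoke submit(appId) once the upload to HDFS has finished and 
only call get() on the returned Future if it needs the result.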

If you upgrade the NM before the JHS, it's not the end of the world.  New logs 
wouldn't be found by the JHS, but that only hurts users trying to view those 
logs through the JHS.  Once the JHS is updated, they would be viewable.  In any 
case, having the two configs is probably more confusing than it needs to be for 
the user, and we'd have to take care of the case where the new format is 
disabled but concatenation is enabled (which is invalid).  I think we should 
just make this one config: either the new format and concatenation are both 
enabled, or neither is.
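
As a rough sketch of the single-config idea, assuming a made-up property name 
(the real key is not decided here), both behaviors can be derived from one 
flag so that the invalid combination cannot be expressed at all:

```java
import java.util.Properties;

// Hypothetical sketch only; the property name below is invented for illustration.
public class LogFormatConfigSketch {
    static final String KEY = "example.log-aggregation.concatenation.enabled";

    final boolean newFormatEnabled;
    final boolean concatenationEnabled;

    LogFormatConfigSketch(Properties conf) {
        // One flag drives both settings; it defaults to disabled when unset.
        boolean enabled = Boolean.parseBoolean(conf.getProperty(KEY, "false"));
        this.newFormatEnabled = enabled;
        this.concatenationEnabled = enabled;
    }
}
```

With two independent flags, a validation step would be needed to reject the 
case where the new format is off but concatenation is on; a single flag avoids 
that check.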

I'll post an updated doc shortly.

> Aggregated Log Files should be combined
> ---------------------------------------
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CombinedAggregatedLogsProposal_v3.pdf, 
> CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, 
> ConcatableAggregatedLogsProposal_v4.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.

This message was sent by Atlassian JIRA
