[ https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254624#comment-15254624 ]
Haibo Chen commented on YARN-4766:
----------------------------------

@Robert Kanter, thanks very much for your review. I have addressed all issues in the latest patch. For #6, I didn't follow your comments exactly. Instead, I added a new method that takes configs and expected files. testAggregatorWithRetentionPolicyDisabled_shouldUploadAllFiles and testAggregatorWhenNoFileOlderThanRetentionPolicy_ShouldUploadAll are still very much alike, but most of the code duplication is removed.

> NM should not aggregate logs older than the retention policy
> ------------------------------------------------------------
>
>                 Key: YARN-4766
>                 URL: https://issues.apache.org/jira/browse/YARN-4766
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation, nodemanager
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: yarn4766.001.patch, yarn4766.002.patch, yarn4766.003.patch
>
>
> When a log aggregation fails on the NM, the information for the attempt is
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which
> are often related to HDFS space or permissions.
> On restart, the recovery DB is read, and if an application attempt needs its
> logs aggregated, the files are scheduled for aggregation without any checks.
> The log files could be older than the retention limit, in which case we should
> not aggregate them but immediately mark them for deletion from the local file
> system.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
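
For readers following the issue, the check described in the quoted description can be sketched roughly as below. This is only an illustration and not the code from the attached patches: the class RetentionFilterSketch, its partition method, and the convention that a non-positive retention value means "policy disabled" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not taken from the YARN-4766 patches. */
public class RetentionFilterSketch {

    /** Holds the two groups a set of local log files is split into. */
    public static final class Partition {
        public final List<String> toUpload = new ArrayList<>();
        public final List<String> toDelete = new ArrayList<>();
    }

    /**
     * Splits log files into those still worth aggregating and those that are
     * already past the retention limit.
     *
     * @param lastModifiedMillis file name mapped to its last-modified time (ms since epoch)
     * @param retentionSeconds   retention limit; ASSUMPTION: a value <= 0 disables the policy
     * @param nowMillis          current time, passed in so the logic is easy to test
     */
    public static Partition partition(Map<String, Long> lastModifiedMillis,
                                      long retentionSeconds,
                                      long nowMillis) {
        Partition result = new Partition();
        boolean retentionEnabled = retentionSeconds > 0;
        long cutoffMillis = nowMillis - retentionSeconds * 1000L;

        for (Map.Entry<String, Long> entry : lastModifiedMillis.entrySet()) {
            // A file is expired only if the policy is enabled and the file
            // was last modified before the cutoff.
            boolean expired = retentionEnabled && entry.getValue() < cutoffMillis;
            if (expired) {
                result.toDelete.add(entry.getKey());   // skip aggregation, delete locally
            } else {
                result.toUpload.add(entry.getKey());   // schedule for aggregation
            }
        }
        return result;
    }
}
```

The two cases named in the comment above correspond to calling partition with a non-positive retentionSeconds, or with files that are all newer than the cutoff: in both, toDelete stays empty and every file ends up in toUpload.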