[
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568742#comment-16568742
]
Prabhu Joseph commented on YARN-8617:
-------------------------------------
[~bibinchundatt]
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is already
set to 3600. Let me explain the issue more clearly.
A long-running container on node1 writes daemon.log and rotates it to
daemon.log.1, daemon.log.2, etc. YARN log aggregation picks up the rolled log
files as expected, combines all files for that node, and places them under the
HDFS path in a single log file, referred to here as the node1 file.
{code}
[hdfs@node2 tmp]$ hadoop fs -ls /app-logs/hive/logs-ifile/application_1533261669437_0001/node3.openstack_45454_1533315091898
-rw-r----- 3 hive hadoop 29207 2018-08-03 19:51 /app-logs/hive/logs-ifile/application_1533261669437_0001/node3.openstack_45454_1533315091898
{code}
For a running job, AggregatedLogDeletionService decides what to delete based
on the file modification time, which is always recent because the rolled logs
are regularly appended to the node1 file. As a result, the older log contents
(daemon.log.2 etc.) that are part of the node1 file are never deleted, and the
node1 file keeps growing for as long as a container is running on that node.
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L129
{code}
for (FileStatus node : logFiles) {
  if (node.getModificationTime() < cutoffMillis) {
    try {
      fs.delete(node.getPath(), true);
    } catch (IOException ex) {
      logException("Could not delete " + appDir.getPath(), ex);
    }
  }
}
{code}
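To illustrate why the check above never fires for a running job, here is a
minimal standalone sketch in plain Java (no Hadoop dependency). It is not the
actual YARN code: NodeFile, appendRolledLog, wouldDelete, expiredChunks and
simulate are hypothetical names, and the hourly upload interval and 24-hour
retention are assumed values picked only for the example. It models the
per-node file as a list of upload timestamps plus one modification time that
every upload refreshes.

```java
import java.util.ArrayList;
import java.util.List;

public class ModTimeRetentionSketch {

    /** Hypothetical stand-in for one aggregated per-node log file. */
    static final class NodeFile {
        long modificationTime;
        final List<Long> chunkUploadTimes = new ArrayList<>();

        // Each rolled-log upload appends a chunk and refreshes the mod time.
        void appendRolledLog(long nowMillis) {
            chunkUploadTimes.add(nowMillis);
            modificationTime = nowMillis;
        }
    }

    // Same comparison as the quoted loop: the file is judged as a whole.
    static boolean wouldDelete(NodeFile file, long cutoffMillis) {
        return file.modificationTime < cutoffMillis;
    }

    // Chunks that are older than the cutoff but still sit inside the file.
    static long expiredChunks(NodeFile file, long cutoffMillis) {
        return file.chunkUploadTimes.stream()
                .filter(t -> t < cutoffMillis)
                .count();
    }

    // Simulate `uploads` uploads spaced `intervalMillis` apart, starting at 0.
    static NodeFile simulate(int uploads, long intervalMillis) {
        NodeFile file = new NodeFile();
        for (int i = 0; i < uploads; i++) {
            file.appendRolledLog(i * intervalMillis);
        }
        return file;
    }

    public static void main(String[] args) {
        final long hour = 3_600_000L;
        final long retention = 24 * hour;      // assumed retention period
        NodeFile node1 = simulate(72, hour);   // three days of hourly uploads
        long cutoff = node1.modificationTime - retention;

        System.out.println("file deleted: " + wouldDelete(node1, cutoff));
        System.out.println("expired chunks kept: "
                + expiredChunks(node1, cutoff));
    }
}
```

In this model the file is never selected for deletion while uploads continue,
even though most of its contents are past the retention period. Aging out the
older contents would need a check at a finer granularity than the whole file.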
> Aggregated Application Logs accumulates for long running jobs
> -------------------------------------------------------------
>
> Key: YARN-8617
> URL: https://issues.apache.org/jira/browse/YARN-8617
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: log-aggregation
> Affects Versions: 2.7.4
> Reporter: Prabhu Joseph
> Priority: Major
>
> Currently AggregatedLogDeletionService deletes older aggregated log files
> only once they are complete. This causes logs to accumulate for long-running
> jobs such as LLAP and Spark Streaming.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)