[
https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288890#comment-17288890
]
D M Murali Krishna Reddy commented on YARN-8273:
------------------------------------------------
This Jira has introduced the following issues.
# The delService.delete(deletionTask) call has been moved out of the for loop
and into the finally block at the end. Inside the for loop we create a
FileDeletionTask for each container but do not store it, so the deletionTask
holds only the last container's log files and only those files are removed.
Ideally, the log files of every container whose logs were uploaded should be
deleted.
# The LogAggregationDFSException is caught in closeWriter, but when
LogAggregationTFileController is configured as the logAggregationFileController,
the this.logAggregationFileController.closeWriter() call itself calls
closeWriter, which throws the LogAggregationDFSException (if any), and that
exception is not saved. When closeWriter is then called again, no exception is
raised, so the LogAggregationDFSException is never thrown in this scenario.
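The first issue above can be illustrated with a minimal, self-contained sketch. It does not use the real NodeManager classes: DeletionSketch, Task and RecordingDeleter are hypothetical stand-ins for the aggregator, FileDeletionTask and the deletion service, and submitAll shows only one possible remedy, not necessarily the change made in the actual fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; these are NOT the real Hadoop classes.
final class DeletionSketch {

    /** Stand-in for a file deletion task: just the paths it should delete. */
    static final class Task {
        final List<String> paths;
        Task(List<String> paths) { this.paths = paths; }
    }

    /** Stand-in for the deletion service: records every path it is asked to delete. */
    static final class RecordingDeleter {
        final List<String> deleted = new ArrayList<>();
        void delete(Task task) { deleted.addAll(task.paths); }
    }

    /** Pattern described in the comment: the task is overwritten per container and submitted once. */
    static List<String> submitOnceAtEnd(List<List<String>> uploadedPerContainer) {
        RecordingDeleter deleter = new RecordingDeleter();
        Task task = null;
        try {
            for (List<String> uploaded : uploadedPerContainer) {
                task = new Task(uploaded); // the previous container's task is dropped here
            }
        } finally {
            if (task != null) {
                deleter.delete(task); // only the last container's files are deleted
            }
        }
        return deleter.deleted;
    }

    /** One possible remedy (an assumption): keep every task and submit them all. */
    static List<String> submitAll(List<List<String>> uploadedPerContainer) {
        RecordingDeleter deleter = new RecordingDeleter();
        List<Task> tasks = new ArrayList<>();
        try {
            for (List<String> uploaded : uploadedPerContainer) {
                tasks.add(new Task(uploaded));
            }
        } finally {
            for (Task t : tasks) {
                deleter.delete(t);
            }
        }
        return deleter.deleted;
    }

    public static void main(String[] args) {
        List<List<String>> logs = List.of(
            List.of("container_1/stdout", "container_1/stderr"),
            List.of("container_2/stdout"));
        System.out.println(submitOnceAtEnd(logs)); // [container_2/stdout]
        System.out.println(submitAll(logs));
    }
}
```

With two containers, submitOnceAtEnd deletes only the second container's file, while submitAll deletes the files of both.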
YARN-10648 has been raised to fix these issues.
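The lost-exception scenario in the second item can be sketched in the same spirit. Everything here is hypothetical: CloseWriterSketch and its Writer are invented for illustration, a plain IOException stands in for LogAggregationDFSException, and the assumption that a second close is a silent no-op is taken from the description above rather than from the Hadoop source.

```java
import java.io.IOException;

// Hypothetical stand-ins only; IOException plays the role of LogAggregationDFSException.
final class CloseWriterSketch {

    /** Writer whose first close() reports a pending failure; later calls are no-ops. */
    static final class Writer {
        private final boolean quotaExceeded;
        private boolean closed = false;

        Writer(boolean quotaExceeded) { this.quotaExceeded = quotaExceeded; }

        void close() throws IOException {
            if (closed) {
                return; // already closed: nothing to do and nothing to throw
            }
            closed = true;
            if (quotaExceeded) {
                throw new IOException("quota exceeded while closing");
            }
        }
    }

    /** Pattern from the comment: an inner close drops the exception, the outer close sees none. */
    static boolean failureReportedWhenDropped(Writer writer) {
        try {
            writer.close();
        } catch (IOException ignored) {
            // the failure is not saved anywhere
        }
        try {
            writer.close(); // second close is a no-op, so no exception surfaces
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    /** One possible remedy (an assumption): remember the first failure and rethrow it. */
    static boolean failureReportedWhenSaved(Writer writer) {
        IOException saved = null;
        try {
            writer.close();
        } catch (IOException e) {
            saved = e;
        }
        try {
            writer.close();
            if (saved != null) {
                throw saved;
            }
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failureReportedWhenDropped(new Writer(true)));
        System.out.println(failureReportedWhenSaved(new Writer(true)));
    }
}
```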
> Log aggregation does not warn if HDFS quota in target directory is exceeded
> ---------------------------------------------------------------------------
>
> Key: YARN-8273
> URL: https://issues.apache.org/jira/browse/YARN-8273
> Project: Hadoop YARN
> Issue Type: Bug
> Components: log-aggregation
> Affects Versions: 3.1.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8273.000.patch, YARN-8273.001.patch,
> YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch,
> YARN-8273.005.patch, YARN-8273.006.patch
>
>
> It appears that if an HDFS space quota is set on a target directory for log
> aggregation and the quota is already exceeded when log aggregation is
> attempted, zero-byte log files will be written to the HDFS directory, however
> NodeManager logs do not reflect a failure to write the files successfully
> (i.e. there are no ERROR or WARN messages to this effect).
> An improvement may be worth investigating to alert users to this scenario, as
> otherwise logs for a YARN application may be missing both on HDFS and locally
> (after local log cleanup is done) and the user may not otherwise be informed.
> Steps to reproduce:
> * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB)
> * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full
> * Run a Spark or MR job in the cluster
> * Observe that zero byte files are written to HDFS after job completion
> * Observe that YARN container logs are also not present on the NM hosts (or
> are deleted after yarn.nodemanager.delete.debug-delay-sec)
> * Observe that no ERROR or WARN messages appear to be logged in the NM role
> log