[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478207#comment-16478207 ]
Robert Kanter commented on YARN-8273:
-------------------------------------

Thanks for the patch. One question:
- Why make {{LogAggregationDFSException}} an undeclared exception? Why not make it a subclass of {{YarnException}} and declare it?
-- Also, if we do keep it as undeclared, it should be a subclass of {{YarnRuntimeException}} instead of {{RuntimeException}}.

I was going to suggest we catch the {{DSQuotaExceededException}} when closing {{writer}}, but it turns out that {{TFile#close}} does _not_ close the underlying {{FSDataOutputStream}}. That's probably not what most people are expecting, but there's nothing we can do about that now. :/

> Log aggregation does not warn if HDFS quota in target directory is exceeded
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8273
>                 URL: https://issues.apache.org/jira/browse/YARN-8273
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>   Affects Versions: 3.1.0
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: YARN-8273.000.patch, YARN-8273.001.patch,
> YARN-8273.002.patch
>
>
> It appears that if an HDFS space quota is set on a target directory for log
> aggregation and the quota is already exceeded when log aggregation is
> attempted, zero-byte log files will be written to the HDFS directory, however
> NodeManager logs do not reflect a failure to write the files successfully
> (i.e. there are no ERROR or WARN messages to this effect).
> An improvement may be worth investigating to alert users to this scenario, as
> otherwise logs for a YARN application may be missing both on HDFS and locally
> (after local log cleanup is done) and the user may not otherwise be informed.
> Steps to reproduce:
> * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB)
> * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full
> * Run a Spark or MR job in the cluster
> * Observe that zero byte files are written to HDFS after job completion
> * Observe that YARN container logs are also not present on the NM hosts (or
> are deleted after yarn.nodemanager.delete.debug-delay-sec)
> * Observe that no ERROR or WARN messages appear to be logged in the NM role
> log
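
The two options raised in the comment (a declared exception under {{YarnException}} versus an undeclared one under {{YarnRuntimeException}}) could look roughly like the sketch below. It is not taken from any of the attached patches. Every class in it is a hypothetical stand-in so that it compiles without Hadoop on the classpath: {{YarnExceptionStandIn}} and {{YarnRuntimeExceptionStandIn}} mimic the checked and unchecked YARN base exceptions, {{QuotaExceededStandIn}} mimics {{DSQuotaExceededException}} (an {{IOException}} subclass), and {{NonClosingWriter}} mimics a writer whose {{close}} leaves the underlying stream open, as described for {{TFile#close}}. The sketch also assumes the quota failure surfaces when the underlying stream is closed.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-ins for YarnException (checked) and
// YarnRuntimeException (unchecked); not the real Hadoop classes.
class YarnExceptionStandIn extends Exception {
    YarnExceptionStandIn(String message, Throwable cause) { super(message, cause); }
}

class YarnRuntimeExceptionStandIn extends RuntimeException {
    YarnRuntimeExceptionStandIn(String message, Throwable cause) { super(message, cause); }
}

// Option 1: declared variant. Callers must catch it or declare it.
class CheckedLogAggregationDFSException extends YarnExceptionStandIn {
    CheckedLogAggregationDFSException(Throwable cause) {
        super("Log aggregation failed on the DFS", cause);
    }
}

// Option 2: undeclared variant, rooted in the YARN runtime exception
// rather than in plain RuntimeException.
class UncheckedLogAggregationDFSException extends YarnRuntimeExceptionStandIn {
    UncheckedLogAggregationDFSException(Throwable cause) {
        super("Log aggregation failed on the DFS", cause);
    }
}

// Stand-in for DSQuotaExceededException.
class QuotaExceededStandIn extends IOException {
    QuotaExceededStandIn(String message) { super(message); }
}

// Stand-in for the underlying output stream. Assumption: the quota
// violation is reported when the stream is closed.
class QuotaLimitedStream implements Closeable {
    private final boolean quotaExceeded;
    boolean closed = false;

    QuotaLimitedStream(boolean quotaExceeded) { this.quotaExceeded = quotaExceeded; }

    @Override
    public void close() throws QuotaExceededStandIn {
        closed = true;
        if (quotaExceeded) {
            throw new QuotaExceededStandIn("space quota exceeded");
        }
    }
}

// Stand-in for a writer whose close() does not close the wrapped stream.
class NonClosingWriter implements Closeable {
    final QuotaLimitedStream out;

    NonClosingWriter(QuotaLimitedStream out) { this.out = out; }

    @Override
    public void close() {
        // Finishes the writer's own state only; 'out' stays open.
    }
}

class LogAggregationSketch {
    // Option 1: the failure is part of the method signature.
    static void closeChecked(NonClosingWriter writer, QuotaLimitedStream out)
            throws CheckedLogAggregationDFSException {
        writer.close();
        try {
            out.close(); // the quota error can only surface here
        } catch (QuotaExceededStandIn e) {
            throw new CheckedLogAggregationDFSException(e);
        }
    }

    // Option 2: the failure propagates without being declared.
    static void closeUnchecked(NonClosingWriter writer, QuotaLimitedStream out) {
        writer.close();
        try {
            out.close();
        } catch (QuotaExceededStandIn e) {
            throw new UncheckedLogAggregationDFSException(e);
        }
    }

    public static void main(String[] args) {
        QuotaLimitedStream out = new QuotaLimitedStream(true);
        try {
            closeChecked(new NonClosingWriter(out), out);
        } catch (CheckedLogAggregationDFSException e) {
            System.out.println("WARN: " + e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```

With the declared variant the compiler forces each caller to decide how to report the failure, which fits the goal of the issue (making sure a WARN or ERROR is logged). The undeclared variant keeps existing method signatures unchanged but relies on some caller further up remembering to catch it.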