[ 
https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483871#comment-16483871
 ] 

Gergo Repas commented on YARN-8273:
-----------------------------------

[~rkanter] Thanks for the review! Yes, LogAggregationDFSException can indeed be 
a checked exception (and a subclass of YarnException); I've updated the patch.
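
For illustration, here is a minimal sketch of the shape described above: a 
checked LogAggregationDFSException that subclasses YarnException. This is not 
the code from the patch. The YarnException class below is a local stand-in for 
org.apache.hadoop.yarn.exceptions.YarnException so that the example compiles 
without Hadoop on the classpath, and closeWriter and aggregate are hypothetical 
helpers that simulate a quota failure.

```java
import java.io.IOException;

public class LogAggregationSketch {

    // Local stand-in (assumption) for Hadoop's YarnException, which is a checked exception.
    public static class YarnException extends Exception {
        public YarnException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Checked exception signalling that writing aggregated logs to the DFS failed.
    public static class LogAggregationDFSException extends YarnException {
        public LogAggregationDFSException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical helper: simulates closing a log writer whose target
    // directory may have exceeded its space quota.
    public static void closeWriter(boolean quotaExceeded) throws LogAggregationDFSException {
        try {
            if (quotaExceeded) {
                throw new IOException("space quota exceeded (simulated)");
            }
        } catch (IOException e) {
            // Wrap the low-level failure so callers are forced to handle it.
            throw new LogAggregationDFSException("Failed to write aggregated logs", e);
        }
    }

    // Hypothetical caller: because the exception is checked, the compiler
    // requires this catch block, which is where a WARN or ERROR could be logged.
    public static String aggregate(boolean quotaExceeded) {
        try {
            closeWriter(quotaExceeded);
            return "OK";
        } catch (LogAggregationDFSException e) {
            return "WARN: " + e.getMessage() + ": " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(aggregate(false));
        System.out.println(aggregate(true));
    }
}
```

With a checked exception the caller cannot silently ignore the failure, which 
is the property that matters for surfacing the quota problem in the NM log.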

> Log aggregation does not warn if HDFS quota in target directory is exceeded
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8273
>                 URL: https://issues.apache.org/jira/browse/YARN-8273
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>    Affects Versions: 3.1.0
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: YARN-8273.000.patch, YARN-8273.001.patch, 
> YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch, 
> YARN-8273.005.patch, YARN-8273.006.patch
>
>
> It appears that if an HDFS space quota is set on a target directory for log 
> aggregation and the quota is already exceeded when log aggregation is 
> attempted, zero-byte log files are written to the HDFS directory. However, 
> the NodeManager logs do not reflect the failure to write the files 
> (i.e. there are no ERROR or WARN messages to this effect).
> An improvement may be worth investigating to alert users to this scenario, as 
> otherwise logs for a YARN application may be missing both on HDFS and locally 
> (after local log cleanup is done) without the user being informed.
> Steps to reproduce:
> * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB)
> * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full
> * Run a Spark or MR job in the cluster
> * Observe that zero-byte files are written to HDFS after job completion
> * Observe that YARN container logs are also not present on the NM hosts (or 
> are deleted after yarn.nodemanager.delete.debug-delay-sec)
> * Observe that no ERROR or WARN messages appear to be logged in the NM role 
> log


