[ 
https://issues.apache.org/jira/browse/YARN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050580#comment-14050580
 ] 

Mit Desai commented on YARN-2240:
---------------------------------

Aggregated Logs Comment

[~vinodkv], here is the error on which it fails.

{noformat}
2014-06-10 22:06:34,940 [LogAggregationService #1922] ERROR logaggregation.AggregatedLogFormat: Error aggregating log file. Log file : /grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001 (Permission denied)
java.io.FileNotFoundException: /grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001 (Permission denied)
         at java.io.FileInputStream.open(Native Method)
         at java.io.FileInputStream.<init>(FileInputStream.java:138)
         at org.apache.hadoop.io.SecureIOUtils.forceSecureOpenForRead(SecureIOUtils.java:215)
         at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:204)
         at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogValue.write(AggregatedLogFormat.java:196)
         at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:311)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:130)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:166)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:140)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:354)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:722)
{noformat}


I managed to get into the logs and found that the length reported for this
log was 111K, while the corrupted aggregated log reads something like the
excerpt below. This is the portion of the aggregated logs where the problem
occurs:
{noformat}
[...]
LogType: history.txt.appattempt_1401475649625_135179_000001
LogLength: 111686
Log Contents:
Error aggregating log file. Log file :
/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001
(Permission
denied)stderr0!stderr_dag_1401475649625_135179_10&stderr_dag_1401475649625_135179_1_post0stdout0!stdout_dag_1401475649625_135179_10&stdout_dag_1401475649625_135179_1_post0syslog102042014-06-10
22:05:58,519 INFO [main] org.apache.tez.dag.app.DAGAppMaster: Created
DAGAppMaster for application appattempt_1401475649625_135179_000001
[...]
{noformat}

> yarn logs can get corrupted if the aggregator does not have permissions to 
> the log file it tries to read
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2240
>                 URL: https://issues.apache.org/jira/browse/YARN-2240
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Mit Desai
>
> When the log aggregator is aggregating the logs, it writes the file length 
> first. It then tries to open the log file, and if it does not have permission 
> to do that, it ends up writing only an error message to the aggregated logs.
> The mismatch between the file length written up front and the length of the 
> content actually written corrupts the aggregated logs.
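
The mismatch described above can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: it uses a simplified, hypothetical length-prefixed record layout (name, declared length, raw bytes) rather than the real AggregatedLogFormat, and the class name, method names, and the 60-byte on-disk length are all made up for the example. It writes a declared length that belongs to the unreadable file, stores a shorter error message instead, and then shows a reader that trusts the declared length swallowing the header of the next record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a length-prefixed log container.
class LengthMismatchSketch {

    // Writes one record: name, declared length (as text), then the payload bytes.
    // Nothing checks that declaredLength matches payload.length.
    static void writeRecord(DataOutputStream out, String name,
                            long declaredLength, byte[] payload) throws IOException {
        out.writeUTF(name);
        out.writeUTF(String.valueOf(declaredLength));
        out.write(payload);
    }

    // Reads one record's contents, trusting the declared length.
    static String readRecordContents(DataInputStream in) throws IOException {
        in.readUTF();                                  // record name, ignored here
        int length = Integer.parseInt(in.readUTF());   // declared length
        byte[] buf = new byte[length];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Builds a two-record stream where the first record's declared length is the
    // (assumed) size of the file on disk, but the payload is only an error message.
    static String readFirstRecordContents() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);

        long lengthOnDisk = 60;  // assumed size of the unreadable file
        byte[] error = "Error aggregating log file (Permission denied)"
                .getBytes(StandardCharsets.UTF_8);     // shorter than lengthOnDisk
        writeRecord(out, "history.txt", lengthOnDisk, error);

        byte[] stderr = "stderr contents".getBytes(StandardCharsets.UTF_8);
        writeRecord(out, "stderr", stderr.length, stderr);
        out.flush();

        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return readRecordContents(in);
    }

    public static void main(String[] args) throws IOException {
        // The printed contents run past the error message into the next record.
        System.out.println(readFirstRecordContents());
    }
}
```

In this sketch the first record's contents begin with the error message and then continue into the name and length fields of the following "stderr" record, which is the same kind of run-together output seen in the excerpt in the comment. Once the reader is out of step, every later record in the stream is misparsed as well.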



--
This message was sent by Atlassian JIRA
(v6.2#6252)
