Adam Antal created YARN-9525:
--------------------------------

             Summary: IFile format is not working against s3a remote folder
                 Key: YARN-9525
                 URL: https://issues.apache.org/jira/browse/YARN-9525
             Project: Hadoop YARN
          Issue Type: Bug
          Components: log-aggregation
    Affects Versions: 3.1.2
            Reporter: Adam Antal


Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} set 
to an s3a URI throws the following exception during log aggregation:

{noformat}
Cannot create writer for app application_1556199768861_0001. Skip log upload this time.
java.io.IOException: java.io.FileNotFoundException: No such file or directory: s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
        at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
        at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
        at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
        at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
        at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
        at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
        at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
        ... 7 more
{noformat}

This stack trace points to 
{{LogAggregationIndexedFileController#initializeWriter}}, where we do the 
following steps (in a non-rolling log aggregation setup):
- create an FSDataOutputStream
- write out a UUID
- flush the stream
- immediately after that, call getFileStatus to get the length of the log 
file (the bytes we just wrote out). That is where the failure happens: the 
file is not there yet due to eventual consistency.

Maybe we can get rid of that getFileStatus call, so that the IFile format can 
be used against an s3a target.
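
One possible way to avoid the status lookup is to count the bytes locally 
as they are written, so the length is known without asking the file system. 
The sketch below is plain JDK code and is not taken from the controller: 
{{LocalLengthSketch}}, {{CountingOutputStream}} and {{writeUuidHeader}} are 
hypothetical names, and writing the UUID in its string form is an assumption 
made only for illustration. In Hadoop itself, FSDataOutputStream#getPos() 
might serve a similar purpose, but that has not been tested here.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class LocalLengthSketch {

  /** Hypothetical wrapper: counts bytes as they pass through to the target. */
  static final class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
      super(out);
    }

    @Override
    public void write(int b) throws IOException {
      out.write(b);
      count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      // Write straight to the wrapped stream so bytes are counted only once.
      out.write(b, off, len);
      count += len;
    }

    long getCount() {
      return count;
    }
  }

  /**
   * Writes the UUID, flushes, and returns the number of bytes written.
   * No status lookup on the target is needed to learn the length.
   * Assumption: the UUID is written as its UTF-8 string form.
   */
  static long writeUuidHeader(OutputStream target, UUID uuid) throws IOException {
    CountingOutputStream counting = new CountingOutputStream(target);
    counting.write(uuid.toString().getBytes(StandardCharsets.UTF_8));
    counting.flush();
    return counting.getCount();
  }

  public static void main(String[] args) throws IOException {
    // An in-memory sink stands in for the remote log file.
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    long length = writeUuidHeader(sink, UUID.randomUUID());
    System.out.println("header length = " + length + ", sink size = " + sink.size());
  }
}
```

The length returned this way depends only on what the writer itself has 
emitted, so it does not matter whether the remote object is visible yet. It 
would only be valid for a freshly created file; appending to an existing 
file in a rolling setup would still need the previous length from somewhere.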



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

