[ https://issues.apache.org/jira/browse/YARN-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157273#comment-15157273 ]

Jason Lowe commented on YARN-4705:
----------------------------------

bq. adding a check to skip 0-byte files stops the stack I've been seeing from 
coming back...

The issue with checking the file size first is that the ATS could refuse to 
open a file that actually has data in it, adding extra delay between the 
application writing the data and the data appearing in the ATS.  Since we're 
trying to read a file that is still being written, the reported file size may 
not reflect reality, especially on filesystems like HDFS where the size is 
only updated when new blocks are allocated.
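
For illustration, here's a minimal sketch of what treating open() failures as 
recoverable could look like. This is not the actual {{LogInfo}} code; names 
like {{RecoverableOpenSketch}}, {{parseEntities}}, and {{appIsRunning}} are 
hypothetical:

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class RecoverableOpenSketch {
  /**
   * Returns the number of entities parsed, or 0 if the file could not be
   * opened yet and the application is still running, in which case a later
   * scan pass simply retries.
   */
  long parseIfReadable(FileSystem fs, Path logPath, boolean appIsRunning)
      throws IOException {
    FSDataInputStream in;
    try {
      in = fs.open(logPath);
    } catch (IOException e) {
      if (appIsRunning) {
        // The file may still be open for writing, or its checksum data may
        // not have settled; skip this pass instead of failing the pipeline.
        return 0;
      }
      // App finished: a failure to open really is unrecoverable.
      throw e;
    }
    try {
      return parseEntities(in); // hypothetical parsing helper
    } finally {
      in.close();
    }
  }

  private long parseEntities(FSDataInputStream in) throws IOException {
    return 0; // stand-in for the real timeline entity parsing
  }
}
{code}

The point is that the retry decision is driven by the application's running 
state rather than by a file-size check that HDFS may not keep current.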

> ATS 1.5 parse pipeline to consider handling open() events recoverably
> ---------------------------------------------------------------------
>
>                 Key: YARN-4705
>                 URL: https://issues.apache.org/jira/browse/YARN-4705
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> During one of my own timeline test runs, I've been seeing a stack trace 
> warning that the CRC check failed in {{FileSystem.open()}}; something the FS 
> was ignoring.
> Even though it's swallowed (and probably not the cause of my test failure), 
> looking at the code in {{LogInfo.parsePath()}}, it treats a failure to open 
> a file as unrecoverable.
> On some filesystems this may not be the case, e.g. if a file is open for 
> writing it may not be available for reading; checksums may be a similar issue.
> Perhaps a failure at open() should be viewed as recoverable while the app is 
> still running?


