[ 
https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-4946:
------------------------------------

    Assignee: Szilard Nemeth

> RM should write out Aggregated Log Completion file flag next to logs
> --------------------------------------------------------------------
>
>                 Key: YARN-4946
>                 URL: https://issues.apache.org/jira/browse/YARN-4946
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> MAPREDUCE-6415 added a tool that combines the aggregated log files for each 
> YARN application into a HAR file.  When run, it seeds its list of candidate 
> applications by looking at the aggregated logs directory, and then filters 
> out ineligible ones.  One of the criteria involves checking with the RM that 
> an Application's log aggregation status is not still running and has not 
> failed.  When the RM "forgets" about an older completed Application (e.g. RM 
> failover, enough time has passed, etc.), the tool won't find the Application 
> in the RM and will just assume that its log aggregation succeeded, even if it 
> actually failed or is still running.
> We can solve this problem by doing the following:
> # When the RM sees that an Application has successfully finished aggregating 
> its logs, it will write a flag file next to that Application's log files.
> # The tool no longer talks to the RM at all.  When looking at the FileSystem, 
> it now uses that flag file to determine if it should process those log files. 
>  If the file is there, it archives the logs; otherwise it does not.
> # As part of the archiving process, it will delete the flag file
> # (If you don't run the tool, the flag file will eventually be cleaned up by 
> the JHS when it cleans up the aggregated logs because it's in the same 
> directory)
> This improvement has several advantages:
> # The edge case about "forgotten" Applications is fixed
> # The tool no longer has to talk to the RM; it only has to consult HDFS, 
> which is simpler.
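
The flag-file protocol proposed above can be sketched roughly as follows. This is only an illustration, not the actual patch: it uses Python and a local temporary directory in place of HDFS, the flag file name `_AGGREGATION_DONE` is made up (the issue does not name one), and the archiving step is a stub rather than real HAR creation.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

# Hypothetical flag file name; the issue does not specify one.
FLAG_NAME = "_AGGREGATION_DONE"


def mark_aggregation_complete(app_dir: Path) -> None:
    """Step 1 (RM side): write an empty flag file next to the app's logs."""
    (app_dir / FLAG_NAME).touch()


def eligible_apps(log_root: Path) -> List[Path]:
    """Step 2 (tool side): pick app directories that carry the flag file.

    No call to the RM is made; only the file system is consulted.
    """
    return sorted(
        d for d in log_root.iterdir()
        if d.is_dir() and (d / FLAG_NAME).exists()
    )


def archive(app_dir: Path) -> None:
    """Step 3: archive the logs (stubbed out here), then delete the flag."""
    # A real tool would build the HAR file at this point.
    (app_dir / FLAG_NAME).unlink()


def run_demo() -> Tuple[List[str], List[str]]:
    """Create two fake apps, flag one, and run the tool logic twice."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("application_1", "application_2"):
            (root / name).mkdir()
            (root / name / "node1.log").write_text("log data\n")

        # Only application_1 has finished aggregation successfully.
        mark_aggregation_complete(root / "application_1")

        first_pass = [d.name for d in eligible_apps(root)]
        for d in eligible_apps(root):
            archive(d)
        # After archiving, the flag is gone, so nothing is eligible.
        second_pass = [d.name for d in eligible_apps(root)]
        return first_pass, second_pass
```

In this sketch an application without a flag file is simply skipped, so an application the RM has "forgotten" is never assumed to have succeeded.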



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
