[
https://issues.apache.org/jira/browse/YARN-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831187#comment-13831187
]
Sandy Ryza commented on YARN-1440:
----------------------------------
Would it be helpful for YARN to supply a public API that reads the files for
you?
> Yarn aggregated logs should be stored in a simpler format
> ---------------------------------------------------------
>
> Key: YARN-1440
> URL: https://issues.apache.org/jira/browse/YARN-1440
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: ledion bitincka
> Labels: log-aggregation, logs, tfile, yarn
>
> The log aggregation feature in YARN is awesome! However, the file format
> into which the log files are aggregated (TFile) should either be much
> simpler or be made pluggable. The current TFile format forces anyone who
> wants to see the files to do one of the following:
> a) use the web UI
> b) use the CLI tools (yarn logs) or
> c) write custom code to read the files
> My suggestion would be to simplify the log collection by collecting and
> writing the raw log files into a directory structure as follows:
> {noformat}
> /{log-collection-dir}/{app-id}/{container-id}/{log-file-name}
> {noformat}
> This way the application developers can (re)use a much wider array of tools
> to process the logs.
> Readers who are not familiar with the aggregated logs and their format can
> find more information in the following two blog posts:
> http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
> http://blogs.splunk.com/2013/11/18/hadoop-2-0-rant/
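
To make the directory layout proposed in the description concrete, here is a
minimal sketch in Python. It assumes the raw log files have already been
written to a locally accessible directory in that layout. The function names
raw_log_path and iter_app_logs are hypothetical and are not part of any YARN
API.

```python
from pathlib import Path


def raw_log_path(log_collection_dir, app_id, container_id, log_file_name):
    """Build the path of one raw log file under the proposed layout:
    /{log-collection-dir}/{app-id}/{container-id}/{log-file-name}
    """
    return Path(log_collection_dir) / app_id / container_id / log_file_name


def iter_app_logs(log_collection_dir, app_id):
    """Yield (container_id, log_file_name, path) for each log of one app.

    Assumes the layout above: one directory per container, holding plain
    log files. Entries are visited in sorted order for stable output.
    """
    app_dir = Path(log_collection_dir) / app_id
    for container_dir in sorted(p for p in app_dir.iterdir() if p.is_dir()):
        for log_file in sorted(p for p in container_dir.iterdir() if p.is_file()):
            yield container_dir.name, log_file.name, log_file
```

Because each log would be an ordinary file under this layout, the paths
yielded by iter_app_logs could be opened directly or handed to any
text-processing tool, which is the reuse the description argues for.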