[
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105859#comment-16105859
]
Xuan Gong commented on YARN-6875:
---------------------------------
Thanks for the comments, [~jlowe]. I fully understand your concern. But,
bq. I'm not a big fan of having a separate file, even temporarily, because log
aggregation can already be a large portion of the namenode's write load on
large clusters. Having that separate file will increase the namenode write load
significantly (approximately 2x per log aggregation cycle if I understand it
correctly).
I agree with this. But the proposed solution will not be worse than the current
solution (TFile). Also, the index file will be created only when partial
log aggregation is enabled.
If we enable partial log aggregation:
* For the T-File solution (currently used), we would create a new file every time
we do log aggregation. If we have done log aggregation three times, we
would have three T-Files.
* For the proposed solution, we would have at most two files: the log file and
the index file.
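The comparison above can be sketched as a small calculation. This is only an illustration, not code from the patch: the function names are hypothetical, and it assumes exactly one new T-File per aggregation cycle versus a single log file plus one index file for the proposed format.

```python
def tfile_count(cycles: int) -> int:
    """Files left by the current T-File approach.

    Assumption: each log aggregation cycle writes one new T-File.
    """
    return max(cycles, 0)


def proposed_max_count(cycles: int) -> int:
    """Upper bound on files for the proposed format.

    Assumption: one log file plus one index file, regardless of
    how many aggregation cycles have run.
    """
    return 2 if cycles > 0 else 0


if __name__ == "__main__":
    for cycles in (1, 3, 10):
        print(cycles, tfile_count(cycles), proposed_max_count(cycles))
```

Under these assumptions the T-File count grows with every cycle, while the proposed format stays bounded at two files.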
bq. Note that the separate index file doesn't solve all the race conditions for
the reader.
Yes, this corner case is valid. But I think that this is OK. The reader would
fail in this case, but we can always retry the read later.
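A minimal sketch of the "fail, then retry later" behaviour described above. Everything here is hypothetical: `read_with_retry`, `read_once`, and the attempt and delay parameters are illustrative names, not part of the YARN log reader, and the sketch assumes the race surfaces to the caller as an `IOError`.

```python
import time


def read_with_retry(read_once, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call read_once() until it succeeds or attempts are exhausted.

    read_once is a hypothetical callable that raises IOError when it
    observes the log file and index file in an inconsistent state.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return read_once()
        except IOError as err:
            last_error = err
            # Wait before the next try, but not after the final failure.
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise last_error
```

The `sleep` parameter is injectable so the retry loop can be exercised without real delays.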
> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>
> Key: YARN-6875
> URL: https://issues.apache.org/jira/browse/YARN-6875
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have
> seen several performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large
> log files.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)