[
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111886#comment-16111886
]
Wangda Tan commented on YARN-6875:
----------------------------------
[~jlowe], [~xgong],
I've been thinking about this issue. We could probably create a local index
file instead of a remote index file, to avoid extra load on the NN.
Do you think the following solution is reasonable?
- The local log aggregator always maintains a separate, confirmed index file in
the *local dir*.
- When we need to do partial log aggregation, we always read the local index
file, and replace it once partial log aggregation finishes.
- For a file that is still being appended to, we try to load the local index
file. (I think this is possible.)
- If appending fails and the NM will retry, we follow the same logic as above.
- If appending fails and the NM is alive but will not retry, it appends the
index to the remote file.
- If appending fails and the NM is not alive, we follow Jason's logic and scan
for the last index. This should be rare.
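To make the cases above concrete, here is a rough sketch in Python. It is not
Hadoop code: the names (IndexAction, choose_index_action, replace_local_index,
load_local_index), the JSON index layout, and the write-then-rename replacement
are all assumptions made for illustration only.

```python
import json
import os
import tempfile
from enum import Enum


class IndexAction(Enum):
    """Hypothetical outcomes; these names are not part of YARN."""
    USE_LOCAL_INDEX = "use_local_index"            # read the confirmed index in the local dir
    APPEND_INDEX_TO_REMOTE = "append_to_remote"    # NM writes its index into the remote file
    SCAN_REMOTE_FOR_LAST_INDEX = "scan_remote"     # fallback: scan for the last index


def choose_index_action(append_failed, nm_alive, nm_will_retry):
    """Map the cases listed above to an action."""
    if not append_failed:
        # Normal partial aggregation, or a file still being appended to.
        return IndexAction.USE_LOCAL_INDEX
    if not nm_alive:
        # Nobody is left to maintain the local index (expected to be rare).
        return IndexAction.SCAN_REMOTE_FOR_LAST_INDEX
    if nm_will_retry:
        # Same logic as the normal case.
        return IndexAction.USE_LOCAL_INDEX
    # NM is alive but gives up on appending: persist the index remotely.
    return IndexAction.APPEND_INDEX_TO_REMOTE


def replace_local_index(path, entries):
    """Replace the local index after a partial aggregation cycle finishes.

    Assumption: write to a temp file in the same directory, then rename it
    over the old index, so a reader never sees a half-written index.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(entries, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_local_index(path):
    """Load the confirmed local index written by replace_local_index."""
    with open(path) as f:
        return json.load(f)
```

Only the last two failure cases touch the remote file system, which is where
the saving on NN load would come from.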
Hope to hear your thoughts.
> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>
> Key: YARN-6875
> URL: https://issues.apache.org/jira/browse/YARN-6875
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have
> seen several performance issues, especially for very large log files.
> We will introduce a new log format that has better performance for large
> log files.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)