[ 
https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111886#comment-16111886
 ] 

Wangda Tan commented on YARN-6875:
----------------------------------

[~jlowe], [~xgong],
I've been thinking about this issue. We could probably create a local index 
file instead of a remote index file, to avoid extra load on the NN.

Do you think the following solution is reasonable?
- The local log aggregator always maintains a separate confirmed index file in 
the *local dir*.
- When we need to do partial log aggregation, we always read the local index 
file, and replace it once partial log aggregation finishes.
- For a file that is still being appended to, we try to load the local index 
file. (I think this is possible.)
- If appending fails and the NM will retry, we follow the same logic as above.
- If appending fails, and the NM is alive but will not retry, it appends the 
index file to the remote file.
- If appending fails and the NM is not alive, it follows Jason's logic of 
scanning for the last index. This should be rare.
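
To make the cases above concrete, here is a rough sketch in Python (not the 
actual NM code). All names in it (IndexAction, action_after_failed_append, 
replace_local_index, load_local_index, the "confirmed.index" file name and the 
JSON encoding of the index) are made up for illustration and are not part of 
Hadoop. It only shows the decision after a failed append and one way to replace 
the local index file atomically, assuming the local dir is on a normal local 
filesystem.

```python
import json
import os
import tempfile
from enum import Enum


class IndexAction(Enum):
    """Hypothetical outcomes after a failed append to the remote log file."""
    RETRY_WITH_LOCAL_INDEX = "retry_with_local_index"
    APPEND_INDEX_TO_REMOTE = "append_index_to_remote"
    SCAN_REMOTE_FOR_LAST_INDEX = "scan_remote_for_last_index"


def action_after_failed_append(nm_alive: bool, nm_will_retry: bool) -> IndexAction:
    """Map the three failure cases from the proposal to an action."""
    if not nm_alive:
        # NM is gone: fall back to scanning the remote file for the last
        # index. Expected to be rare.
        return IndexAction.SCAN_REMOTE_FOR_LAST_INDEX
    if nm_will_retry:
        # Same logic as a normal partial aggregation: reload the local index.
        return IndexAction.RETRY_WITH_LOCAL_INDEX
    # NM is alive but gives up: persist the index in the remote file.
    return IndexAction.APPEND_INDEX_TO_REMOTE


def replace_local_index(local_dir: str, index: dict,
                        name: str = "confirmed.index") -> str:
    """Write the confirmed index to a temp file, then rename it into place.

    A reader of the local index sees either the old file or the new one,
    never a half-written one.
    """
    fd, tmp_path = tempfile.mkstemp(dir=local_dir, prefix=name + ".")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(index, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        final_path = os.path.join(local_dir, name)
        os.replace(tmp_path, final_path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def load_local_index(local_dir: str, name: str = "confirmed.index"):
    """Return the confirmed index, or None if no cycle has finished yet."""
    path = os.path.join(local_dir, name)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)
```

In this sketch replace_local_index would be called only after a partial 
aggregation cycle has finished, so the local file always describes the last 
confirmed state of the remote file.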

Hope to hear your thoughts.

> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>
>                 Key: YARN-6875
>                 URL: https://issues.apache.org/jira/browse/YARN-6875
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have 
> seen several performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large 
> log files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
