[
https://issues.apache.org/jira/browse/YARN-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805907#comment-13805907
]
Vinod Kumar Vavilapalli commented on YARN-975:
----------------------------------------------
Had a look at the patch, some comments:
- HDFS jar is not needed as a test-dependency in
hadoop-yarn-server-applicationhistoryservice/pom.xml
- You should make HistoryFileReader and HistoryFileWriter as static private
inner classes and avoid sharing state altogether.
- Wrap new
ApplicationStartDataPBImpl(ApplicationStartDataProto.parseFrom(entry.value)))
into a method in ApplicationStartData. Similarly others.
- getApplicationAttempts(): If there is no history-file, we should throw a
valid exception?
- finishDtata: Typo
- There is no limit on outstandingWriters. If RM runs 1K applications in
parallel, we'll have 1K writers - RM can thus potentially go out of file
handles. We need to limit this (configurable?) and queue any more writes into a
limited number of threads. Can do in a follow up JIRA, please file one.
- appId + START_DATA_SUFFIX: Instead of strings and appending, you can write a
complex key which has an ApplicationId and the start marker and convert them to
bytes when storing via a getBytes() method.
- Similarly for ApplicationAttempt and Container suffixes.
- When a HistoryFile exists, HistoryFileWriter should open it in append mode.
- In both the reader and the writer, you should use IOUtils.cleanup() instead
of explicitly calling close on each stream yourselves everywhere.
- Don't think we should do this. Any retries should be inside
FileSystemHistoryStore. We should close the writer in a finally block.
{code}
+ // Not put close() in finally block in case callers want to retry writing+
// the data. On the other hand, the file will anyway be close when the
+ // store is stopped.
{code}
- Dismantle retriveStartFinishData() into two methods - one for start and one
for finish.
- TestApplicationHistoryStore was renamed in YARN-956, please update the patch
- Test: A single file will only have data about a single application. So
testWriteHistoryData() should not have multiple applications. Similarly
ApplicationAttempt finish to follow after container-finish.
- Test: We should NOT have this dependency. Java 7 reorders tests in some
cases.
{code}
+ // The order of the test cases matters
{code}
> Add a file-system implementation for history-storage
> ----------------------------------------------------
>
> Key: YARN-975
> URL: https://issues.apache.org/jira/browse/YARN-975
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhijie Shen
> Assignee: Zhijie Shen
> Attachments: YARN-975.10.patch, YARN-975.1.patch, YARN-975.2.patch,
> YARN-975.3.patch, YARN-975.4.patch, YARN-975.5.patch, YARN-975.6.patch,
> YARN-975.7.patch, YARN-975.8.patch, YARN-975.9.patch
>
>
> HDFS implementation should be a standard persistence strategy of history
> storage
--
This message was sent by Atlassian JIRA
(v6.1#6144)