[
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148834#comment-14148834
]
Zhijie Shen commented on YARN-2468:
-----------------------------------
The patch is generally good. Some minor comments, and puzzles about the code.
1. The first one is \@VisibleForTesting? And the second one is not necessary?
{code}
- private static String getNodeString(NodeId nodeId) {
+ public static String getNodeString(NodeId nodeId) {
return nodeId.toString().replace(":", "_");
}
-
+
+ public static String getNodeString(String nodeId) {
+ return nodeId.replace(":", "_");
+ }
{code}
2. Add a TODO to say the test will be fixed in a in followup Jira, in case we
forget it?
{code}
+ @Ignore
@Test
public void testNoLogs() throws Exception {
{code}
3. Based on my understanding, uploadedFiles is the candidate files to upload?
If so, can we rename the variables and related methods?
{code}
+ private Set<File> uploadedFiles = new HashSet<File>();
{code}
4. I assume this var is going to capture all the existing log files on HDFS,
isn't it? If so, the computation of it seems to be problematic, because it
doesn't exclude the files to be excluded. And what's the effect on
alreadyUploadedLogs?
{code}
+ private Set<String> allExistingFileMeta = new HashSet<String>();
{code}
{code}
Iterable<String> mask =
Iterables.filter(alreadyUploadedLogs, new Predicate<String>() {
@Override
public boolean apply(String next) {
return currentExistingLogFiles.contains(next);
}
});
{code}
5. Make the old LogValue constructor based on the new one?
6. LogValue.write is not necessary to be changed?
7. It's recommended to close the Closable objects via IOUtils, but it seems
that AggregatedLogFormat already has this issue before. Let's file a separate
ticket for it.
{code}
+ if (this.fsDataOStream != null) {
+ this.fsDataOStream.close();
+ }
{code}
8. nodeId seems to be of no use. No need to be passed into AppLogAggregatorImpl.
{code}
+ private final NodeId nodeId;
{code}
9. remoteNodeLogDirForApp doesn't affect remoteNodeTmpLogFileForApp, which only
depends on remoteNodeLogFileForApp. remoteNodeLogFileForApp is determined at
construction, so remoteNodeTmpLogFileForApp should be final and computed once
in constructor as well. And constructor param remoteNodeLogDirForApp should be
renamed back to remoteNodeLogFileForApp.
{code}
- private final Path remoteNodeTmpLogFileForApp;
+ private Path remoteNodeTmpLogFileForApp;
{code}
{code}
- private Path getRemoteNodeTmpLogFileForApp() {
+ private Path getRemoteNodeTmpLogFileForApp(Path remoteNodeLogDirForApp) {
return new Path(remoteNodeLogFileForApp.getParent(),
- (remoteNodeLogFileForApp.getName() + TMP_FILE_SUFFIX));
+ (remoteNodeLogFileForApp.getName() +
LogAggregationUtils.TMP_FILE_SUFFIX));
}
{code}
10. One typo
{code}
// if any of the previous uoloaded logs have been deleted,
{code}
11. One question: if one file is failed at uploading in LogValue.write(),
uploadedFiles will not reflect the missing uploaded file, and it will not be
uploaded again?
> Log handling for LRS
> --------------------
>
> Key: YARN-2468
> URL: https://issues.apache.org/jira/browse/YARN-2468
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: log-aggregation, nodemanager, resourcemanager
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch,
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch,
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch,
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch,
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch,
> YARN-2468.7.1.patch, YARN-2468.7.patch
>
>
> Currently, when application is finished, NM will start to do the log
> aggregation. But for Long running service applications, this is not ideal.
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a
> single file. The files could become larger and larger.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)