[ 
https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148834#comment-14148834
 ] 

Zhijie Shen commented on YARN-2468:
-----------------------------------

The patch is generally good. Some minor comments, and puzzles about the code.

1. The first one is \@VisibleForTesting? And the second one is not necessary?
{code}
-  private static String getNodeString(NodeId nodeId) {
+  public static String getNodeString(NodeId nodeId) {
     return nodeId.toString().replace(":", "_");
   }
-  
+
+  public static String getNodeString(String nodeId) {
+    return nodeId.replace(":", "_");
+  }
{code}

2. Add a TODO to say the test will be fixed in a in followup Jira, in case we 
forget it?
{code}
+  @Ignore
   @Test
   public void testNoLogs() throws Exception {
{code}

3. Based on my understanding, uploadedFiles is the candidate files to upload? 
If so, can we rename the variables and related methods?
{code}
+    private Set<File> uploadedFiles = new HashSet<File>();
{code}

4. I assume this var is going to capture all the existing log files on HDFS, 
isn't it? If so, the computation of it seems to be problematic, because it 
doesn't exclude the files to be excluded. And what's the effect on 
alreadyUploadedLogs?
{code}
+    private Set<String> allExistingFileMeta = new HashSet<String>();
{code}
{code}
      Iterable<String> mask =
          Iterables.filter(alreadyUploadedLogs, new Predicate<String>() {
            @Override
            public boolean apply(String next) {
              return currentExistingLogFiles.contains(next);
            }
          });
{code}

5. Make the old LogValue constructor based on the new one?

6. LogValue.write is not necessary to be changed?

7. It's recommended to close the Closable objects via IOUtils, but it seems 
that AggregatedLogFormat already has this issue before. Let's file a separate 
ticket for it.
{code}
+        if (this.fsDataOStream != null) {
+          this.fsDataOStream.close();
+        }
{code}

8. nodeId seems to be of no use. No need to be passed into AppLogAggregatorImpl.
{code}
+  private final NodeId nodeId;
{code}

9. remoteNodeLogDirForApp doesn't affect remoteNodeTmpLogFileForApp, which only 
depends on remoteNodeLogFileForApp. remoteNodeLogFileForApp is determined at 
construction, so remoteNodeTmpLogFileForApp should be final and computed once 
in constructor as well. And constructor param remoteNodeLogDirForApp should be 
renamed back to remoteNodeLogFileForApp.
{code}
-  private final Path remoteNodeTmpLogFileForApp;
+  private Path remoteNodeTmpLogFileForApp;
{code}
{code}
-  private Path getRemoteNodeTmpLogFileForApp() {
+  private Path getRemoteNodeTmpLogFileForApp(Path remoteNodeLogDirForApp) {
     return new Path(remoteNodeLogFileForApp.getParent(),
-        (remoteNodeLogFileForApp.getName() + TMP_FILE_SUFFIX));
+      (remoteNodeLogFileForApp.getName() + 
LogAggregationUtils.TMP_FILE_SUFFIX));
   }
{code}

10. One typo
{code}
      // if any of the previous uoloaded logs have been deleted,
{code}

11. One question: if one file is failed at uploading in LogValue.write(), 
uploadedFiles will not reflect the missing uploaded file, and it will not be 
uploaded again?

> Log handling for LRS
> --------------------
>
>                 Key: YARN-2468
>                 URL: https://issues.apache.org/jira/browse/YARN-2468
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: log-aggregation, nodemanager, resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, 
> YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, 
> YARN-2468.4.patch, YARN-2468.5.1.patch, YARN-2468.5.1.patch, 
> YARN-2468.5.2.patch, YARN-2468.5.3.patch, YARN-2468.5.4.patch, 
> YARN-2468.5.patch, YARN-2468.6.1.patch, YARN-2468.6.patch, 
> YARN-2468.7.1.patch, YARN-2468.7.patch
>
>
> Currently, when application is finished, NM will start to do the log 
> aggregation. But for Long running service applications, this is not ideal. 
> The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) will be written into a 
> single file. The files could become larger and larger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to