Zhijie Shen commented on YARN-2703:

[~xgong], thanks for the patch. Some comments about it.

1. It's better to write the timestamp directly. When reading it, we have the 
flexibility to convert it into whatever format we want.
        // Write the uploaded TimeStamp

2. Is it necessary to sort the files? The goal here is to add the timestamp to the 
same log file name uploaded in different iterations. Doesn't the following change the 
order of the uploaded files within the same iteration? Previously, it was 
alphabetical, while now it's chronological. For example, stderr1 -> stdout1 -> 
stderr2 -> stdout2 will be changed to stderr1 -> stdout1 -> stdout2 -> stderr2, 
which may not be a better order.
      // sort the file by lastModfiedTime.
      List<File> candidatesList = new ArrayList<File>(candidates);
      Collections.sort(candidatesList, new Comparator<File>() {
        public int compare(File s1, File s2) {
          return s1.lastModified() < s2.lastModified() ? -1
              : s1.lastModified() > s2.lastModified() ? 1 : 0;
      return candidatesList;

3. No need to ask the caller to pass in the uploaded time. We can directly execute 
    public void write(DataOutputStream out, Set<File> pendingUploadFiles,
        long uploadedTime) throws IOException {

4. Can you correct the log message below in TestLogAggregationService, and add 
logTime as well?

            LOG.info("LogType:" + fileType);
            LOG.info("LogType:" + fileLength);
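
To illustrate points 1 and 3, here is a minimal sketch, not the actual patch. 
The class and method names (UploadTimeSketch, writeEntry, readEntry, 
formatUploadedTime) are hypothetical, and calling System.currentTimeMillis() 
inside the writer is an assumption about how the upload time could be obtained 
without a caller-supplied argument. The writer stores a raw long, and the 
reader decides on the display format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch; names here are illustrative and not taken from YARN.
public class UploadTimeSketch {

  // Writer side: no uploadedTime parameter. The time is taken here (assumption)
  // and written as raw epoch milliseconds, not as a formatted string.
  public static void writeEntry(DataOutputStream out, String logType)
      throws IOException {
    out.writeUTF(logType);
    out.writeLong(System.currentTimeMillis());
  }

  // Converts the stored epoch milliseconds with whatever formatter the reader picks.
  public static String formatUploadedTime(long millis, DateTimeFormatter fmt) {
    return fmt.format(Instant.ofEpochMilli(millis));
  }

  // Reader side: reads the raw long back and formats it only for display.
  public static String readEntry(DataInputStream in, DateTimeFormatter fmt)
      throws IOException {
    String logType = in.readUTF();
    long uploadedTime = in.readLong();
    return "LogType:" + logType
        + " LogUploadTime:" + formatUploadedTime(uploadedTime, fmt);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      writeEntry(out, "stderr");
    }
    // A formatter applied to an Instant needs an explicit zone.
    DateTimeFormatter fmt =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
      System.out.println(readEntry(in, fmt));
    }
  }
}
```

Because only the long is persisted, the display format can change later 
without touching logs that were already aggregated.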

> Add logUploadedTime into LogValue for better display
> ----------------------------------------------------
>                 Key: YARN-2703
>                 URL: https://issues.apache.org/jira/browse/YARN-2703
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-2703.1.patch, YARN-2703.2.patch
> Right now, the container can upload its logs multiple times. Sometimes, 
> containers write different logs into the same log file.  After the log 
> aggregation, when we query those logs, it will show:
> LogType: stderr
> LogContext:
> LogType: stdout
> LogContext:
> LogType: stderr
> LogContext:
> LogType: stdout
> LogContext:
> The same files could be displayed multiple times, but we cannot figure out 
> which logs come first. We could add an extra logUploadedTime to give users a 
> better understanding of the logs.
