[ 
https://issues.apache.org/jira/browse/YARN-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382612#comment-16382612
 ] 

Wangda Tan commented on YARN-7891:
----------------------------------

Thanks [~xgong]

*Overall questions:*

What's the JIRA about? I'm trying to review the patch, but it looks like 
existing code (see below) already handled har file, so what's new added by the 
patch? Could you elaborate this in description?
{code:java}
    for (FileStatus file : files) {
      String nodeName = file.getPath().getName();
      if ((nodeId == null || nodeId.isEmpty()
          || nodeName.contains(LogAggregationUtils
          .getNodeString(nodeId))) && !nodeName.endsWith(
              LogAggregationUtils.TMP_FILE_SUFFIX) &&
          !nodeName.endsWith(CHECK_SUM_FILE_SUFFIX)) {
        if (nodeName.equals(appId + ".har")) {
          Path p = new Path("har:///" + file.getPath().toUri().getRawPath());
          files = Arrays.asList(HarFs.get(p.toUri(), conf).listStatus(p));
          continue;
        }
        listOfFiles.add(file);
      }
    }
{code}
And this looks a bit confusing to me:
{code:java}
 
files = Arrays.asList(HarFs.get(p.toUri(), conf).listStatus(p));
{code}
Instead of replacing the iterating var, should we call 
listOfFiles.add(HarFs.get(p.toUri(), conf).listStatus(p)) instead? What if 
there're more files apart from the .har file? If it is not possible, should we 
check this and throw exception if other files exist in the same folder of .har?

*Minor:*

1) In {{readAggregatedLogs}}/{{readAggregatedLogsMeta}}, following logics are 
mostly identical, could you merge common logics if possible?
{code:java}
    RemoteIterator<FileStatus> nodeFiles = LogAggregationUtils
        .getRemoteNodeFileDir(conf, appId, appOwner, this.remoteRootLogDir,
        this.remoteRootLogDirSuffix);
    if (!nodeFiles.hasNext()) {
      throw new IOException("There is no available log fils for "
          + "application:" + appId);
    }
    List<FileStatus> allFiles = getAllNodeFiles(nodeFiles, appId);
    if (allFiles.isEmpty()) {
      throw new IOException("There is no available log fils for "
          + "application:" + appId);
    }
    Map<String, Long> checkSumFiles = parseCheckSumFiles(allFiles);
    List<FileStatus> fileToRead = getNodeLogFileToRead(
        allFiles, nodeIdStr, appId);
{code}

> LogAggregationIndexedFileController should support HAR file
> -----------------------------------------------------------
>
>                 Key: YARN-7891
>                 URL: https://issues.apache.org/jira/browse/YARN-7891
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>            Priority: Major
>         Attachments: YARN-7891.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to