[jira] [Commented] (YARN-7952) Find a way to persist the log aggregation status

Wangda Tan (JIRA) Mon, 05 Mar 2018 12:25:18 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386700#comment-16386700
 ]


Wangda Tan commented on YARN-7952:
----------------------------------

Thanks [~xgong], 

It might be better to update default value in a separate JIRA since it is not 
directly related to this one.

1) nmLogAggregationStatusTracker should be part of NodeManager.java instead of 
CM.
2) NMLogAggregationStatusTracker:
2.1 It looks like pullLock should be a readLock and updateLocker should be a 
writeLock since pull doesn't change internal data, correct?

2.2 It's better to rename pullLocker/updateLocker to read/writeLock for 
readability.

2.3 For overall workflow of this class, IMO it should be:
a. When log aggregation starts, application will be added to 
{{NMLogAggregationStatusTracker}} with timestamp.
b. When RM acknowledges application finished, application will be removed from 
the {{NMLogAggregationStatusTracker}}
c. If configured timeout reaches, application will be removed from the 
{{NMLogAggregationStatusTracker}}.

I can see c. is handled by {{rollLogAggregationStatus}} and a. is handled by 
{{updateLogAggregationStatus}}, but not sure about b. When I look at logic 
below, I'm not sure if it works:

{code}
          if (currentTime - updateTime > rollingInterval) {
            LOG.warn("Ignore the log aggregation status update request "
                + "for the application:" + appId + ". The log aggregation 
status"
                + " update time is " + updateTime + " while the request process 
"
                + "time is " + currentTime + ".");
            return;
          }
{code}

Since the currentTime always almost equal to updateTime. The if statement 
should be false, correct? And I think if application is null in NM side, we 
should not add it to the track, correct?

Instead of this, should we explicitly notify {{NMLogAggregationStatusTracker}} 
in {{ApplicationImpl.AppLogsAggregatedTransition}}? (It is true that 
{{ApplicationImpl.AppLogsAggregatedTransition}} means RM acks the log is 
finished?)

2.4 I'm not sure if we should ever change the {{lastModifiedTime}}. IIUC, the 
RM will timeout log aggregation status after configured timeout. What's the 
purpose of updating the {{lastModifiedTime}}? Will this cause OOM issue 
potentially? 

2.5 updateLogAggregationStatus: 
- the {{long updateTime}} parameter is not necessary, can always use 
System.currentTime instead.

3) {{LogAggregationTrakcer}}
- {{LogAggregationTrakcer}} => AppLogAggregationStatusForRMRecovery (it's not 
tracker since it is a passive status, and it's better to emphasize purpose of 
this class)
- {{getLastModifiedTime}} should be {{logAggregationStartedTime}} if you agree 
with 2.4

4) Comment of YarnConfiguration.LOG_AGGREGATION_STATUS_TIME_OUT_MS should be 
updated, it will be consumed by NM as well.

Will include more another detailed scan in the next review.

> Find a way to persist the log aggregation status
> ------------------------------------------------
>
>                 Key: YARN-7952
>                 URL: https://issues.apache.org/jira/browse/YARN-7952
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>            Priority: Major
>         Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, YARN-7952.2.patch
>
>
> In MAPREDUCE-6415, we have created a CLI to har the aggregated logs, and In 
> YARN-4946: RM should write out Aggregated Log Completion file flag next to 
> logs, we have a discussion on how we can get the log aggregation status: make 
> a client call to RM or get it directly from the Distributed file system(HDFS).
> No matter which approach we would like to choose, we need to figure out a way 
> to persist the log aggregation status first. This ticket is used to track the 
> working progress for this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-7952) Find a way to persist the log aggregation status

Reply via email to