[
https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166538#comment-15166538
]
Jun Gong commented on YARN-4720:
--------------------------------
Thanks [~mingma] for the review and comments.
{quote}
When pendingContainerInThisCycle is empty, NM will skip sending the
LogAggregationReport with LogAggregationStatus.RUNNING. It means for a long
running service, it is possible for a yarn client to get
LogAggregationStatus.NOT_START when it calls
ApplicationClientProtocol#getApplicationReport if the long running service
doesn't generate any log. Without the patch, NM will send
LogAggregationStatus.RUNNING regardless. So it might be better to still send
LogAggregationStatus.RUNNING regardless.
{quote}
Yes, this is a change in behavior. LogAggregationReport describes the current
status, so is it necessary to send a report when NM has not actually
aggregated any logs?
BTW: I noticed that previous LogAggregationReports are never cleaned up. There
is only 'this.context.getLogAggregationStatusForApps().add()' and no 'remove'.
Is that deliberate?
{quote}
When LogWriter creation throws exception and appFinished is true, NM will send
a LogAggregationReport with LogAggregationStatus.SUCCEEDED. Without the patch,
NM won't send any final LogAggregationReport. Maybe it is better to update the
patch to send LogAggregationStatus.FAILED for such scenario.
{quote}
I will update the patch to address it.
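As a rough illustration of the intended behavior (not the actual patch), the
status choice could look like the sketch below. The enum and the method are
hypothetical stand-ins: the enum mirrors only three values of
LogAggregationStatus, and returning RUNNING for an unfinished app is an
assumption made for the example.

```java
// Hypothetical stand-in for a subset of LogAggregationStatus values.
enum AggregationStatus { RUNNING, SUCCEEDED, FAILED }

class FinalReportSketch {
    // Picks the status to report at the end of an upload cycle.
    static AggregationStatus statusFor(boolean appFinished,
                                       boolean writerCreationFailed) {
        if (!appFinished) {
            // Assumption for this sketch: an unfinished app keeps reporting RUNNING.
            return AggregationStatus.RUNNING;
        }
        // The app has finished: report FAILED instead of SUCCEEDED when the
        // LogWriter could not be created, since nothing was uploaded.
        return writerCreationFailed
            ? AggregationStatus.FAILED
            : AggregationStatus.SUCCEEDED;
    }
}
```

With this, a finished app whose writer creation threw would report FAILED
rather than SUCCEEDED.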
> Skip unnecessary NN operations in log aggregation
> -------------------------------------------------
>
> Key: YARN-4720
> URL: https://issues.apache.org/jira/browse/YARN-4720
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Ming Ma
> Assignee: Jun Gong
> Attachments: YARN-4720.01.patch, YARN-4720.02.patch
>
>
> Log aggregation service could have unnecessary NN operations in the following
> scenarios:
> * No new local log has been created since the last upload for the long
> running service scenario.
> * NM uses {{ContainerLogAggregationPolicy}} that skips log aggregation for
> certain containers.
> In the following code snippet, even though {{pendingContainerInThisCycle}} is
> empty, it still creates the writer and then removes the file later. Thus it
> introduces unnecessary create/getfileinfo/delete NN calls when NM doesn't
> aggregate logs for an app.
>
> {noformat}
> AppLogAggregatorImpl.java
> ......
>     writer =
>         new LogWriter(this.conf, this.remoteNodeTmpLogFileForApp,
>             this.userUgi);
> ......
>     for (ContainerId container : pendingContainerInThisCycle) {
>       ......
>     }
> ......
>     if (remoteFS.exists(remoteNodeTmpLogFileForApp)) {
>       if (rename) {
>         remoteFS.rename(remoteNodeTmpLogFileForApp, renamedPath);
>       } else {
>         remoteFS.delete(remoteNodeTmpLogFileForApp, false);
>       }
>     }
> ......
> {noformat}
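The idea in the description above can be sketched as an early return before the
writer is created. This is a minimal, self-contained illustration and not the
code from the attached patches: CountingRemoteFs and UploadCycleSketch are
hypothetical names, and the fake file system only records the calls that would
otherwise go to the NN.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the remote file system; it only records NN-bound calls.
class CountingRemoteFs {
    final List<String> calls = new ArrayList<>();

    void create(String path) { calls.add("create " + path); }

    boolean exists(String path) {
        calls.add("getfileinfo " + path);
        return true;
    }

    void rename(String from, String to) { calls.add("rename " + from + " -> " + to); }

    void delete(String path) { calls.add("delete " + path); }
}

class UploadCycleSketch {
    // Returns true if a log file was uploaded in this cycle.
    static boolean uploadLogs(List<String> pendingContainers, CountingRemoteFs fs,
                              String tmpPath, String finalPath) {
        if (pendingContainers.isEmpty()) {
            // Nothing to aggregate: return before the writer is created, so no
            // create/getfileinfo/delete calls are issued in this cycle.
            return false;
        }
        fs.create(tmpPath); // stands in for "new LogWriter(...)"
        // Writing each pending container's logs is elided in this sketch.
        if (fs.exists(tmpPath)) {
            fs.rename(tmpPath, finalPath);
        }
        return true;
    }
}
```

Calling uploadLogs with an empty list leaves the recorded call list empty,
while a non-empty list records the create, getfileinfo and rename calls.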