[ 
https://issues.apache.org/jira/browse/YARN-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187250#comment-15187250
 ] 

Jun Gong commented on YARN-4773:
--------------------------------

Thanks [~jlowe] for reporting the issue. 

YARN-4720 tries to skip unnecessary NN operations on each call to 
*AppLogAggregatorImpl#uploadLogsForContainers* when 
*pendingContainerInThisCycle* is empty. IIUC, [~jlowe] means this case: rolling 
log aggregation is disabled, and when the app completes we call 
*AppLogAggregatorImpl#uploadLogsForContainers(true)* with a non-empty 
*pendingContainerInThisCycle*, so we go on to call 
*AppLogAggregatorImpl#cleanOldLogs*. That call is unnecessary, because no 
container logs have been uploaded before.
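
To make the case concrete, here is a minimal sketch of the kind of guard that would avoid the extra listing. It is not the real AppLogAggregatorImpl and not a proposed patch: the class name, the rollingEnabled flag, and the recorded operation strings are all hypothetical, and the NN calls are only simulated by appending to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical, simplified stand-in for the aggregator discussed above.
 * It only records which simulated filesystem operations would be issued.
 */
public class LogAggregationSketch {
    // Assumption: a single flag tells us whether rolling aggregation is on.
    private final boolean rollingEnabled;
    // Simulated NN operations, in the order they would be issued.
    private final List<String> fsOps = new ArrayList<>();

    public LogAggregationSketch(boolean rollingEnabled) {
        this.rollingEnabled = rollingEnabled;
    }

    /**
     * appFinished is kept only to mirror the uploadLogsForContainers(true)
     * call mentioned above; this sketch does not branch on it.
     */
    public void uploadLogsForContainers(boolean appFinished,
                                        List<String> pendingContainers) {
        // YARN-4720-style early exit: nothing pending, so no NN operations.
        if (pendingContainers.isEmpty()) {
            return;
        }
        // Without rolling aggregation there is exactly one upload per app,
        // so there can be no earlier logs to clean and no need to list.
        if (rollingEnabled) {
            cleanOldLogs();
        }
        fsOps.add("upload");
    }

    // Stands in for the directory listing that cleanOldLogs performs.
    private void cleanOldLogs() {
        fsOps.add("listStatus");
    }

    public List<String> recordedOps() {
        return fsOps;
    }

    public static void main(String[] args) {
        LogAggregationSketch sketch = new LogAggregationSketch(false);
        sketch.uploadLogsForContainers(true, Arrays.asList("container_1"));
        System.out.println(sketch.recordedOps());
    }
}
```

With rollingEnabled set to false the sketch records only the upload, while with true it records the listing first, which is the behavior we would still want for rolling aggregation.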

> Log aggregation performs extraneous filesystem operations when rolling log 
> aggregation is disabled
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4773
>                 URL: https://issues.apache.org/jira/browse/YARN-4773
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Priority: Minor
>
> I noticed when log aggregation occurs for an application the nodemanager is 
> listing the application's log directory in HDFS.  Apparently this is for 
> removing old logs before uploading new ones.  This is a wasteful operation 
> when rolling log aggregation is disabled, since there will be no prior logs 
> in HDFS -- aggregation only occurs once when rolling log aggregation is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
