[ 
https://issues.apache.org/jira/browse/YARN-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550250#comment-16550250
 ] 

Bibin A Chundatt edited comment on YARN-8418 at 7/20/18 5:09 AM:
-----------------------------------------------------------------

[~leftnoteasy]

Let me add an impact analysis as part of this.

As part of YARN-4984 we disabled the thread (I am not sure whether it was 
actually a leak), which could cause the following issue.
 The thread disabled as part of YARN-4984 is responsible for deletion and 
cleanup of logs.
 This means that if application log init fails for any reason, both 
aggregation and deletion of logs are disabled, which could lead to faulty NMs.

*Scenario example:*

Let's consider a long-running application. {{Restart the nodemanager after 
seven days (the delegation token will have expired), or HDFS was in safemode 
during NM startup, so log aggregation init fails}}. YARN-4984 disabled the 
thread (*AppLogAggregatorImpl*), which is responsible for cleanup too.

Now the long-running job, or multiple short jobs, could cause the NM log dir 
to fill up, eventually leading to a bad nodemanager.
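
To make the failure mode above concrete, here is a minimal toy model in Java. 
It is not NodeManager code and not the YARN-8418 patch: every name in it 
(LogCleanupSketch, RemoteLogStore, LocalLogCleaner, initApp) is hypothetical. 
It only sketches one possible policy, assuming that local log cleanup can be 
scheduled independently of whether aggregation init succeeded.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model only. None of these types exist in Hadoop; they are assumptions
// made for illustration.
public class LogCleanupSketch {

    /** Stand-in for the remote (e.g. HDFS) side of log aggregation. */
    interface RemoteLogStore {
        void initAppDir(String appId) throws IOException;
    }

    /** Records which apps have had local log cleanup scheduled. */
    static final class LocalLogCleaner {
        final List<String> scheduled = new ArrayList<>();

        void scheduleCleanup(String appId) {
            scheduled.add(appId);
        }
    }

    /**
     * Tries to initialize aggregation for an app and returns true if
     * aggregation is enabled. Cleanup is scheduled on both paths, so a
     * failed init (expired token, safemode) does not leave local logs behind.
     */
    static boolean initApp(String appId, RemoteLogStore store,
                           LocalLogCleaner cleaner) {
        boolean aggregationEnabled;
        try {
            store.initAppDir(appId);
            aggregationEnabled = true;
        } catch (IOException e) {
            // Init failed: skip aggregation, but do not skip cleanup.
            aggregationEnabled = false;
        }
        cleaner.scheduleCleanup(appId);
        return aggregationEnabled;
    }

    public static void main(String[] args) {
        LocalLogCleaner cleaner = new LocalLogCleaner();
        // Simulates HDFS being in safemode during NM startup.
        RemoteLogStore safemode = appId -> {
            throw new IOException("simulated: HDFS in safemode");
        };
        boolean enabled = initApp("app_1", safemode, cleaner);
        System.out.println("aggregation enabled: " + enabled);
        System.out.println("cleanup scheduled for: " + cleaner.scheduled);
    }
}
```

If scheduleCleanup were called only on the success path, the app in this 
model would never be cleaned up, which is the leak described in the scenario.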

Please refer to YARN-4096.

 



> App local logs could leaked if log aggregation fails to initialize for the app
> ------------------------------------------------------------------------------
>
>                 Key: YARN-8418
>                 URL: https://issues.apache.org/jira/browse/YARN-8418
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.0, 3.0.0-alpha1
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: YARN-8418.001.patch, YARN-8418.002.patch, 
> YARN-8418.003.patch, YARN-8418.004.patch
>
>
> If log aggregation fails during init (createApp directory), container logs 
> could get leaked in the NM directory.
> For a long-running application, this case is possible on restart of the NM 
> after token renewal, or on application submission with an invalid 
> delegation token.


