Sunil G commented on YARN-3476:

Hi [~jlowe] and [~rohithsharma],

Retention logic to handle this error may become more complex when multiple 
failures are seen during aggregation across applications. If this happens 
rarely, a strong retention logic with a timer is helpful.

On a generic level, if we expect more failures, cleaning up after aggregation 
can save disk space. That is acceptable because we have already hit an error, 
and there may be no real pressure to deliver 100% complete logs when 
aggregation itself has failed.
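
To make the second option concrete, here is a rough sketch of "always clean up 
after aggregation, even on failure". It is not taken from the attached patch 
and does not use the actual NodeManager classes: AggregationCleanupSketch, 
LogUploader and aggregateThenCleanup are hypothetical names used only for 
illustration. The assumption is that the upload step may throw either an 
IOException or an unchecked exception such as IllegalStateException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class AggregationCleanupSketch {

    /** Hypothetical stand-in for the upload step; not a YARN interface. */
    interface LogUploader {
        void upload(Path appLogDir) throws IOException;
    }

    /**
     * Runs the upload and deletes the local log directory afterwards,
     * whether or not the upload succeeded.
     *
     * @return true if the upload succeeded, false if it failed
     */
    static boolean aggregateThenCleanup(Path appLogDir, LogUploader uploader) {
        boolean uploaded = false;
        try {
            uploader.upload(appLogDir);
            uploaded = true;
        } catch (IOException | RuntimeException e) {
            // RuntimeException covers IllegalStateException, so the failure
            // no longer escapes through the top of the thread.
            System.err.println("Log aggregation failed for " + appLogDir + ": " + e);
        } finally {
            // Runs on success and on failure, so local logs do not pile up.
            deleteRecursively(appLogDir);
        }
        return uploaded;
    }

    /** Best-effort recursive delete; problems are logged, not rethrown. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order visits children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    System.err.println("Could not delete " + p + ": " + e);
                }
            });
        } catch (IOException e) {
            System.err.println("Could not walk " + dir + ": " + e);
        }
    }
}
```

The trade-off is the one described above: logs from a failed aggregation are 
lost, but the disk is reclaimed without needing a retention timer.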

> Nodemanager can fail to delete local logs if log aggregation fails
> ------------------------------------------------------------------
>                 Key: YARN-3476
>                 URL: https://issues.apache.org/jira/browse/YARN-3476
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation, nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Rohith
>         Attachments: 0001-YARN-3476.patch
> If log aggregation encounters an error trying to upload the file then the 
> underlying TFile can throw an IllegalStateException which will bubble up 
> through the top of the thread and prevent the application logs from being 
> deleted.

This message was sent by Atlassian JIRA
