Jason Lowe commented on YARN-3793:

It sounds like the NPEs are scary in the logs but benign in practice, since 
they occur in situations where we don't actually want to delete anything anyway.

Regarding loss of logs, I agree with your analysis.  Makes me think there 
should be a getLogDirsForRead that can be used for places to search for files 
that are already there.  The NPE and the log loss are unrelated, so arguably 
the blocker of log loss should be tracked in a separate JIRA.

> Several NPEs when deleting local files on NM recovery
> -----------------------------------------------------
>                 Key: YARN-3793
>                 URL: https://issues.apache.org/jira/browse/YARN-3793
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Varun Saxena
>            Priority: Critical
>         Attachments: YARN-3793.01.patch
> When NM work-preserving restart is enabled, we see several NPEs on recovery. 
> These seem to correspond to sub-directories that need to be deleted. I wonder 
> if null pointers here mean incorrect tracking of these resources and a 
> potential leak. This JIRA is to investigate and fix anything required.
> Logs show:
> {noformat}
> 2015-05-18 07:06:10,225 INFO 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting 
> absolute path : null
> 2015-05-18 07:06:10,224 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during 
> execution of task in DeletionService
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
>         at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
> {noformat}

This message was sent by Atlassian JIRA

Reply via email to