Karthik Kambatla created YARN-3793:
--------------------------------------
Summary: Several NPEs when deleting local files on NM recovery
Key: YARN-3793
URL: https://issues.apache.org/jira/browse/YARN-3793
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
When NM work-preserving restart is enabled, we see several NPEs on recovery.
These seem to correspond to sub-directories that need to be deleted. I wonder
if the null paths here indicate incorrect tracking of these resources and a
potential leak. This JIRA is to investigate the cause and fix whatever is required.
Logs show:
{noformat}
2015-05-18 07:06:10,225 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null
2015-05-18 07:06:10,224 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
java.lang.NullPointerException
    at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
    at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
    at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
{noformat}
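The trace above shows a null path reaching the delete call. As a rough illustration only, here is a minimal sketch of a defensive check that skips and logs a null path instead of throwing. It is not the NodeManager code and not a proposed patch: the class NullSafeDeletion and the method deleteRecursively are hypothetical names, and java.nio.file stands in for Hadoop's FileContext. A guard like this would only stop the NPE; it would not answer whether the null path points to a tracking problem or a leak, which still needs investigation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Hadoop.
class NullSafeDeletion {

    /**
     * Recursively deletes target.
     * Returns true if something was deleted, false if the path was
     * null or did not exist (both cases are skipped, not thrown).
     */
    public static boolean deleteRecursively(Path target) throws IOException {
        if (target == null) {
            // Log the skip so a missing path stays visible in the logs
            // instead of surfacing as a NullPointerException.
            System.err.println("Skipping deletion: path is null");
            return false;
        }
        if (!Files.exists(target)) {
            return false;
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(target)) {
            // Reverse order puts children before their parent directories.
            entries = walk.sorted(Comparator.reverseOrder())
                          .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return true;
    }
}
```

Calling NullSafeDeletion.deleteRecursively(null) returns false and writes a message to stderr, while a call with an existing directory removes it and its contents.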
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)