tangshangwen created YARN-4530:
----------------------------------
Summary: LocalizedResource trigger a NPE Cause the NodeManager exit
Key: YARN-4530
URL: https://issues.apache.org/jira/browse/YARN-4530
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: tangshangwen
In our cluster, I found that a failed LocalizedResource download triggers an NPE
that causes the NodeManager to shut down.
{noformat}
2015-12-29 17:18:33,706 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml
transitioned from DOWNLOADING to FAILED
2015-12-29 17:18:33,708 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Downloading public rsrc:{
hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar,
1451380519635, FILE, null }
2015-12-29 17:18:33,710 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Failed to download rsrc { {
hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar,
1451380519452, FILE, null
},pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING}
java.io.IOException: Resource
hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar
changed on src filesystem (expected 1451380519452, was 1451380611793
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-12-29 17:18:33,710 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar
transitioned from DOWNLOADING to FAILED
2015-12-29 17:18:33,710 FATAL
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Error: Shutting down
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712)
2015-12-29 17:18:33,710 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Public cache exiting
{noformat}
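The log shows the FATAL NPE in PublicLocalizer.run immediately after a download failed, but it does not show which reference was null. As a rough illustration only, the sketch below assumes the loop looks up bookkeeping for each completed download and that the lookup can return null when the download failed. All names here (PublicLocalizerSketch, drain, runDemo, the String-based pending map) are hypothetical and are not the Hadoop code. It shows a drain loop that reports a failed download without dereferencing a missing entry, so one bad resource does not take down the whole service.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PublicLocalizerSketch {

    /** Drains `count` completed downloads and returns how many of them failed. */
    static int drain(CompletionService<String> queue,
                     Map<Future<String>, String> pending,
                     int count) throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < count; i++) {
            Future<String> completed = queue.take();
            // Assumption: this lookup may return null (entry missing or not yet registered).
            String request = pending.remove(completed);
            try {
                String localPath = completed.get();
                if (request == null) {
                    System.out.println("No pending entry for " + localPath + ", skipping");
                    continue;
                }
                System.out.println(request + " localized to " + localPath);
            } catch (ExecutionException e) {
                failures++;
                // Guard the failure path too: never dereference a missing entry here.
                String name = (request == null) ? "<unknown request>" : request;
                System.out.println("Failed to download " + name + ": "
                        + e.getCause().getMessage());
            }
        }
        return failures;
    }

    /** Submits one successful and one failing download, then drains both. */
    static int runDemo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletionService<String> queue = new ExecutorCompletionService<>(pool);
            Map<Future<String>, String> pending = new ConcurrentHashMap<>();
            pending.put(queue.submit(() -> "/local/hive-site.xml"), "hive-site.xml");
            // Deliberately not registered in `pending`, to exercise the null path.
            queue.submit(() -> {
                throw new IOException("changed on src filesystem");
            });
            return drain(queue, pending, 2);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("failures=" + runDemo());
    }
}
```

In this sketch the failed download is counted and logged, and the loop keeps running. Whether the actual fix should skip, log, or fail the container is a separate decision for the patch.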