[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075106#comment-15075106 ]
tangshangwen commented on YARN-4530:
------------------------------------

When assoc is null and completed.get() throws an ExecutionException, this will happen, right?

{code:title=ResourceLocalizationService.java|borderStyle=solid}
        try {
          Future<Path> completed = queue.take();
          LocalizerResourceRequestEvent assoc = pending.remove(completed);
          try {
            Path local = completed.get();
            if (null == assoc) {
              LOG.error("Localized unkonwn resource to " + completed);
              // TODO delete
              return;
            }
            LocalResourceRequest key = assoc.getResource().getRequest();
            publicRsrc.handle(new ResourceLocalizedEvent(key, local, FileUtil
              .getDU(new File(local.toUri()))));
            assoc.getResource().unlock();
          } catch (ExecutionException e) {
            LOG.info("Failed to download rsrc " + assoc.getResource(),
                e.getCause());
            LocalResourceRequest req = assoc.getResource().getRequest();
            publicRsrc.handle(new ResourceFailedLocalizationEvent(req,
                e.getMessage()));
            assoc.getResource().unlock();
          } catch (CancellationException e) {
            // ignore; shutting down
          }
{code}

> LocalizedResource trigger a NPE Cause the NodeManager exit
> ----------------------------------------------------------
>
>                 Key: YARN-4530
>                 URL: https://issues.apache.org/jira/browse/YARN-4530
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: tangshangwen
>
> In our cluster, I found that a failed LocalizedResource download triggered an NPE
> that caused the NodeManager to shut down.
> {noformat}
> 2015-12-29 17:18:33,706 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml transitioned from DOWNLOADING to FAILED
> 2015-12-29 17:18:33,708 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, 1451380519635, FILE, null }
> 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, 1451380519452, FILE, null },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING}
> java.io.IOException: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar changed on src filesystem (expected 1451380519452, was 1451380611793
>         at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar transitioned from DOWNLOADING to FAILED
> 2015-12-29 17:18:33,710 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Error: Shutting down
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712)
> 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
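
The case raised in the comment (a null assoc combined with an ExecutionException from completed.get()) can be illustrated outside Hadoop. The following is only a minimal sketch under stated assumptions: the class NullSafeLocalizerSketch, the Outcome enum, and the String-valued pending map are hypothetical stand-ins rather than Hadoop APIs, and the guard shown is one possible way to avoid the NPE, not necessarily the change made for YARN-4530.

```java
import java.util.Map;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical stand-in for the public localizer loop; not Hadoop code.
class NullSafeLocalizerSketch {

    /** What happened to one completed download. */
    enum Outcome { LOCALIZED, FAILED, UNKNOWN_RESOURCE, CANCELLED }

    // Maps each in-flight download to the request that started it.
    // A String replaces LocalizerResourceRequestEvent to keep the sketch small.
    static final Map<Future<String>, String> pending = new ConcurrentHashMap<>();

    static Outcome handleCompleted(Future<String> completed) {
        // May be null if the download was never registered or was already removed.
        String assoc = pending.remove(completed);
        try {
            completed.get();
            if (assoc == null) {
                return Outcome.UNKNOWN_RESOURCE;
            }
            return Outcome.LOCALIZED;
        } catch (ExecutionException e) {
            if (assoc == null) {
                // Guard: the failure path must not dereference assoc either,
                // otherwise a failed, untracked download ends in an NPE.
                return Outcome.UNKNOWN_RESOURCE;
            }
            return Outcome.FAILED;
        } catch (CancellationException e) {
            // Shutting down; nothing to report.
            return Outcome.CANCELLED;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and treat it like a shutdown.
            Thread.currentThread().interrupt();
            return Outcome.CANCELLED;
        }
    }
}
```

Passing an exceptionally completed future that was never put into pending returns UNKNOWN_RESOURCE instead of throwing, while the same failure for a tracked future returns FAILED. In the quoted Hadoop code, the equivalent change would be a null check on assoc before the LOG.info call in the ExecutionException branch.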