[ https://issues.apache.org/jira/browse/YARN-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980094#comment-16980094 ]
Hudson commented on YARN-9968:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17688 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17688/])
YARN-9968. Public Localizer is exiting in NodeManager due to (snemeth: rev 4c1a1287bc58390900ba1c79818d3ba491c4862c)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
> Public Localizer is exiting in NodeManager due to NullPointerException
> ----------------------------------------------------------------------
>
> Key: YARN-9968
> URL: https://issues.apache.org/jira/browse/YARN-9968
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.1.0
> Reporter: Tarun Parimi
> Assignee: Tarun Parimi
> Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-9968.001.patch
>
>
> The Public Localizer is encountering a NullPointerException and exiting.
> {code:java}
> ERROR localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(995)) - Error: Shutting down
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:981)
> INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(997)) - Public cache exiting
> {code}
> The NodeManager itself keeps running, but every subsequent localization
> event for a container hits the error below, so localization fails for all
> new containers.
> {code:java}
> ERROR localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(920)) - Failed to submit rsrc { { hdfs://namespace/raw/user/.staging/job/conf.xml 1572071824603, FILE, null },pending,[(container_e30_1571858463080_48304_01_000134)],12513553420029113,FAILED} for download. Either queue is full or threadpool is shutdown.
> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ExecutorCompletionService$QueueingFuture@55c7fa21 rejected from org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@46067edd[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 382286]
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>         at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>         at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:899)
> {code}
> When this happens, the NodeManager becomes usable again only after a restart.
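
The failure chain described above can be reproduced in isolation with plain java.util.concurrent classes. The sketch below is not the Hadoop code: the class name LocalizerExitSketch and the method submitRejectedAfterLoopDies are hypothetical, and it assumes (consistent with the "Terminated" pool in the log, but not confirmed by this message) that the download thread pool is shut down when the localizer loop exits. It only illustrates why a submit through ExecutorCompletionService is rejected once the pool behind it has been shut down.

```java
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class LocalizerExitSketch {

    /** Returns true if a submit made after the loop thread has died is rejected. */
    static boolean submitRejectedAfterLoopDies() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorCompletionService<String> queue = new ExecutorCompletionService<>(pool);

        // Hypothetical stand-in for the public localizer loop.
        Thread loop = new Thread(() -> {
            try {
                String path = null;
                path.length(); // simulated NullPointerException inside the loop
            } catch (Throwable t) {
                // Logged and swallowed, like "Error: Shutting down" in the report.
            } finally {
                // Assumption: the pool is shut down when the loop exits.
                pool.shutdownNow();
            }
        });
        loop.start();
        loop.join();

        // The surrounding process is still alive, but the pool is not.
        try {
            queue.submit(() -> "conf.xml");
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected = " + submitRejectedAfterLoopDies());
    }
}
```

Under these assumptions, every later submit fails the same way until the pool is recreated, which matches the observation that only a restart restores localization.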