[ https://issues.apache.org/jira/browse/YARN-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113767#comment-16113767 ]
stefanlee commented on YARN-2823:
---------------------------------
IMO, the NPE happens when *transferStateFromPreviousAttempt* is *true*, and
the value of *transferStateFromPreviousAttempt* depends on
*KeepContainersAcrossApplicationAttempts* in *ApplicationSubmissionContext*.
I hit this NPE because a *FLINK* type application is running in my cluster,
and I saw that the default value of *KeepContainersAcrossApplicationAttempts*
in the Flink code is *true*. So I want to know: if
*KeepContainersAcrossApplicationAttempts* is *false*, can this NPE no longer
happen? [~jianhe] Thanks.
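
A minimal, self-contained sketch of the suspected code path, for illustration
only. It is not the Hadoop source: the classes TransferStateSketch,
SchedulerApp and Attempt are hypothetical stand-ins, and the assumption that
the scheduler holds no earlier attempt at this point (for example after an RM
failover) is a reading of the stack trace below, not something the logs
confirm. The keepContainers parameter plays the role of
*KeepContainersAcrossApplicationAttempts*.

```java
import java.util.HashMap;
import java.util.Map;

public class TransferStateSketch {

    /** Hypothetical stand-in for a scheduler-side application attempt. */
    static class Attempt {
        Map<String, String> liveContainers = new HashMap<>();

        // Takes over the container state of the previous attempt.
        // Assumption: `previous` is dereferenced without a null check.
        void transferStateFromPreviousAttempt(Attempt previous) {
            this.liveContainers = previous.liveContainers;
        }
    }

    /** Hypothetical stand-in for the scheduler's per-application record. */
    static class SchedulerApp {
        Attempt currentAttempt; // null when the scheduler knows no earlier attempt
    }

    // Mirrors the shape of "add attempt, optionally transfer state".
    static Attempt addApplicationAttempt(SchedulerApp app, boolean keepContainers) {
        Attempt attempt = new Attempt();
        if (keepContainers) {
            // Only reached when the flag is true.
            attempt.transferStateFromPreviousAttempt(app.currentAttempt);
        }
        app.currentAttempt = attempt;
        return attempt;
    }

    // Adds a first attempt to a fresh application record and reports
    // whether that raised a NullPointerException.
    static boolean throwsNpe(boolean keepContainers) {
        try {
            addApplicationAttempt(new SchedulerApp(), keepContainers);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("keepContainers=true  -> NPE: " + throwsNpe(true));
        System.out.println("keepContainers=false -> NPE: " + throwsNpe(false));
    }
}
```

Under these assumptions the transfer call is skipped when the flag is false, so
the failing dereference is never reached. Whether the real ResourceManager
behaves the same way depends on the actual scheduler code.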
> NullPointerException in RM HA enabled 3-node cluster
> ----------------------------------------------------
>
> Key: YARN-2823
> URL: https://issues.apache.org/jira/browse/YARN-2823
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Gour Saha
> Assignee: Jian He
> Priority: Critical
> Fix For: 2.6.0
>
> Attachments: logs_with_NPE_in_RM.zip, YARN-2823.1.patch
>
>
> Branch:
> 2.6.0
> Environment:
> A 3-node cluster with RM HA enabled. The HA setup went pretty smoothly (used
> Ambari), and then we installed HBase using Slider. After some time the RMs
> went down and would not come back up anymore. Following is the NPE we see in
> both RM logs.
> {noformat}
> 2014-09-16 01:36:28,037 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(612)) - Error in handling event type APP_ATTEMPT_ADDED to the scheduler
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.transferStateFromPreviousAttempt(SchedulerApplicationAttempt.java:530)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:678)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1015)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:98)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:603)
>     at java.lang.Thread.run(Thread.java:744)
> 2014-09-16 01:36:28,042 INFO resourcemanager.ResourceManager (ResourceManager.java:run(616)) - Exiting, bbye..
> {noformat}
> All the logs for this 3-node cluster have been uploaded.