[
https://issues.apache.org/jira/browse/YARN-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610820#comment-16610820
]
ASF GitHub Bot commented on YARN-8470:
--------------------------------------
GitHub user gg7 opened a pull request:
https://github.com/apache/hadoop/pull/416
YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()
I encountered this issue while running 3.1.0:
```
2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_000055 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
```
I'm guessing a better fix would be to synchronise the removal of
applications, but this simple patch should be an improvement IMO.
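For illustration only, here is a minimal sketch of the kind of defensive null check described above. It assumes the NPE comes from looking up an application that was removed while the preemption thread was still iterating over containers. None of the names below (`PreemptionNullCheckSketch`, `App`, `liveApps`, `preemptableContainers`) come from the patch or from YARN; they are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PreemptionNullCheckSketch {
    /** Hypothetical application record; not a real YARN class. */
    static final class App {
        final String queue;
        App(String queue) { this.queue = queue; }
    }

    /** Hypothetical registry of live applications, keyed by application id. */
    static final Map<String, App> liveApps = new ConcurrentHashMap<>();

    /**
     * Walks a container -> application-id mapping and keeps only containers
     * whose application is still registered. Without the null check, a
     * concurrently removed application would trigger a NullPointerException
     * on the dereference of app.queue.
     */
    static List<String> preemptableContainers(Map<String, String> containerToAppId) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : containerToAppId.entrySet()) {
            App app = liveApps.get(entry.getValue());
            if (app == null) {
                // The application was removed in the meantime; skip its container.
                continue;
            }
            result.add(entry.getKey() + "@" + app.queue);
        }
        return result;
    }

    public static void main(String[] args) {
        liveApps.put("app_1", new App("default"));
        Map<String, String> containers = new LinkedHashMap<>();
        containers.put("c1", "app_1");
        containers.put("c2", "app_2"); // app_2 is no longer registered
        System.out.println(preemptableContainers(containers));
    }
}
```

A check like this only avoids the crash; it does not close the underlying race, which is why synchronising application removal may still be the more thorough fix.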
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gg7/hadoop gg7-yarn-8470-fix-npe
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hadoop/pull/416.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #416
----
commit a86c54c4db3954aca40ef297135a5e875c0a96a8
Author: George G <git@...>
Date: 2018-09-11T15:00:00Z
YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()
Signed-off-by: George G <[email protected]>
----
> Fair scheduler exception with SLS
> ---------------------------------
>
> Key: YARN-8470
> URL: https://issues.apache.org/jira/browse/YARN-8470
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Miklos Szegedi
> Assignee: Haibo Chen
> Priority: Major
>
> I ran into the following exception with SLS:
> 2018-06-26 13:34:04,358 ERROR resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)