Oleksandr Shevchenko created YARN-7998:
------------------------------------------
Summary: RM crashes with NPE during recovering if ACL
configuration was changed
Key: YARN-7998
URL: https://issues.apache.org/jira/browse/YARN-7998
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Oleksandr Shevchenko
RM crashes with NPE during failover because ACL configurations were changed as
a result we no longer have a rights to submit an application to a queue.
Scenario:
# Submit an application
# Change ACL configuration for a queue that accepted the application so that
an owner of the application will no longer have a rights to submit this
application.
# Restart RM.
As a result, we get NPE:
2018-02-27 18:14:00,968 INFO org.apache.hadoop.service.AbstractService: Service
ResourceManager failed in state STARTED; cause: java.lang.NullPointerException
java.lang.NullPointerException
at
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplicationAttempt(FairScheduler.java:738)
at
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1286)
at
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:116)
at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1098)
at
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1044)
at
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]