[
https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sunil G updated YARN-7692:
--------------------------
Affects Version/s: 2.9.0
2.8.3
> Resource Manager goes down when a user not included in a priority acl submits
> a job
> -----------------------------------------------------------------------------------
>
> Key: YARN-7692
> URL: https://issues.apache.org/jira/browse/YARN-7692
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.9.0, 2.8.3, 3.0.0
> Reporter: Charan Hebri
> Assignee: Sunil G
>
> Test scenario
> ------------------
> 1. Create a cluster with no ACLs configured.
> 2. Submit jobs as an existing user, say 'user_a'.
> 3. Enable ACLs and create a priority ACL entry via the property
> yarn.scheduler.capacity.priority-acls. Do not include 'user_a' in
> this ACL.
> 4. Submit a job as 'user_a'.
> The job is rejected because 'user_a' does not have permission to run it,
> which is the expected behavior. However, the Resource Manager also goes
> down: when it tries to recover the previously submitted applications, the
> priority ACL check fails for them and recovery aborts.
> Below is the exception seen:
> {noformat}
> 2017-12-27 10:52:30,064 INFO conf.Configuration
> (Configuration.java:getConfResourceAsInputStream(2659)) - found resource
> yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml
> 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler
> (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste
> max priority to maxClusterLevelAppPriority = 10
> 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager
> (ResourceManager.java:transitionToActive(1177)) - Transitioning to active
> state
> 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager
> (ResourceManager.java:serviceStart(765)) - Recovery started
> 2017-12-27 10:52:30,102 INFO recovery.RMStateStore
> (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5
> 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager
> (RMDelegationTokenSecretManager.java:recover(196)) - recovering
> RMDelegationTokenSecretManager.
> 2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager
> (RMAppManager.java:recover(561)) - Recovering 51 applications
> 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager
> (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51
> applications
> 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager
> (ResourceManager.java:serviceStart(776)) - Failed to load/recover state
> org.apache.hadoop.yarn.exceptions.YarnException:
> org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE)
> does not have permission to submit/update application_1514268754125_0001 for 0
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771)
> at
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.security.AccessControlException: User hrt_qa
> (auth:SIMPLE) does not have permission to submit/update
> application_1514268754125_0001 for 0
> ... 20 more
> 2017-12-27 10:52:30,434 INFO service.AbstractService
> (AbstractService.java:noteFailure(273)) - Service RMActiveServices failed in
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnException:
> org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE)
> does not have permission to submit/update application_1514268754125_0001 for 0
> org.apache.hadoop.yarn.exceptions.YarnException:
> org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE)
> does not have permission to submit/update application_1514268754125_0001 for 0
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358)
> at
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771)
> at
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.security.AccessControlException: User hrt_qa
> (auth:SIMPLE) does not have permission to submit/update
> application_1514268754125_0001 for 0
> ... 20 more
> 2017-12-27 10:52:30,435 INFO impl.MetricsSystemImpl
> (MetricsSystemImpl.java:stop(210)) - Stopping ResourceManager metrics
> system...
> 2017-12-27 10:52:30,435 INFO impl.MetricsSystemImpl
> (MetricsSystemImpl.java:stop(216)) - ResourceManager metrics system stopped.
> 2017-12-27 10:52:30,436 INFO impl.MetricsSystemImpl
> (MetricsSystemImpl.java:shutdown(607)) - ResourceManager metrics system
> shutdown complete.
> 2017-12-27 10:52:30,436 INFO event.AsyncDispatcher
> (AsyncDispatcher.java:serviceStop(155)) - AsyncDispatcher is draining to
> stop, ignoring any new events.
> 2017-12-27 10:52:30,437 INFO event.AsyncDispatcher
> (AsyncDispatcher.java:register(223)) - Registering class
> org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
> 2017-12-27 10:52:30,438 INFO security.NMTokenSecretManagerInRM
> (NMTokenSecretManagerInRM.java:<init>(75)) - NMTokenKeyRollingInterval:
> 86400000ms and NMTokenKeyActivationDelay: 900000ms
> 2017-12-27 10:52:30,438 INFO security.RMContainerTokenSecretManager
> (RMContainerTokenSecretManager.java:<init>(79)) -
> ContainerTokenKeyRollingInterval: 86400000ms and
> ContainerTokenKeyActivationDelay: 900000ms
> 2017-12-27 10:52:30,438 INFO security.AMRMTokenSecretManager
> (AMRMTokenSecretManager.java:<init>(94)) - AMRMTokenKeyRollingInterval:
> 86400000ms and AMRMTokenKeyActivationDelay: 900000 ms
> 2017-12-27 10:52:30,439 INFO recovery.RMStateStoreFactory
> (RMStateStoreFactory.java:getStore(33)) - Using RMStateStore implementation -
> class org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
> 2017-12-27 10:52:30,439 INFO event.AsyncDispatcher
> (AsyncDispatcher.java:register(223)) - Registering class
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStoreEventType
> for class
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler
> 2017-12-27 10:52:30,439 WARN curator.CuratorZookeeperClient
> (CuratorZookeeperClient.java:<init>(96)) - session timeout [10000] is less
> than connection timeout [15000]
> 2017-12-27 10:52:30,440 INFO imps.CuratorFrameworkImpl
> (CuratorFrameworkImpl.java:start(235)) - Starting
> {noformat}
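
The trace above shows recovery (RMAppManager.recoverApplication) reaching the priority check in CapacityScheduler.checkAndGetApplicationPriority, so a single denied application appears to abort the whole recovery. Below is a minimal Java sketch of that failure mode, together with one possible mitigation: skipping the ACL check for applications that were already accepted before the ACL changed. Every name in it (RecoverySketch, StoredApp, AclDeniedException, admit, recover) is a hypothetical stand-in for illustration, not Hadoop code, and the mitigation is an assumption rather than the patch for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecoverySketch {

    /** Hypothetical stand-in for the AccessControlException in the log. */
    static class AclDeniedException extends Exception {
        AclDeniedException(String message) {
            super(message);
        }
    }

    /** Hypothetical record of an application kept in the state store. */
    static final class StoredApp {
        final String id;
        final String user;

        StoredApp(String id, String user) {
            this.id = id;
            this.user = user;
        }
    }

    /** Rejects the app unless its user is in the ACL or the check is skipped. */
    static void admit(StoredApp app, Set<String> aclUsers, boolean skipAclCheck)
            throws AclDeniedException {
        if (skipAclCheck) {
            return; // assumption: previously accepted apps are not re-validated
        }
        if (!aclUsers.contains(app.user)) {
            throw new AclDeniedException("User " + app.user
                    + " does not have permission to submit/update " + app.id);
        }
    }

    /**
     * Replays stored applications. With skipAclOnRecovery == false the first
     * denied application propagates its exception and nothing is recovered,
     * which mirrors "Successfully recovered 0 out of 51 applications".
     */
    static List<String> recover(List<StoredApp> apps, Set<String> aclUsers,
                                boolean skipAclOnRecovery) throws AclDeniedException {
        List<String> recovered = new ArrayList<>();
        for (StoredApp app : apps) {
            admit(app, aclUsers, skipAclOnRecovery);
            recovered.add(app.id);
        }
        return recovered;
    }

    public static void main(String[] args) throws AclDeniedException {
        // ACL added after 'user_a' had already submitted an application.
        Set<String> acl = new HashSet<>(Arrays.asList("user_b"));
        List<StoredApp> apps = Arrays.asList(
                new StoredApp("app_0001", "user_a"),
                new StoredApp("app_0002", "user_b"));

        try {
            recover(apps, acl, false);
        } catch (AclDeniedException e) {
            System.out.println("Recovery aborted: " + e.getMessage());
        }
        System.out.println("Recovered: " + recover(apps, acl, true));
    }
}
```

With an ACL that contains only 'user_b', recover(apps, acl, false) fails on the first application owned by 'user_a', while recover(apps, acl, true) returns both application ids. New submissions would still call admit with skipAclCheck set to false, so the rejection in step 4 is unaffected.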