[ https://issues.apache.org/jira/browse/YARN-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351175#comment-16351175 ]
Eric Yang commented on YARN-7884:
---------------------------------
ServiceScheduler sets znode /registry/users/hbase/services/yarn-service to
{code:java}
'world,'anyone
: r
'sasl,'yarn
: cdrwa
'sasl,'rm
: cdrwa
'sasl,'hbase
: cdrwa{code}
For some reason, the world:anyone:r permission is injected into the
yarn-service node, and this prevents the child node from being written.
In theory, evaluating the sasl:rm or sasl:hbase entry should allow the child
node to be written, but this is not happening.
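
For reference, here is a minimal sketch of the evaluation I would expect for the ACL listed above. It is only a model, not ZooKeeper's server code: the AclSketch class, the Entry type and the isAllowed helper are hypothetical names made up for illustration. The permission bit values are the ones ZooKeeper defines (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16), so 31 (0x1f) is cdrwa and 1 (0x01) is r. The sketch assumes a request is granted as soon as any one entry matches the caller and carries the requested bit, and that a sasl entry matches by exact id.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of ZooKeeper-style ACL evaluation; not the ZooKeeper server code. */
public class AclSketch {
    // Permission bits with the values ZooKeeper defines for them.
    static final int READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16;

    /** One ACL entry: permission bitmask plus scheme and id (hypothetical helper type). */
    static final class Entry {
        final int perms;
        final String scheme;
        final String id;

        Entry(int perms, String scheme, String id) {
            this.perms = perms;
            this.scheme = scheme;
            this.id = id;
        }
    }

    /** Renders a bitmask using the letter order of the listing above (cdrwa). */
    static String permsToString(int perms) {
        StringBuilder sb = new StringBuilder();
        if ((perms & CREATE) != 0) sb.append('c');
        if ((perms & DELETE) != 0) sb.append('d');
        if ((perms & READ) != 0) sb.append('r');
        if ((perms & WRITE) != 0) sb.append('w');
        if ((perms & ADMIN) != 0) sb.append('a');
        return sb.toString();
    }

    /**
     * Assumed rule: the request is allowed if ANY entry both matches the
     * caller (world:anyone matches everyone, otherwise scheme and id must be
     * equal) and includes the requested permission bit.
     */
    static boolean isAllowed(List<Entry> acl, String scheme, String id, int perm) {
        for (Entry e : acl) {
            boolean matches = (e.scheme.equals("world") && e.id.equals("anyone"))
                    || (e.scheme.equals(scheme) && e.id.equals(id));
            if (matches && (e.perms & perm) != 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The ACL shown above for the yarn-service znode.
        List<Entry> acl = Arrays.asList(
                new Entry(READ, "world", "anyone"),
                new Entry(31, "sasl", "yarn"),
                new Entry(31, "sasl", "rm"),
                new Entry(31, "sasl", "hbase"));

        System.out.println(permsToString(31));                          // cdrwa
        System.out.println(isAllowed(acl, "sasl", "hbase", CREATE));    // true
        System.out.println(isAllowed(acl, "sasl", "nobody", CREATE));   // false
        System.out.println(isAllowed(acl, "sasl", "nobody", READ));     // true
    }
}
```

Under this model the world:anyone:r entry only adds read access for everyone and cannot take CREATE away from sasl:hbase, which is why the NoAuth result is surprising. The "nobody" id is made up and only shows that a caller whose id does not equal an ACL id falls back to the world entry; it is not a diagnosis of this issue.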
> Race condition in registering YARN service in ZooKeeper
> -------------------------------------------------------
>
> Key: YARN-7884
> URL: https://issues.apache.org/jira/browse/YARN-7884
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Priority: Major
>
> In a Kerberos-enabled cluster, there seems to be a race condition when
> registering a YARN service.
> Creation of the yarn-service znode seems to happen after the AM has started
> and is reporting back to update component information. The service should
> have permission to create the znode, but for some reason NoAuth is reported.
> {code}
> 2018-02-02 22:53:30,442 [main] INFO service.ServiceScheduler - Set registry
> user accounts: sasl:hbase
> 2018-02-02 22:53:30,471 [main] INFO zk.RegistrySecurity - Registry default
> system acls:
> [1,s{'world,'anyone}
> , 31,s{'sasl,'yarn}
> , 31,s{'sasl,'jhs}
> , 31,s{'sasl,'hdfs-demo}
> , 31,s{'sasl,'rm}
> , 31,s{'sasl,'hive}
> ]
> 2018-02-02 22:53:30,472 [main] INFO zk.RegistrySecurity - Registry User ACLs
> [31,s{'sasl,'hbase}
> , 31,s{'sasl,'hbase}
> ]
> 2018-02-02 22:53:30,503 [main] INFO event.AsyncDispatcher - Registering
> class org.apache.hadoop.yarn.service.component.ComponentEventType for class
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentEventHandler
> 2018-02-02 22:53:30,504 [main] INFO event.AsyncDispatcher - Registering
> class
> org.apache.hadoop.yarn.service.component.instance.ComponentInstanceEventType
> for class
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentInstanceEventHandler
> 2018-02-02 22:53:30,528 [main] INFO impl.NMClientAsyncImpl - Upper bound of
> the thread pool size is 500
> 2018-02-02 22:53:30,531 [main] INFO service.ServiceMaster - Starting service
> as user hbase/[email protected] (auth:KERBEROS)
> 2018-02-02 22:53:30,545 [main] INFO ipc.CallQueueManager - Using callQueue:
> class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100 scheduler:
> class org.apache.hadoop.ipc.DefaultRpcScheduler
> 2018-02-02 22:53:30,554 [Socket Reader #1 for port 56859] INFO ipc.Server -
> Starting Socket Reader #1 for port 56859
> 2018-02-02 22:53:30,589 [main] INFO pb.RpcServerFactoryPBImpl - Adding
> protocol org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPB to
> the server
> 2018-02-02 22:53:30,606 [IPC Server Responder] INFO ipc.Server - IPC Server
> Responder: starting
> 2018-02-02 22:53:30,607 [IPC Server listener on 56859] INFO ipc.Server - IPC
> Server listener on 56859: starting
> 2018-02-02 22:53:30,607 [main] INFO service.ClientAMService - Instantiated
> ClientAMService at eyang-5.openstacklocal/172.26.111.20:56859
> 2018-02-02 22:53:30,609 [main] INFO zk.CuratorService - Creating
> CuratorService with connection fixed ZK quorum "eyang-1.openstacklocal:2181"
> 2018-02-02 22:53:30,615 [main] INFO zk.RegistrySecurity - Enabling ZK sasl
> client: jaasClientEntry = Client, principal =
> hbase/[email protected], keytab =
> /etc/security/keytabs/hbase.service.keytab
> 2018-02-02 22:53:30,752 [main] INFO client.RMProxy - Connecting to
> ResourceManager at eyang-1.openstacklocal/172.26.111.17:8032
> 2018-02-02 22:53:30,909 [main] INFO service.ServiceScheduler - Registering
> appattempt_1517611904996_0001_000001, abc into registry
> 2018-02-02 22:53:30,911 [main] INFO service.ServiceScheduler - Received 0
> containers from previous attempt.
> 2018-02-02 22:53:31,072 [main] INFO service.ServiceScheduler - Could not
> read component paths: `/users/hbase/services/yarn-service/abc/components': No
> such file or directory: KeeperErrorCode = NoNode for
> /registry/users/hbase/services/yarn-service/abc/components
> 2018-02-02 22:53:31,074 [main] INFO service.ServiceScheduler - Triggering
> initial evaluation of component sleeper
> 2018-02-02 22:53:31,075 [main] INFO component.Component - [INIT COMPONENT
> sleeper]: 2 instances.
> 2018-02-02 22:53:31,094 [main] INFO component.Component - [COMPONENT
> sleeper] Transitioned from INIT to FLEXING on FLEX event.
> 2018-02-02 22:53:31,215 [pool-5-thread-1] ERROR service.ServiceScheduler -
> Failed to register app abc in registry
> org.apache.hadoop.registry.client.exceptions.NoPathPermissionsException:
> `/registry/users/hbase/services/yarn-service/abc': Not authorized to access
> path; ACLs: [
> 0x01: 'world,'anyone
> 0x1f: 'sasl,'yarn
> 0x1f: 'sasl,'jhs
> 0x1f: 'sasl,'hdfs-demo
> 0x1f: 'sasl,'rm
> 0x1f: 'sasl,'hive
> 0x1f: 'sasl,'hbase
> 0x1f: 'sasl,'hbase
> ]: KeeperErrorCode = NoAuth for
> /registry/users/hbase/services/yarn-service/abc
> at
> org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:412)
> at
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:637)
> at
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkSet(CuratorService.java:679)
> at
> org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.bind(RegistryOperationsService.java:116)
> at
> org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.putService(YarnRegistryViewForProviders.java:195)
> at
> org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.registerSelf(YarnRegistryViewForProviders.java:210)
> at
> org.apache.hadoop.yarn.service.ServiceScheduler$2.run(ServiceScheduler.java:462)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.zookeeper.KeeperException$NoAuthException:
> KeeperErrorCode = NoAuth for /registry/users/hbase/services/yarn-service/abc
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:740)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:723)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:720)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:484)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:474)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:260)
> at
> org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:214)
> at
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:635)
> ... 12 more
> 2018-02-02 22:53:33,135 [AMRM Callback Handler Thread] INFO
> service.ServiceScheduler - 2 containers allocated.
> {code}