[ 
https://issues.apache.org/jira/browse/YARN-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802688#comment-17802688
 ] 

Shilun Fan commented on YARN-7884:
----------------------------------

Bulk update: all 3.4.0 non-blocker issues have been retargeted to 3.5.0. If 
this issue is a blocker, please move it back.

> Race condition in registering YARN service in ZooKeeper
> -------------------------------------------------------
>
>                 Key: YARN-7884
>                 URL: https://issues.apache.org/jira/browse/YARN-7884
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Priority: Major
>
> In a Kerberos-enabled cluster, there seems to be a race condition when 
> registering a YARN service.
> The yarn-service znode creation seems to happen after the AM has started and 
> is reporting back to update component information.  The YARN service should 
> have access to create the znode but, for some reason, NoAuth was reported.
> {code}
> 2018-02-02 22:53:30,442 [main] INFO  service.ServiceScheduler - Set registry 
> user accounts: sasl:hbase
> 2018-02-02 22:53:30,471 [main] INFO  zk.RegistrySecurity - Registry default 
> system acls: 
> [1,s{'world,'anyone}
> , 31,s{'sasl,'yarn}
> , 31,s{'sasl,'jhs}
> , 31,s{'sasl,'hdfs-demo}
> , 31,s{'sasl,'rm}
> , 31,s{'sasl,'hive}
> ]
> 2018-02-02 22:53:30,472 [main] INFO  zk.RegistrySecurity - Registry User ACLs 
> [31,s{'sasl,'hbase}
> , 31,s{'sasl,'hbase}
> ]
> 2018-02-02 22:53:30,503 [main] INFO  event.AsyncDispatcher - Registering 
> class org.apache.hadoop.yarn.service.component.ComponentEventType for class 
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentEventHandler
> 2018-02-02 22:53:30,504 [main] INFO  event.AsyncDispatcher - Registering 
> class 
> org.apache.hadoop.yarn.service.component.instance.ComponentInstanceEventType 
> for class 
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentInstanceEventHandler
> 2018-02-02 22:53:30,528 [main] INFO  impl.NMClientAsyncImpl - Upper bound of 
> the thread pool size is 500
> 2018-02-02 22:53:30,531 [main] INFO  service.ServiceMaster - Starting service 
> as user hbase/eyang-5.openstacklo...@example.com (auth:KERBEROS)
> 2018-02-02 22:53:30,545 [main] INFO  ipc.CallQueueManager - Using callQueue: 
> class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100 scheduler: 
> class org.apache.hadoop.ipc.DefaultRpcScheduler
> 2018-02-02 22:53:30,554 [Socket Reader #1 for port 56859] INFO  ipc.Server - 
> Starting Socket Reader #1 for port 56859
> 2018-02-02 22:53:30,589 [main] INFO  pb.RpcServerFactoryPBImpl - Adding 
> protocol org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPB to 
> the server
> 2018-02-02 22:53:30,606 [IPC Server Responder] INFO  ipc.Server - IPC Server 
> Responder: starting
> 2018-02-02 22:53:30,607 [IPC Server listener on 56859] INFO  ipc.Server - IPC 
> Server listener on 56859: starting
> 2018-02-02 22:53:30,607 [main] INFO  service.ClientAMService - Instantiated 
> ClientAMService at eyang-5.openstacklocal/172.26.111.20:56859
> 2018-02-02 22:53:30,609 [main] INFO  zk.CuratorService - Creating 
> CuratorService with connection fixed ZK quorum "eyang-1.openstacklocal:2181" 
> 2018-02-02 22:53:30,615 [main] INFO  zk.RegistrySecurity - Enabling ZK sasl 
> client: jaasClientEntry = Client, principal = 
> hbase/eyang-5.openstacklo...@example.com, keytab = 
> /etc/security/keytabs/hbase.service.keytab
> 2018-02-02 22:53:30,752 [main] INFO  client.RMProxy - Connecting to 
> ResourceManager at eyang-1.openstacklocal/172.26.111.17:8032
> 2018-02-02 22:53:30,909 [main] INFO  service.ServiceScheduler - Registering 
> appattempt_1517611904996_0001_000001, abc into registry
> 2018-02-02 22:53:30,911 [main] INFO  service.ServiceScheduler - Received 0 
> containers from previous attempt.
> 2018-02-02 22:53:31,072 [main] INFO  service.ServiceScheduler - Could not 
> read component paths: `/users/hbase/services/yarn-service/abc/components': No 
> such file or directory: KeeperErrorCode = NoNode for 
> /registry/users/hbase/services/yarn-service/abc/components
> 2018-02-02 22:53:31,074 [main] INFO  service.ServiceScheduler - Triggering 
> initial evaluation of component sleeper
> 2018-02-02 22:53:31,075 [main] INFO  component.Component - [INIT COMPONENT 
> sleeper]: 2 instances.
> 2018-02-02 22:53:31,094 [main] INFO  component.Component - [COMPONENT 
> sleeper] Transitioned from INIT to FLEXING on FLEX event.
> 2018-02-02 22:53:31,215 [pool-5-thread-1] ERROR service.ServiceScheduler - 
> Failed to register app abc in registry
> org.apache.hadoop.registry.client.exceptions.NoPathPermissionsException: 
> `/registry/users/hbase/services/yarn-service/abc': Not authorized to access 
> path; ACLs: [
> 0x01: 'world,'anyone
>  0x1f: 'sasl,'yarn
>  0x1f: 'sasl,'jhs
>  0x1f: 'sasl,'hdfs-demo
>  0x1f: 'sasl,'rm
>  0x1f: 'sasl,'hive
>  0x1f: 'sasl,'hbase
>  0x1f: 'sasl,'hbase
>  ]: KeeperErrorCode = NoAuth for 
> /registry/users/hbase/services/yarn-service/abc
>       at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:412)
>       at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:637)
>       at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkSet(CuratorService.java:679)
>       at 
> org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.bind(RegistryOperationsService.java:116)
>       at 
> org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.putService(YarnRegistryViewForProviders.java:195)
>       at 
> org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.registerSelf(YarnRegistryViewForProviders.java:210)
>       at 
> org.apache.hadoop.yarn.service.ServiceScheduler$2.run(ServiceScheduler.java:462)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: 
> KeeperErrorCode = NoAuth for /registry/users/hbase/services/yarn-service/abc
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:740)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:723)
>       at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:720)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:484)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:474)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:260)
>       at 
> org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:214)
>       at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:635)
>       ... 12 more
> 2018-02-02 22:53:33,135 [AMRM Callback Handler Thread] INFO  
> service.ServiceScheduler - 2 containers allocated. 
> {code}
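
For readers unfamiliar with the ACL notation in the log above: each entry 
pairs a permission bitmask with a scheme and identity, so "31,s{'sasl,'hbase}" 
and "0x1f: 'sasl,'hbase" describe the same grant. The sketch below is only an 
illustration, not code from Hadoop or ZooKeeper; it assumes ZooKeeper's 
standard permission bit values (READ=1, WRITE=2, CREATE=4, DELETE=8, 
ADMIN=16), and the helper names decode_perms and can_create are hypothetical.

```python
# ZooKeeper permission bits (assumed standard values from ZooDefs.Perms).
PERMS = {
    "READ": 1,
    "WRITE": 2,
    "CREATE": 4,
    "DELETE": 8,
    "ADMIN": 16,
}


def decode_perms(mask):
    """Return the names of the permission bits set in mask."""
    return [name for name, bit in PERMS.items() if mask & bit]


def can_create(acls, scheme, identity):
    """True if any ACL entry for (scheme, identity) grants CREATE."""
    return any(
        s == scheme and i == identity and mask & PERMS["CREATE"]
        for mask, s, i in acls
    )


# ACL entries copied from the NoPathPermissionsException in the log above.
acls = [
    (0x01, "world", "anyone"),
    (0x1F, "sasl", "yarn"),
    (0x1F, "sasl", "jhs"),
    (0x1F, "sasl", "hdfs-demo"),
    (0x1F, "sasl", "rm"),
    (0x1F, "sasl", "hive"),
    (0x1F, "sasl", "hbase"),
    (0x1F, "sasl", "hbase"),
]

print(decode_perms(0x01))  # ['READ']
print(decode_perms(0x1F))  # ['READ', 'WRITE', 'CREATE', 'DELETE', 'ADMIN']
print(can_create(acls, "sasl", "hbase"))  # True
```

Read this way, the listed entries grant sasl:hbase every permission, so they 
do not by themselves explain the NoAuth. ZooKeeper checks CREATE against the 
parent znode's ACL and the session's authenticated identity, and this log 
shows neither.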



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
