[ https://issues.apache.org/jira/browse/YARN-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863058#comment-16863058 ]
wangxiangchun commented on YARN-4042:
-------------------------------------
Hi, may I ask how you solved this problem? I ran into the same issue. I
followed the answer and deleted the version-2 file in zkdata, but that did not
solve the problem. Could you share your experience?
> YARN registry should handle the absence of ZK node
> --------------------------------------------------
>
> Key: YARN-4042
> URL: https://issues.apache.org/jira/browse/YARN-4042
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Major
>
> {noformat}
> 2015-08-10 11:33:46,931 WARN [LlapSchedulerNodeEnabler] rm.LlapTaskSchedulerService: Could not refresh list of active instances
> org.apache.hadoop.fs.PathNotFoundException: `/registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-0000000025': No such file or directory: KeeperErrorCode = NoNode for /registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-0000000025
> at org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:377)
> at org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:360)
> at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:720)
> at org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.resolve(RegistryOperationsService.java:120)
> at org.apache.hadoop.registry.client.binding.RegistryUtils.extractServiceRecords(RegistryUtils.java:321)
> at org.apache.hadoop.registry.client.binding.RegistryUtils.listServiceRecords(RegistryUtils.java:177)
> at org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl$DynamicServiceInstanceSet.refresh(LlapYarnRegistryImpl.java:278)
> at org.apache.tez.dag.app.rm.LlapTaskSchedulerService.refreshInstances(LlapTaskSchedulerService.java:584)
> at org.apache.tez.dag.app.rm.LlapTaskSchedulerService.access$900(LlapTaskSchedulerService.java:79)
> at org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:887)
> at org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:855)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-0000000025
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
> at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
> at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
> at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
> at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:718)
> ... 12 more
> {noformat}
> ZK nodes can disappear after they are listed; for example, an ephemeral node
> can be cleaned up. The YARN registry should handle that.
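
To illustrate the race described in the issue, here is a minimal sketch of a
listing loop that tolerates a child vanishing between the list call and the
read. It is not the Hadoop registry code and not the fix that was applied for
this issue. Registry, InMemoryRegistry and listRecords are hypothetical names,
and NoSuchElementException merely stands in for the PathNotFoundException
shown in the trace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

public class TolerantListing {

    /** Hypothetical stand-in for a registry client; NOT the Hadoop registry API. */
    interface Registry {
        List<String> list(String parent);   // child names under parent
        String resolve(String path);        // throws NoSuchElementException if the node is gone
    }

    /** In-memory registry whose listing may be stale relative to the stored nodes. */
    static class InMemoryRegistry implements Registry {
        final Map<String, String> nodes = new LinkedHashMap<>(); // path -> record
        final List<String> listed = new ArrayList<>();           // possibly stale child names

        @Override
        public List<String> list(String parent) {
            return new ArrayList<>(listed);
        }

        @Override
        public String resolve(String path) {
            String data = nodes.get(path);
            if (data == null) {
                throw new NoSuchElementException("NoNode for " + path);
            }
            return data;
        }
    }

    /** Resolves every listed child, skipping children that vanished after the listing. */
    static Map<String, String> listRecords(Registry registry, String parent) {
        Map<String, String> records = new LinkedHashMap<>();
        for (String child : registry.list(parent)) {
            String path = parent + "/" + child;
            try {
                records.put(path, registry.resolve(path));
            } catch (NoSuchElementException e) {
                // The node was removed between list() and resolve(),
                // e.g. an ephemeral node was cleaned up. Skip it.
            }
        }
        return records;
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        // The listing still names two workers, but only one node still exists.
        registry.listed.add("worker-0000000024");
        registry.listed.add("worker-0000000025");
        registry.nodes.put("/registry/workers/worker-0000000024", "record-24");

        Map<String, String> records = listRecords(registry, "/registry/workers");
        System.out.println(records.keySet());
    }
}
```

In this sketch the caller gets the records that could still be read instead of
a failure for the whole refresh. Whether silently skipping is appropriate, or
whether the miss should be logged, is a design choice left open here.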