[
https://issues.apache.org/jira/browse/YARN-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762111#comment-17762111
]
zhangjunj edited comment on YARN-9693 at 9/5/23 3:56 PM:
---------------------------------------------------------
Our company has deployed YARN Federation in production. For details, please
see: [https://zhuanlan.zhihu.com/p/620701241]
We hit this problem as well. We did not apply a patch; we fixed it by changing
the configuration directly.
For Hive, set the following in the Hadoop yarn-site.xml that Hive depends on:
<property>
<name>yarn.resourcemanager.address</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8049</value>
</property>
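A quick way to sanity-check a client-side yarn-site.xml like the one above is to parse it and confirm where the two addresses point. The following is a minimal Python sketch, not part of the original comment. The function names `load_properties` and `check_federation_client` are hypothetical, and it assumes the AMRMProxy listens on port 8049 of the local NodeManager, as the `localhost:8049` value above suggests.

```python
import xml.etree.ElementTree as ET

# Fragment mirroring the two Hive-side properties above, wrapped in a
# <configuration> root so it parses as a complete document.
YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>yarn-router-hostname:8050</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8049</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Return a {name: value} dict from a Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_federation_client(props, amrmproxy_port=8049):
    """Return a list of human-readable problems (empty if none found).

    Assumption: the AM must reach the AMRMProxy on its own NodeManager,
    so the scheduler address should be a loopback host on amrmproxy_port.
    """
    problems = []
    sched = props.get("yarn.resourcemanager.scheduler.address", "")
    host, _, port = sched.rpartition(":")
    if host not in ("localhost", "127.0.0.1") or port != str(amrmproxy_port):
        problems.append(
            "scheduler address %r does not point at the local AMRMProxy" % sched
        )
    if "yarn.resourcemanager.address" not in props:
        problems.append("yarn.resourcemanager.address (Router) is not set")
    return problems


props = load_properties(YARN_SITE)
problems = check_federation_client(props)
print(problems or "configuration looks consistent")
```

If the scheduler address pointed at a ResourceManager host instead of the local proxy, the check would report it, which is one plausible way to end up with the `Invalid AMRMToken` failure described in this issue.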
For Spark, set the following in the Hadoop yarn-site.xml that Spark depends on:
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>r1,r2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.r1</name>
<value>yarn-router-hostname</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.r2</name>
<value>yarn-router-hostname</value>
</property>
<property>
<name>yarn.resourcemanager.address.r1</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.r1</name>
<value>localhost:8049</value>
</property>
<property>
<name>yarn.resourcemanager.address.r2</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.r2</name>
<value>localhost:8049</value>
</property>
</configuration>
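Because every rm-id in the block above repeats the same three values, the property set can be generated rather than typed by hand. The sketch below is illustrative only. `router_ha_properties` and `to_yarn_site_xml` are hypothetical helper names, and the defaults (Router client port 8050, AMRMProxy at `localhost:8049`) are taken from the values in this comment rather than from any particular cluster.

```python
from xml.sax.saxutils import escape


def router_ha_properties(router_host, rm_ids=("r1", "r2"),
                         client_port=8050, amrmproxy_addr="localhost:8049"):
    """Build HA-style client properties in which every rm-id points at the
    Router for client calls and at the local AMRMProxy for scheduler calls."""
    props = {
        "yarn.resourcemanager.ha.enabled": "true",
        "yarn.resourcemanager.ha.rm-ids": ",".join(rm_ids),
    }
    for rm_id in rm_ids:
        props["yarn.resourcemanager.hostname.%s" % rm_id] = router_host
        props["yarn.resourcemanager.address.%s" % rm_id] = (
            "%s:%d" % (router_host, client_port)
        )
        props["yarn.resourcemanager.scheduler.address.%s" % rm_id] = amrmproxy_addr
    return props


def to_yarn_site_xml(props):
    """Render the properties as a Hadoop <configuration> document."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("<property>")
        lines.append("<name>%s</name>" % escape(name))
        lines.append("<value>%s</value>" % escape(value))
        lines.append("</property>")
    lines.append("</configuration>")
    return "\n".join(lines)


props = router_ha_properties("yarn-router-hostname")
xml_text = to_yarn_site_xml(props)
print(xml_text)
```

The generated document contains the same eight properties as the hand-written one, though not necessarily in the same order.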
In addition, add the following parameters to spark-defaults.conf:
spark.hadoop.yarn.federation.enabled true
spark.hadoop.yarn.resourcemanager.ha.rm-ids r1,r2
spark.hadoop.yarn.resourcemanager.hostname.r1 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.hostname.r2 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.scheduler.address.r1 localhost:8049
spark.hadoop.yarn.resourcemanager.scheduler.address.r2 localhost:8049
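Spark copies any property prefixed with `spark.hadoop.` into the Hadoop configuration with the prefix removed, which is why these entries end up as ordinary `yarn.*` settings. The sketch below imitates that mapping in Python for illustration. It is not Spark's parser: `hadoop_overrides` is a hypothetical name, it only handles whitespace-separated `key value` lines, and rejecting a key with no value on the same line is a choice made here to catch line-wrapping mistakes.

```python
SPARK_DEFAULTS = """\
spark.hadoop.yarn.federation.enabled true
spark.hadoop.yarn.resourcemanager.ha.rm-ids r1,r2
spark.hadoop.yarn.resourcemanager.hostname.r1 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.hostname.r2 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.scheduler.address.r1 localhost:8049
spark.hadoop.yarn.resourcemanager.scheduler.address.r2 localhost:8049
"""

PREFIX = "spark.hadoop."


def hadoop_overrides(conf_text):
    """Return the Hadoop properties implied by spark.hadoop.* entries."""
    overrides = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            # A key and its value must sit on the same line.
            raise ValueError("key without value: %r" % line)
        key, value = parts
        if key.startswith(PREFIX):
            overrides[key[len(PREFIX):]] = value.strip()
    return overrides


overrides = hadoop_overrides(SPARK_DEFAULTS)
for name in sorted(overrides):
    print(name, "=", overrides[name])
```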
The same applies to Flink.
If you have other questions, you can comment under
[https://zhuanlan.zhihu.com/p/620701241], and I will reply in time.
was (Author: zhangjunj):
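Our company has already deployed YARN Federation in production. For details, see: [https://zhuanlan.zhihu.com/p/620701241]
We also ran into this issue. We did not apply a patch; it turned out to be a configuration problem.
When using Hive, in the Hadoop yarn-site.xml that Hive depends on: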
<property>
<name>yarn.resourcemanager.address</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8049</value>
</property>
When using Spark, Spark needs the following yarn-site.xml configuration:
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>r1,r2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.r1</name>
<value>yarn-router-hostname</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.r2</name>
<value>yarn-router-hostname</value>
</property>
<property>
<name>yarn.resourcemanager.address.r1</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.r1</name>
<value>localhost:8049</value>
</property>
<property>
<name>yarn.resourcemanager.address.r2</name>
<value>yarn-router-hostname:8050</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.r2</name>
<value>localhost:8049</value>
</property>
</configuration>
Spark also needs the following added to its configuration:
spark.hadoop.yarn.federation.enabled true
spark.hadoop.yarn.resourcemanager.ha.rm-ids r1,r2
spark.hadoop.yarn.resourcemanager.hostname.r1 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.hostname.r2 yarn-router-hostname
spark.hadoop.yarn.resourcemanager.scheduler.address.r1 localhost:8049
spark.hadoop.yarn.resourcemanager.scheduler.address.r2 localhost:8049
The same applies to Flink.
If you have any other questions, you can comment at [https://zhuanlan.zhihu.com/p/620701241]; I will reply when I see it.
> When AMRMProxyService is enabled RMCommunicator will register with failure
> --------------------------------------------------------------------------
>
> Key: YARN-9693
> URL: https://issues.apache.org/jira/browse/YARN-9693
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: federation
> Affects Versions: 3.1.2
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Attachments: YARN-9693.001.patch
>
>
> When we enable the AMRMProxy service, RMCommunicator fails to register with
> the error below:
> {code:java}
> 2019-07-23 17:12:44,794 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
> 2019-07-23 17:12:44,794 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:186)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:123)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:986)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1300)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1768)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1716)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1764)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1698)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
> at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy93.registerApplicationMaster(Unknown Source)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:170)
> ... 14 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1541)
> at org.apache.hadoop.ipc.Client.call(Client.java:1487)
> at org.apache.hadoop.ipc.Client.call(Client.java:1397)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy92.registerApplicationMaster(Unknown Source)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
> {code}
> We configured the NodeManager (NM) with the settings below:
> {code:java}
> yarn.nodemanager.amrmproxy.enabled true
> yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)