Edward Capriolo created YARN-11914:
--------------------------------------
Summary: Unknown host exceptions not well handled
Key: YARN-11914
URL: https://issues.apache.org/jira/browse/YARN-11914
Project: Hadoop YARN
Issue Type: Bug
Reporter: Edward Capriolo
{code:java}
025-12-31 16:58:33,844 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.IllegalArgumentException: java.net.UnknownHostException: rm2
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:479)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getTokenService(ClientRMProxy.java:178)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getAMRMTokenService(ClientRMProxy.java:163)
    at org.apache.hadoop.yarn.client.ClientRMProxy.setAMRMTokenService(ClientRMProxy.java:105)
    at org.apache.hadoop.yarn.client.ClientRMProxy.getRMAddress(ClientRMProxy.java:124)
    at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxyInternal(ConfiguredRMFailoverProxyProvider.java:79)
    at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxy(ConfiguredRMFailoverProxyProvider.java:93)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$ProxyDescriptor.<init>(RetryInvocationHandler.java:202)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:335)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:329)
    at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:61)
    at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:194)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:116)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:74)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.createSchedulerProxy(RMCommunicator.java:312)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:118)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:195)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:978)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:195)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:123)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1292)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:195)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1768)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1764)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1685)
Caused by: java.net.UnknownHostException: rm2
    ... 29 more
{code}
Not many Java systems handle UnknownHostException well. They assume that DNS is
stable, which for "raw iron" (bare metal) is usually correct. However, with
Docker/k8s, host names vanish soon after the host does. In the case above, rm2
failing to resolve prevents YARN clients from starting, even though RM1 is up.
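
To illustrate the behaviour I would expect, here is a minimal standalone sketch, not a patch against ClientRMProxy or SecurityUtil. It treats an unresolvable RM host as "skip this one" rather than as a fatal error, and only fails when no configured RM resolves at all. The class TolerantRmResolver, the HostResolver interface, the resolvable method, and the host names in main are all hypothetical names made up for this example; none of them exist in Hadoop.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names are Hadoop APIs.
public class TolerantRmResolver {

    /** Lookup hook so that DNS can be replaced with a fake in tests. */
    public interface HostResolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Returns the ids of the RMs whose host names resolve, in configuration
     * order. Unresolvable hosts are reported and skipped instead of aborting.
     */
    public static List<String> resolvable(Map<String, String> rmHosts,
                                          HostResolver resolver) {
        List<String> usable = new ArrayList<>();
        for (Map.Entry<String, String> rm : rmHosts.entrySet()) {
            try {
                resolver.resolve(rm.getValue());
                usable.add(rm.getKey());
            } catch (UnknownHostException e) {
                // One missing host should not take the whole client down.
                System.err.println("Skipping " + rm.getKey()
                        + ": cannot resolve " + rm.getValue());
            }
        }
        if (usable.isEmpty()) {
            // Only give up when there is nothing left to talk to.
            throw new IllegalStateException(
                    "No ResourceManager host could be resolved");
        }
        return usable;
    }

    public static void main(String[] args) {
        // Assumed example values; ".invalid" is a reserved TLD (RFC 2606)
        // that should not resolve.
        Map<String, String> rms = new LinkedHashMap<>();
        rms.put("rm1", "localhost");
        rms.put("rm2", "rm2.invalid");
        System.out.println(resolvable(rms, InetAddress::getByName));
    }
}
```

In a real fix the skipped RM would presumably need to be retried later, since in k8s the name may come back once the pod is rescheduled. The sketch leaves that out.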