Evgeny Makarov created YARN-7319:
------------------------------------
Summary: java.net.UnknownHostException when trying contact node by hostname
Key: YARN-7319
URL: https://issues.apache.org/jira/browse/YARN-7319
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Evgeny Makarov
I'm trying to set up Hadoop on a Kubernetes cluster with the following layout:
The Hadoop master is a k8s pod
Each Hadoop slave is an additional k8s pod
All communication is IP-based. In HDFS I have
dfs.namenode.datanode.registration.ip-hostname-check set to false and everything
works fine; however, the same option is missing for the YARN manager.
Here is part of the hadoop-master log from an attempt to submit a simple
word-count job:
2017-10-12 09:00:25,005 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Error trying to assign container token and NM token to an allocated container container_1507798393049_0001_01_000001
java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop-slave-743067341-hqrbk
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:258)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:220)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:454)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:269)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:988)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:971)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:964)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:789)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:105)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:795)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:776)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: hadoop-slave-743067341-hqrbk
    ... 19 more
As can be seen, the host hadoop-slave-743067341-hqrbk cannot be resolved from
the master. Adding a record to /etc/hosts on the master would solve the problem,
but that is not an option in a Kubernetes environment. There should be a way to
address nodes by IP address.
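
To check name resolution from inside the master pod independently of Hadoop, something like the following could be run there. It is only a rough sketch using plain java.net classes, not Hadoop code; the class name ResolveCheck and the port 8041 are made up for illustration. It relies on the fact that an InetSocketAddress built from a hostname is flagged as unresolved when the lookup fails, while an IP literal needs no lookup at all.

```java
import java.net.InetSocketAddress;

public class ResolveCheck {
    // Returns true if the host part could be turned into an IP address.
    static boolean isResolvable(String host, int port) {
        // The constructor attempts resolution; on failure the address
        // is kept but flagged as unresolved instead of throwing.
        InetSocketAddress addr = new InetSocketAddress(host, port);
        return !addr.isUnresolved();
    }

    public static void main(String[] args) {
        // Default to the pod name from the log; pass another name or an IP as args[0].
        String host = args.length > 0 ? args[0] : "hadoop-slave-743067341-hqrbk";
        int port = 8041; // arbitrary port, chosen only for this illustration
        System.out.println(host + " resolvable: " + isResolvable(host, port));
    }
}
```

Running it once with the slave pod name and once with the pod IP should show whether the master can resolve the name at all, which is presumably the condition behind the UnknownHostException above.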