[
https://issues.apache.org/jira/browse/YARN-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Keqiu Hu updated YARN-8983:
---------------------------
Description:
I'm experimenting with Hadoop 2.9.1 to launch applications in Docker
containers. Inside the container task, we try to get the hostname of the
container using
{code:java}
InetAddress.getLocalHost().getHostName(){code}
This works fine with LXC; however, it throws the following exception when I
enable the Docker container runtime using:
{code:java}
YARN_CONTAINER_RUNTIME_TYPE=docker
YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4
{code}
The exception:
{noformat}
java.net.UnknownHostException: ctr-1541488751855-0023-01-000003: ctr-1541488751855-0023-01-000003: Temporary failure in name resolution
    at java.net.InetAddress.getLocalHost(InetAddress.java:1506)
    at com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204)
    at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109)
Caused by: java.net.UnknownHostException: ctr-1541488751855-0023-01-000003: Temporary failure in name resolution
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1501)
    ... 2 more
{noformat}
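As an application-side stopgap (not a fix for the underlying problem), the lookup could be wrapped so that a failed resolution does not abort the task. The following is only a minimal sketch: the class name SafeHostname and the helper firstNonEmpty are hypothetical, and it assumes the HOSTNAME environment variable is set inside the container, which is not guaranteed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop or TonY.
class SafeHostname {

    /** Returns the first non-blank candidate, or "localhost" if there is none. */
    static String firstNonEmpty(String... candidates) {
        for (String c : candidates) {
            if (c != null && !c.trim().isEmpty()) {
                return c.trim();
            }
        }
        return "localhost";
    }

    /** Tries the normal lookup first, then falls back instead of throwing. */
    static String resolve() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Assumption: HOSTNAME may be exported in the container environment.
            return firstNonEmpty(System.getenv("HOSTNAME"));
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve());
    }
}
```

This only hides the symptom: a name obtained from the fallback may still not be resolvable by other hosts.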
After some research online, this seems to be caused by a missing /etc/hosts
entry for the container's hostname. I took a look at /etc/hosts inside the
container, and the entry is indeed missing:
{noformat}
pi@pi-aw:~/docker/$ docker ps
CONTAINER ID  IMAGE  COMMAND                   CREATED       STATUS                 PORTS  NAMES
71e3e9df8bc6  test4  "/entrypoint.sh bash..."  1 second ago  Up Less than a second         container_1541488751855_0028_01_000001
29d31f0327d1  test3  "/entrypoint.sh bash"     18 hours ago  Up 18 hours                   blissful_turing
pi@pi-aw:~/docker/$ de 71e3e9df8bc6
groups: cannot find name for group ID 1000
groups: cannot find name for group ID 116
groups: cannot find name for group ID 126
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
pi@ctr-1541488751855-0028-01-000001:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_000001$ cat /etc/hosts
127.0.0.1 localhost
192.168.0.14 pi-aw
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
pi@ctr-1541488751855-0028-01-000001:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_000001$
{noformat}
If I launch the image without YARN, I do see the entry in /etc/hosts:
{noformat}
pi@61f173f95631:~$ cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.3 61f173f95631
{noformat}
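The difference between the two /etc/hosts listings above can also be checked programmatically. The sketch below is illustrative only: the class HostsFileCheck and its method hasEntry are hypothetical names, and the parsing assumes the usual hosts-file layout of an address followed by one or more names, with # starting a comment.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical diagnostic helper, not part of Hadoop.
class HostsFileCheck {

    /** Returns true if any non-comment line maps an address to the given hostname. */
    static boolean hasEntry(List<String> lines, String hostname) {
        for (String line : lines) {
            int hash = line.indexOf('#');
            String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (body.isEmpty()) {
                continue; // blank line or comment-only line
            }
            String[] fields = body.split("\\s+");
            // fields[0] is the address; the remaining fields are names and aliases.
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        List<String> lines = Files.readAllLines(Paths.get("/etc/hosts"));
        System.out.println(host + " in /etc/hosts: " + hasEntry(lines, host));
    }
}
```

Run inside the container with the container hostname as the argument, it would report whether the entry that getLocalHost() depends on is present.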
Here is my container-executor.cfg
{code:java}
min.user.id=100
yarn.nodemanager.linux-container-executor.group=hadoop
[docker]
module.enabled=true
docker.binary=/usr/bin/docker
docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE
docker.allowed.networks=bridge,host,none
docker.allowed.rw-mounts=/tmp,/etc/hadoop/logs/,/private/etc/hadoop-2.9.1/logs/
{code}
Since I'm using an older version of Hadoop (2.9.1), let me know if this is
already fixed in a later version :)
> YARN container with docker: hostname entry not in /etc/hosts
> ------------------------------------------------------------
>
> Key: YARN-8983
> URL: https://issues.apache.org/jira/browse/YARN-8983
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.1
> Reporter: Keqiu Hu
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)