[ 
https://issues.apache.org/jira/browse/YARN-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680660#comment-16680660
 ] 

Eric Yang commented on YARN-8983:
---------------------------------

[~oliverhuh...@gmail.com] A Docker overlay network can work without swarm. RPC 
calls work as you indicated in the first diagram. The setup instructions are 
at this link. It is a straightforward process.

> YARN container with docker: hostname entry not in /etc/hosts
> ------------------------------------------------------------
>
>                 Key: YARN-8983
>                 URL: https://issues.apache.org/jira/browse/YARN-8983
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.1
>            Reporter: Keqiu Hu
>            Priority: Critical
>              Labels: Docker
>
> I'm experimenting with Hadoop 2.9.1 to launch applications in Docker 
> containers. Inside the container task, we try to get the hostname of the 
> container using
> {code:java}
> InetAddress.getLocalHost().getHostName(){code}
> This works fine with LXC; however, it throws the following exception when I 
> enable Docker containers using: 
> {code:java}
> YARN_CONTAINER_RUNTIME_TYPE=docker 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4
> {code}
> The exception:
>  
> {noformat}
> java.net.UnknownHostException: ctr-1541488751855-0023-01-000003: ctr-1541488751855-0023-01-000003: Temporary failure in name resolution
>  at java.net.InetAddress.getLocalHost(InetAddress.java:1506)
>  at com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204)
>  at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109)
> Caused by: java.net.UnknownHostException: ctr-1541488751855-0023-01-000003: Temporary failure in name resolution
>  at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>  at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>  at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>  at java.net.InetAddress.getLocalHost(InetAddress.java:1501)
>  ... 2 more
> {noformat}
>  
> After some research online, this seems to be related to a missing entry for 
> the hostname in /etc/hosts. I took a look at /etc/hosts in the container, and 
> the entry is indeed missing: 
> {noformat}
> pi@pi-aw:~/docker/$ docker ps
> CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
> 71e3e9df8bc6 test4 "/entrypoint.sh bash..." 1 second ago Up Less than a second container_1541488751855_0028_01_000001
> 29d31f0327d1 test3 "/entrypoint.sh bash" 18 hours ago Up 18 hours blissful_turing
> pi@pi-aw:~/docker/$ de 71e3e9df8bc6
> groups: cannot find name for group ID 1000
> groups: cannot find name for group ID 116
> groups: cannot find name for group ID 126
> To run a command as administrator (user "root"), use "sudo <command>".
> See "man sudo_root" for details.
> pi@ctr-1541488751855-0028-01-000001:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_000001$ cat /etc/hosts
> 127.0.0.1 localhost
> 192.168.0.14 pi-aw
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> pi@ctr-1541488751855-0028-01-000001:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_000001$
> {noformat}
> If I launch the image without YARN, I see the entry in /etc/hosts:
> {noformat}
> pi@61f173f95631:~$ cat /etc/hosts
> 127.0.0.1 localhost
> ::1 localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> 172.17.0.3 61f173f95631
> {noformat}
> Here is my container-executor.cfg
> {code:java}
> min.user.id=100
> yarn.nodemanager.linux-container-executor.group=hadoop
> [docker]
> module.enabled=true
> docker.binary=/usr/bin/docker
> docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE
> docker.allowed.networks=bridge,host,none
> docker.allowed.rw-mounts=/tmp,/etc/hadoop/logs/,/private/etc/hadoop-2.9.1/logs/
> {code}
> Since I'm using an older version of Hadoop (2.9.1), let me know if this has 
> already been fixed in a later version :) 
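
The diagnosis in the description (no /etc/hosts entry for the container
hostname, so InetAddress.getLocalHost() throws) can be sketched in Java as
below. This is only an illustration, not code from TonY or YARN: the class
name HostnameLookup and both method names are hypothetical, and the fallback
to the HOSTNAME environment variable is an assumption about the container
environment. A name obtained through the fallback may still not be resolvable
from other hosts, so this is a workaround sketch rather than a fix for the
missing entry.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class HostnameLookup {

    /** Returns true if any non-comment hosts-file line lists the given name. */
    public static boolean hostsContains(List<String> hostsLines, String hostname) {
        for (String line : hostsLines) {
            // Strip a trailing "# comment", then split "address name [alias...]".
            int hash = line.indexOf('#');
            String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (body.isEmpty()) {
                continue;
            }
            String[] fields = body.split("\\s+");
            // fields[0] is the address; the remaining fields are names.
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return true;
                }
            }
        }
        return false;
    }

    /**
     * Tries the JDK lookup first. If it fails as in the stack trace above,
     * falls back to the HOSTNAME environment variable (an assumption), and
     * finally to "localhost".
     */
    public static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            String env = System.getenv("HOSTNAME");
            return (env != null && !env.isEmpty()) ? env : "localhost";
        }
    }

    public static void main(String[] args) {
        // Entries copied from the /etc/hosts of the YARN-launched container.
        List<String> yarnHosts = Arrays.asList(
                "127.0.0.1 localhost",
                "192.168.0.14 pi-aw",
                "# The following lines are desirable for IPv6 capable hosts",
                "::1 ip6-localhost ip6-loopback");
        // Prints false: the container hostname has no entry.
        System.out.println(hostsContains(yarnHosts, "ctr-1541488751855-0028-01-000001"));
        System.out.println(localHostName());
    }
}
```

Inside a real container, the lines passed to hostsContains could be read with
Files.readAllLines(Paths.get("/etc/hosts")).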



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
