[
https://issues.apache.org/jira/browse/YARN-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374436#comment-15374436
]
Zhankun Tang commented on YARN-5360:
------------------------------------
The current LCE (DockerLinuxContainerRuntime) mounts /etc/passwd into the
container, but in my testing this approach does not work. It is also
invasive and can lead to user confusion and frustration. Here is the error
log from my single-node cluster:
{panel}
2016-07-13 21:55:34,870 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
Shell execution returned exit code: 7. Privileged Execution Operation Output:
main : command provided 4
main : run as user is yarn
main : requested yarn user is yarn
Creating script paths...
Creating local dirs...
Getting exit code file...
Changing effective user to root...
Launching docker container...
Full command array for failed execution:
/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/bin/container-executor,
yarn, yarn, 4, application_1468316940186_0004,
container_1468316940186_0004_01_000002,
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002,
/tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/launch_container.sh,
/tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/container_1468316940186_0004_01_000002.tokens,
/tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/container_1468316940186_0004_01_000002.pid,
/tmp/hadoop-yarn/nm-local-dir,
/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs,
/tmp/hadoop-yarn/nm-docker-cmds/docker.container_1468316940186_0004_01_0000022899997369798306591.cmd,
cgroups=none
2016-07-13 21:55:34,870 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime:
Launch container failed. Exception:
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException:
ExitCodeException exitCode=7: docker: Error response from daemon: Cannot start
container c75c395c6ed228144a383e9030d43915e263c7ca6d512e4a7cd25d5fbeffae0a: 9
System error: Unable to find user yarn.
Could not invoke docker /usr/bin/docker run
--name=container_1468316940186_0004_01_000002 --user=yarn -d
--workdir=/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002
--net=host --cap-drop=ALL --cap-add=AUDIT_WRITE --cap-add=NET_RAW
--cap-add=SETGID --cap-add=SETUID --cap-add=NET_BIND_SERVICE --cap-add=SETFCAP
--cap-add=FSETID --cap-add=SETPCAP --cap-add=SYS_CHROOT --cap-add=CHOWN
--cap-add=FOWNER --cap-add=MKNOD --cap-add=KILL --cap-add=DAC_OVERRIDE -v
/etc/passwd:/etc/password:ro -v
/tmp/hadoop-yarn/nm-local-dir:/tmp/hadoop-yarn/nm-local-dir -v
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002:/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002
-v
/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs:/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs
centos bash
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002/launch_container.sh.
{panel}
I copied the command and added options to mount "/etc/group" and
"/etc/shadow" as well. It still does not work.
> Use UID instead of user name to build the Docker run command
> ------------------------------------------------------------
>
> Key: YARN-5360
> URL: https://issues.apache.org/jira/browse/YARN-5360
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
>
> There is currently *a dependency between the job-submitting user and the
> user in the Docker image* in LCE. For instance, in order to run the Docker
> container as the yarn user, we can set
> "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to yarn
> and leave
> "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users"
> at its default (true). LCE will then choose yarn (whose UID might be 1001)
> as the user running jobs.
> But because LCE mounts the generated launch_container.sh (owned by the
> user running the job) into the Docker container and internally uses the
> "docker run --user=<run_as_user>" option to run it, we also need to
> create a user in the Docker image with the *same user name* and the
> *same UID* as the user running the job. Otherwise LCE will fail to launch
> the container or report that it is unable to find the user. This burdens
> the Docker image creator with a YARN dependency.
> Luckily this can be solved through Docker. As far as I know, since Docker
> v1.8 (or maybe earlier), the "--user=" option of the Docker run command
> accepts a UID, and *when a UID is passed, the user does not have to exist in
> the container*. So we should use the UID instead of the user name to
> construct the Docker run command, which eliminates the need to create the
> same user in the Docker image. This gives LCE the ability to launch any
> Docker container safely, regardless of which users exist in it.
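
As an illustration of the proposal in the description above, here is a
minimal Python sketch of building the "docker run" arguments with a numeric
UID instead of a user name. It is not the actual LCE code: the function names
and the injectable "lookup" parameter are hypothetical, and only
pwd.getpwnam (the standard library call that resolves a user name on the
host) is a real API.

```python
import pwd


def docker_user_option(user_name, lookup=pwd.getpwnam):
    """Return a --user option that identifies the user by numeric UID.

    `lookup` is a hypothetical hook so the host lookup can be replaced in
    tests; by default the name is resolved against the host's user database.
    """
    entry = lookup(user_name)  # pwd.getpwnam raises KeyError for unknown users
    return f"--user={entry.pw_uid}"


def build_docker_run(user_name, image, command, lookup=pwd.getpwnam):
    """Assemble a simplified 'docker run' argument list (hypothetical helper).

    Only the --user option is shown; the mounts and capabilities from the
    real command are left out on purpose.
    """
    return ["docker", "run", docker_user_option(user_name, lookup),
            image, *command]
```

On a host where yarn happens to have UID 1001, build_docker_run("yarn",
"centos", ["bash", "launch_container.sh"]) would produce a command that
starts with "docker run --user=1001", so the image would not need a yarn
entry of its own.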