[ 
https://issues.apache.org/jira/browse/YARN-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946374#comment-16946374
 ] 

Eric Yang commented on YARN-9860:
---------------------------------

[~Prabhu Joseph] Thank you for patch 007.  I tested it on a Kerberos-enabled 
cluster, and containers fail to launch.  The NodeManager error log shows:

{code}
2019-10-07 17:49:30,962 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed for container_1570493636313_0004_02_000001
java.io.IOException: Application application_1570493636313_0004 initialization failed (exitCode=-1) with output: null
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:414)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1263)
Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.InterruptedIOException: java.lang.InterruptedException
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:185)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:405)
        ... 1 more
Caused by: java.io.InterruptedIOException: java.lang.InterruptedException
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:1011)
        at org.apache.hadoop.util.Shell.run(Shell.java:901)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:154)
        ... 2 more
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:1001)
        ... 5 more
{code}
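
The innermost cause in the trace is an InterruptedException raised while the 
thread was blocked in UNIXProcess.waitFor, which then surfaces as an 
InterruptedIOException.  The sketch below illustrates that wrapping pattern in 
plain Java.  It is not Hadoop's Shell code: the class InterruptedWaitSketch and 
the method waitForChild are hypothetical, and a CountDownLatch stands in for 
the child process so the example runs without launching anything.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.CountDownLatch;

public class InterruptedWaitSketch {

    // Hypothetical stand-in for "block until the child process exits".
    // An interrupt during the wait is rethrown as an InterruptedIOException
    // that keeps the original InterruptedException as its cause.
    static void waitForChild(CountDownLatch childExited) throws InterruptedIOException {
        try {
            childExited.await();
        } catch (InterruptedException ie) {
            InterruptedIOException iie = new InterruptedIOException(ie.toString());
            iie.initCause(ie);
            throw iie;
        }
    }

    public static void main(String[] args) {
        // The latch is never counted down, so only an interrupt ends the wait.
        CountDownLatch neverExits = new CountDownLatch(1);

        // Set the interrupt flag first; await() then throws immediately.
        Thread.currentThread().interrupt();
        try {
            waitForChild(neverExits);
        } catch (InterruptedIOException e) {
            System.out.println(e);
            System.out.println("Caused by: " + e.getCause());
        }
    }
}
```

The point of the sketch is only that the exception chain indicates the waiting 
thread was interrupted; it says nothing about why the interrupt was sent.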

I also checked the working directory: neither the keytab file nor the tokens 
were localized.  Can you verify?

> Enable service mode for Docker containers on YARN
> -------------------------------------------------
>
>                 Key: YARN-9860
>                 URL: https://issues.apache.org/jira/browse/YARN-9860
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9860-001.patch, YARN-9860-002.patch, 
> YARN-9860-003.patch, YARN-9860-004.patch, YARN-9860-005.patch, 
> YARN-9860-006.patch, YARN-9860-007.patch
>
>
> This task is to add support to YARN for running Docker containers in "Service 
> Mode". 
> Service Mode - Run the container as defined by the image, but still allow for 
> injecting configuration. 
> Background:
>       Entrypoint mode helped - now able to use the ENV and ENTRYPOINT/CMD as 
> defined in the image. However, still requires modification to official images 
> due to user propagation
> User propagation is problematic for running a secure cluster with sssd
>       
> Implementation:
>       Must be enabled via c-e.cfg (example: docker.service-mode.allowed=true)
>       Must be requested at runtime - (example: 
> YARN_CONTAINER_RUNTIME_DOCKER_SERVICE_MODE=true)
>       Entrypoint mode is enabled by default for this mode (if Service Mode is 
> requested, YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE should be set 
> to true)
>       Writable log mount will not be added - stdout logging may still work 
> with entrypoint mode - remove the writable bind mounts
>       User and groups will not be propagated (now: docker run --user nobody 
> --group-add=nobody .... <image>, after: docker run .... <image>)
>       Read-only resources mounted at the file level, files get chmod 777, 
> parent directory only accessible by the run as user.
> cc [[email protected]]
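
The description above requires two conditions for Service Mode (allowed in 
c-e.cfg and requested at runtime) and says user and group flags are dropped 
from the docker run command.  A minimal sketch of that gating follows, assuming 
the configuration is available as a Properties object and the container 
environment as a Map.  The class ServiceModeSketch and its methods are 
hypothetical illustrations, not code from the patch; only the key 
docker.service-mode.allowed, the variable 
YARN_CONTAINER_RUNTIME_DOCKER_SERVICE_MODE, and the before/after docker run 
shapes come from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class ServiceModeSketch {

    static final String ALLOWED_KEY = "docker.service-mode.allowed";
    static final String REQUEST_ENV = "YARN_CONTAINER_RUNTIME_DOCKER_SERVICE_MODE";

    // Service Mode applies only when it is both allowed by the admin
    // configuration and requested in the container environment.
    static boolean serviceModeEnabled(Properties cfg, Map<String, String> env) {
        boolean allowed = Boolean.parseBoolean(cfg.getProperty(ALLOWED_KEY, "false"));
        boolean requested = Boolean.parseBoolean(env.getOrDefault(REQUEST_ENV, "false"));
        return allowed && requested;
    }

    // Hypothetical command builder: user and group are propagated only
    // outside Service Mode, matching the "now" and "after" forms quoted above.
    static List<String> dockerRunArgs(boolean serviceMode, String user,
                                      String group, String image) {
        List<String> args = new ArrayList<>();
        args.add("docker");
        args.add("run");
        if (!serviceMode) {
            args.add("--user");
            args.add(user);
            args.add("--group-add=" + group);
        }
        args.add(image);
        return args;
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty(ALLOWED_KEY, "true");

        Map<String, String> env = new HashMap<>();
        env.put(REQUEST_ENV, "true");

        boolean serviceMode = serviceModeEnabled(cfg, env);
        System.out.println(String.join(" ",
            dockerRunArgs(serviceMode, "nobody", "nobody", "<image>")));
    }
}
```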



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
