[
https://issues.apache.org/jira/browse/YARN-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942833#comment-16942833
]
Eric Badger commented on YARN-9860:
-----------------------------------
{quote}
You are spot on that this will be an issue. The challenge is that if we mount
the read-write log dirs into the container and the container user isn't the
user YARN expects, the writes could fail or YARN may be unable to clean up the
logs. I talked with Craig on this a bit and he had some interesting thoughts on
how we might handle it with FUSE. For the sake of this patch, I didn't want to
get bogged down in the details there, given this has enough going on already.
Could we address logging in a follow-up? In the meantime, with debug delay
enabled, running docker logs on the exited container will allow admins to take
a look, since the output redirection typically done by YARN is dropped in
Service Mode.
{quote}
I'm OK with having a follow-up for this. My comment was mostly for my own
information. I'm not opposed to the feature; I just think it might be somewhat
difficult to debug if configuration issues or other problems cause the
containers to fail. I do think that using debug delay to keep containers around
for a while after completion could be a useful debugging tool, though.
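
For reference, a minimal sketch of the manual inspection step mentioned in the
quote (running docker logs against an exited container while debug delay keeps
it around). The helper names and the example container name are hypothetical;
the sketch only builds the command lines and does not reflect anything YARN
itself runs.

```python
from typing import List


def list_exited_containers_cmd() -> List[str]:
    """Command line that lists exited containers with their ID and name."""
    return [
        "docker", "ps", "--all",
        "--filter", "status=exited",
        "--format", "{{.ID}} {{.Names}}",
    ]


def docker_logs_cmd(container: str, tail: int = 200) -> List[str]:
    """Command line that prints the last `tail` lines of a container's output.

    `container` is a Docker container ID or name; which name YARN assigns to
    the container is not covered here and is left to the admin to look up.
    """
    return ["docker", "logs", "--tail", str(tail), container]


if __name__ == "__main__":
    # Hypothetical container name, for illustration only.
    print(" ".join(list_exited_containers_cmd()))
    print(" ".join(docker_logs_cmd("example_container")))
```

An admin would run the first command on the NodeManager host to find the
exited container, then the second to read its stdout/stderr before the
container is removed.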
> Enable service mode for Docker containers on YARN
> -------------------------------------------------
>
> Key: YARN-9860
> URL: https://issues.apache.org/jira/browse/YARN-9860
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9860-001.patch, YARN-9860-002.patch
>
>
> This task is to add support to YARN for running Docker containers in "Service
> Mode".
> Service Mode - Run the container as defined by the image, but still allow for
> injecting configuration.
>
> Background:
> * Entrypoint mode helped: it is now possible to use the ENV and ENTRYPOINT/CMD
>   as defined in the image. However, official images still require
>   modification because of user propagation.
> * User propagation is problematic for running a secure cluster with sssd.
>
> Implementation:
> * Must be enabled via c-e.cfg (example: docker.service-mode.allowed=true).
> * Must be requested at runtime (example:
>   YARN_CONTAINER_RUNTIME_DOCKER_SERVICE_MODE=true).
> * Entrypoint mode is enabled by default for this mode (if Service Mode is
>   requested, YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE should be set
>   to true).
> * The writable log mount will not be added (the writable bind mounts are
>   removed); stdout logging may still work with entrypoint mode.
> * User and groups will not be propagated (now: docker run --user nobody
>   --group-add=nobody .... <image>, after: docker run .... <image>).
> * Read-only resources are mounted at the file level; files get chmod 777, and
>   the parent directory is only accessible by the run-as user.
> cc [[email protected]]
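
The gating and user-propagation points in the description above can be
sketched roughly as follows. This is an illustration only, not the patch's
code: the function name, its parameters, and the choice to raise an error when
Service Mode is requested but not allowed are all assumptions. Only the
environment variable name and the before/after docker run shapes come from the
description.

```python
from typing import Dict, List

SERVICE_MODE_ENV = "YARN_CONTAINER_RUNTIME_DOCKER_SERVICE_MODE"


def build_docker_run_args(
    image: str,
    env: Dict[str, str],
    service_mode_allowed: bool,
    run_as_user: str = "nobody",
    run_as_group: str = "nobody",
) -> List[str]:
    """Hypothetical helper that mirrors the described behaviour.

    `service_mode_allowed` stands in for the c-e.cfg setting
    (docker.service-mode.allowed=true in the description's example).
    All other docker run options (the "...." in the description) are omitted.
    """
    requested = env.get(SERVICE_MODE_ENV, "false").lower() == "true"

    # Assumption: a request without the admin-side switch is rejected.
    if requested and not service_mode_allowed:
        raise PermissionError("Service Mode requested but not allowed in c-e.cfg")

    args = ["docker", "run"]
    if not requested:
        # Current behaviour: propagate the run-as user and group.
        args += ["--user", run_as_user, f"--group-add={run_as_group}"]
    # Service Mode: no --user/--group-add, the image's own user is used.
    args.append(image)
    return args


if __name__ == "__main__":
    print(build_docker_run_args("example/image", {}, service_mode_allowed=False))
    print(build_docker_run_args(
        "example/image", {SERVICE_MODE_ENV: "true"}, service_mode_allowed=True))
```

The two-level check reflects the description's requirement that the mode be
both enabled by the administrator and requested per application.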
--
This message was sent by Atlassian Jira
(v8.3.4#803005)