[ https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017312#comment-17017312 ]
Eric Badger commented on YARN-9292:
-----------------------------------
bq. Very good question, and the answer is somewhat complicated. For the AM to run
in a Docker container, the AM must have identical Hadoop client bits (Java,
Hadoop, etc.) and credential mapping (nscd/sssd). Many of those pieces cannot
be moved cleanly into a Docker container in the first implementation of YARN
native service (LLAP/Slider-like projects) because of resistance to building an
agreeable Docker image as part of the Hadoop project. The AM remains outside of
the Docker container for simplicity.

So I read your last comment and I think that everything pretty much makes sense
if we can fix the issue of the AM not running in a Docker container. That way
we can use YARN-9184 to pull the image and get the most up-to-date sha for the
entire job to run with. And if an admin wants to do the image management
themselves, then they don't enable YARN-9184 and are responsible for having the
images they want on the cluster. At that point, any errors would be
for them to fix through their own automation.
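To make the "pull once, pin the sha for the whole job" idea concrete, here is a rough Python sketch. It is only an illustration, not what YARN-9184 actually does: `resolve_image_id`, `pin_image_for_job`, and the injectable `run` callback are hypothetical names, and the sketch assumes the docker CLI (`docker pull`, `docker image inspect --format '{{.Id}}'`) is available on the node.

```python
import subprocess
from typing import Callable, Dict, List


def _run_docker(args: List[str]) -> str:
    """Run a docker CLI command and return its stripped stdout."""
    result = subprocess.run(
        ["docker", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def resolve_image_id(image: str, run: Callable[[List[str]], str] = _run_docker) -> str:
    """Pull the image, then ask the local daemon which image ID the tag points at."""
    run(["pull", image])
    image_id = run(["image", "inspect", "--format", "{{.Id}}", image])
    if not image_id.startswith("sha256:"):
        raise ValueError(f"unexpected image id for {image!r}: {image_id!r}")
    return image_id


def pin_image_for_job(
    image: str, num_containers: int, run: Callable[[List[str]], str] = _run_docker
) -> List[Dict[str, str]]:
    """Resolve the tag once and reuse the same ID for every container request."""
    image_id = resolve_image_id(image, run)
    return [
        {"image_name": image, "image_id": image_id}
        for _ in range(num_containers)
    ]
```

The point of the sketch is that the tag is resolved exactly once, so a later change to :latest cannot leak into containers of the same job.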

I do have some questions on why we can't move the AM into a docker container
though. What is it that is special about the AM that we need to run it directly
on the host? What does it depend on the host for? We should be able to use the
distributed cache to localize any libraries/jars that it needs. And as far as
nscd/sssd, those can be bind-mounted into the container via configs. If they
don't have nscd/sssd then they can bind-mount /etc/passwd. Since they would've
been using the host anyway, this is no different.
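As a rough sketch of the bind-mount approach, the Python below builds the value I would expect to pass in the `YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS` environment variable (comma-separated `source:destination:mode` entries). The helper names, the host paths, and the ro/rw modes are assumptions for illustration and vary by distribution. The admin would still have to allow these paths in container-executor.cfg.

```python
import os
from typing import Callable, Dict, List

# Typical host locations; assumptions for illustration, adjust per distribution.
NSCD_SOCKET_DIR = "/var/run/nscd"
SSSD_PIPES_DIR = "/var/lib/sss/pipes"
PASSWD_FILE = "/etc/passwd"


def identity_mounts(exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Pick bind mounts that give the container the host's user lookups."""
    mounts = []
    if exists(NSCD_SOCKET_DIR):
        mounts.append(f"{NSCD_SOCKET_DIR}:{NSCD_SOCKET_DIR}:ro")
    if exists(SSSD_PIPES_DIR):
        # Mode is an assumption; sssd clients talk to the daemon over these pipes.
        mounts.append(f"{SSSD_PIPES_DIR}:{SSSD_PIPES_DIR}:rw")
    if not mounts:
        # No nscd/sssd on the host: fall back to the flat passwd file.
        mounts.append(f"{PASSWD_FILE}:{PASSWD_FILE}:ro")
    return mounts


def docker_mounts_env(mounts: List[str]) -> Dict[str, str]:
    """Format the mounts as the comma-separated env value."""
    return {"YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": ",".join(mounts)}
```

Calling `docker_mounts_env(identity_mounts())` on a host yields the environment entry to add to the container launch context.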

As far as the docker image itself, why does Hadoop need to provide an image?
Everything needed can be provided via the distributed cache or bind-mounts,
right? I don't see why we need a specialized image that is tied to Hadoop. You
just need an image with Java and Bash.
> Implement logic to keep docker image consistent in application that uses
> :latest tag
> ------------------------------------------------------------------------------------
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch,
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch,
> YARN-9292.006.patch, YARN-9292.007.patch, YARN-9292.008.patch
>
>
> A Docker image with the latest tag can run in a YARN cluster without any
> validation in node managers. If an image with the latest tag is changed during
> container launch, it might produce inconsistent results between nodes. This
> surfaced toward the end of development for YARN-9184 to keep the docker image
> consistent within a job. One idea for keeping the :latest tag consistent for a
> job is to use the docker image command to find the image ID and propagate that
> ID to the rest of the container requests. There are some challenges to
> overcome:
> # If the latest tag does not exist on the node where the first container
> starts, the first container will need to download the latest image and find
> the image ID. This can introduce lag time before other containers start.
> # If the image ID is used to start other containers, container-executor may
> have problems checking whether the image comes from a trusted source. Both the
> image name and ID must be supplied through the .cmd file to
> container-executor. However, an attacker can supply an incorrect image ID and
> defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker
> image consistent within one application.
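
One possible shape for the name/ID check in the second challenge is sketched below in Python. It is a sketch under assumptions, not container-executor's actual logic: `is_trusted_pair` and the `lookup_id` callback (which would ask the local docker daemon which ID a name currently resolves to) are hypothetical, and the registry is taken to be the part of the image name before the first "/".

```python
from typing import Callable, Iterable


def is_trusted_pair(
    image_name: str,
    image_id: str,
    trusted_registries: Iterable[str],
    lookup_id: Callable[[str], str],
) -> bool:
    """Accept an image ID only if it matches what a trusted name resolves to."""
    # Assumption: the registry is the prefix before the first "/".
    registry = image_name.split("/", 1)[0] if "/" in image_name else ""
    if registry not in set(trusted_registries):
        return False
    # Reject a supplied ID that does not belong to the supplied name.
    return lookup_id(image_name) == image_id
```

With a check like this, a request that pairs a trusted name with an unrelated ID is rejected instead of being launched by ID alone.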