[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017364#comment-17017364
 ] 

Eric Yang commented on YARN-9292:
---------------------------------

[~ebadger] {quote}I do have some questions on why we can't move the AM into a 
docker container though. What is it that is special about the AM that we need 
to run it directly on the host? What does it depend on the host for? We should 
be able to use the distributed cache to localize any libraries/jars that it 
needs. And as far as nscd/sssd, those can be bind-mounted into the container 
via configs. If they don't have nscd/sssd then they can bind-mount /etc/passwd. 
Since they would've been using the host anyway, this is no different.{quote}

YARN native service was a code merge from Apache Slider, and it was developed 
to run in a YARN container directory, like MapReduce tasks.  If the AM docker 
image is a mirror image of the host system, the AM can run in a docker 
container.  The AM code still depends on all Hadoop client libraries, Hadoop 
configuration, and Hadoop environment variables.

{quote}As far as the docker image itself, why does Hadoop need to provide an 
image? Everything needed can be provided via the distributed cache or 
bind-mounts, right? I don't see why we need a specialized image that is tied to 
Hadoop. You just need an image with Java and Bash.{quote}

From a 10,000-foot point of view, yes, the AM only requires Java and Bash.  If 
Hadoop provides the image, our users can deploy it without worrying about how 
to create a docker image that mirrors the host structure.  Without a 
Hadoop-supplied image and an agreed-upon image format, it is up to the system 
admin's interpretation of where the Hadoop client configuration and client 
binaries are located.  He/she can run the job with ENTRY point mode disabled 
and bind-mount the Hadoop configuration and binaries.  As I recall, this is 
the less secure way to run the container, because it requires bind-mounting a 
writable Hadoop log directory into the container for the launcher script to 
write output.  This is a hassle with no container benefit, and it still 
exposes the host-level environment and binaries to the container.  There are 
5 people on planet Earth who know how to wire this together, but they are 
unlikely to suggest this approach.
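
To make the bind-mount alternative concrete, here is a minimal sketch of the 
launch environment it implies. The helper name, the image name, and every path 
are assumptions for illustration only; the environment variable names are the 
ones documented for the YARN Docker runtime, and the exact values should be 
checked against the cluster's Hadoop version.

```python
from typing import Dict, List, Tuple


def docker_launch_env(image: str,
                      mounts: List[Tuple[str, str, str]]) -> Dict[str, str]:
    """Build the container launch environment for a bind-mounted AM.

    Hypothetical helper: each mount is (host_path, container_path, mode),
    where mode is "ro" or "rw".
    """
    for _, _, mode in mounts:
        if mode not in ("ro", "rw"):
            raise ValueError(f"unsupported mount mode: {mode}")
    return {
        "YARN_CONTAINER_RUNTIME_TYPE": "docker",
        "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
        # "false" keeps ENTRY point mode disabled, so the YARN launcher
        # script runs inside the container instead of the image ENTRYPOINT.
        "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "false",
        "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": ",".join(
            f"{src}:{dst}:{mode}" for src, dst, mode in mounts
        ),
    }


# Assumed paths; real locations depend on how the admin laid out the host.
env = docker_launch_env(
    "example/java-bash:latest",
    [
        ("/etc/hadoop/conf", "/etc/hadoop/conf", "ro"),
        ("/opt/hadoop", "/opt/hadoop", "ro"),
        # The writable log mount is the part that weakens isolation.
        ("/var/log/hadoop-yarn", "/var/log/hadoop-yarn", "rw"),
    ],
)
```

Every host path listed here would also have to be allowed in 
container-executor.cfg, which is part of why this wiring is a hassle.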

> Implement logic to keep docker image consistent in application that uses 
> :latest tag
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-9292
>                 URL: https://issues.apache.org/jira/browse/YARN-9292
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-9292.001.patch, YARN-9292.002.patch, 
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch, 
> YARN-9292.006.patch, YARN-9292.007.patch, YARN-9292.008.patch
>
>
> A docker image with the latest tag can run in a YARN cluster without any 
> validation in node managers. If an image with the latest tag is changed 
> while containers are launching, it might produce inconsistent results 
> between nodes. This surfaced toward the end of development for YARN-9184, 
> which keeps the docker image consistent within a job. One idea for keeping 
> the :latest tag consistent for a job is to use the docker image command to 
> figure out the image ID and propagate that image ID to the rest of the 
> container requests. There are some challenges to overcome:
>  # The latest tag does not exist on the node where the first container 
> starts. The first container will need to download the latest image and find 
> the image ID. This can introduce lag time before other containers start.
>  # If the image ID is used to start other containers, container-executor may 
> have trouble checking whether the image comes from a trusted source. Both 
> the image name and ID must be supplied through the .cmd file to 
> container-executor. However, a hacker can supply an incorrect image ID and 
> defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker 
> image consistent within one application.
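
The idea in the description (resolve the :latest tag to an image ID once, then 
reuse that ID for the rest of the application's containers) can be sketched as 
follows. This is not what the attached patches do; ImagePinner, its methods, 
and the repos_of lookup are hypothetical names, and the only real command 
assumed is `docker image inspect --format '{{.Id}}'`.

```python
import subprocess
from typing import Callable, Dict, Iterable, Tuple


def docker_image_id(image: str) -> str:
    """Ask the local Docker daemon which image ID a name:tag points at."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Id}}", image],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


class ImagePinner:
    """Hypothetical per-application cache of tag -> image ID."""

    def __init__(self, resolve: Callable[[str], str] = docker_image_id):
        self._resolve = resolve
        self._pinned: Dict[Tuple[str, str], str] = {}

    def pin(self, app_id: str, image: str) -> str:
        # Resolve the tag only for the first container of an application;
        # later containers reuse the recorded ID even if the tag has moved.
        key = (app_id, image)
        if key not in self._pinned:
            self._pinned[key] = self._resolve(image)
        return self._pinned[key]

    @staticmethod
    def id_matches_repo(image: str, image_id: str,
                        repos_of: Callable[[str], Iterable[str]]) -> bool:
        # Second challenge: do not trust a caller-supplied ID. repos_of is an
        # assumed lookup from image ID to the repositories it belongs to.
        repo = image.rsplit(":", 1)[0]
        return repo in set(repos_of(image_id))


# Usage with a fake resolver standing in for the Docker daemon.
tags = {"trusted/app:latest": "sha256:aaa"}
pinner = ImagePinner(resolve=lambda name: tags[name])
first = pinner.pin("application_1", "trusted/app:latest")
tags["trusted/app:latest"] = "sha256:bbb"   # tag is re-pushed mid-launch
second = pinner.pin("application_1", "trusted/app:latest")
other_app = pinner.pin("application_2", "trusted/app:latest")
```

The sketch leaves the first challenge open: the first container still has to 
pull the image before an ID exists to pin.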


