[
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014600#comment-17014600
]
Eric Badger commented on YARN-9292:
-----------------------------------
{quote}
The node might have an old Docker image on it. It would be nice to get the
image information from the registry and only fall back to the local node's
version if the registry lookup fails. An indirect way to do this would be to do
a {{docker pull}} before calling {{docker images}}.
The same can be argued for people who do not want the Docker image to be
automatically pulled to latest. As a result, there is a flag implemented in
YARN-9184. The flag decides whether it will be based on the local latest or the
repository latest. This change should work in combination with YARN-9184.
{quote}
Agreed on your point with YARN-9184. I thought of that as well, but since the
AM isn't running inside of a Docker container, the image wouldn't have been
pulled to that node before the task is run, right? That's my concern here.
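To make the ordering under discussion concrete, here is a minimal sketch of "refresh from the repository if the flag asks for it, otherwise use whatever is local". It is not the YARN-9292 patch. The function name, the {{use_repository_latest}} flag name, and the two injected callables are all hypothetical stand-ins for the real pull and image lookup, which would go through container-executor.

```python
from typing import Callable, Optional


def resolve_image_id(
    image: str,
    use_repository_latest: bool,
    pull: Callable[[str], bool],
    local_image_id: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the image ID to pin for `image`, or None if none is available.

    All names here are hypothetical. `pull` stands in for a docker pull and
    returns True on success; `local_image_id` stands in for a lookup of the
    image ID currently stored on the node.
    """
    if use_repository_latest:
        # Best-effort refresh. If the pull fails, the lookup below simply
        # returns the node's existing copy, which is the fallback case.
        pull(image)
    # With the flag off, no pull happens and the local copy is used as-is.
    return local_image_id(image)
```

The sketch also shows the gap raised above: if nothing ever pulled the image onto the node doing the lookup, the local lookup has nothing to return and the result is None.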
{quote}
If we can hit the docker registry directly via its REST API then we won't need
to invoke the container-executor at all and we can avoid this problem. This
looks like it should be fairly trivial, but I don't know how much more
difficult secure registries would be.
We don't contact the Docker registry directly, nor do we have code to connect
to a secure Docker registry. I think it is too risky to contact the registry
directly because the registry could be a private registry defined in the user's
Docker config.json. It would be going down a rabbit hole to follow this path.
{quote}
I imagined that would be the answer here. Fair enough.
{quote}
Using either hash is fine; they will result in the same image. It is
somewhat fuzzy because they are aliases of one another.
{quote}
{noformat}
[ebadger@foo bin]$ sudo docker images | grep centos
docker.io/centos   7        5e35e350aded   2 months ago   203 MB
docker.io/centos   latest   0f3e07c0138f   3 months ago   220 MB
[ebadger@foo bin]$ sudo docker inspect image centos -f "{{.RepoDigests}}"
Error: No such object: image
{noformat}
Docker must have changed a bunch since the last supported release from RedHat
in RHEL 7 (1.13). The command you ran doesn't even work for my version of
Docker.
{noformat}
[ebadger@foo bin]$ sudo docker images | grep centos
docker.io/centos   7        5e35e350aded   2 months ago   203 MB
docker.io/centos   latest   0f3e07c0138f   3 months ago   220 MB
[ebadger@foo bin]$ sudo docker image inspect centos -f "{{.RepoDigests}}"
[docker.io/centos@sha256:f94c1d992c193b3dc09e297ffd54d8a4f1dc946c37cbeceb26d35ce1647f88d9]
{noformat}
If I switch {{inspect image}} to {{image inspect}} I get a similar output to
yours, but I only get a single sha. Reading around on the internet, it looks
like Docker takes the manifest sha and then recalculates the digest with some
other stuff added on (maybe the tag data?) to get a new digest. I'm worried
that this could break if we randomly choose the last sha. For example, maybe
centos:7 is installed everywhere, but centos:latest is only installed on this
one node by accident. If we grab the centos:latest sha, it won't work on the
rest of the nodes in the cluster because the sha won't match the tag of the
image on those nodes, even though they have the same manifest hash. Or maybe it
only does the check based on the manifest hash. I can't seem to reproduce this
with my version of Docker, so I can't test out what actually happens.
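As a rough illustration of the alternative to picking the last sha, the sketch below parses the bracketed, space-separated list printed by the {{.RepoDigests}} format string and checks whether two nodes report any digest in common. The function names are hypothetical, and the multi-entry input format is an assumption based on how Go prints a string slice. It does not settle how Docker actually derives or compares these digests.

```python
from typing import List, Tuple


def parse_repo_digests(output: str) -> List[Tuple[str, str]]:
    """Parse output such as '[repo@sha256:aaa repo@sha256:bbb]'.

    Returns (repository, digest) pairs in the order they were printed.
    Entries without an '@sha256:' part are skipped.
    """
    text = output.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    entries = []
    for item in text.split():
        repo, sep, digest = item.partition("@")
        if sep and digest.startswith("sha256:"):
            entries.append((repo, digest))
    return entries


def common_digests(node_a_output: str, node_b_output: str) -> List[str]:
    """Return the sorted digests that both nodes report for the image."""
    a = {digest for _, digest in parse_repo_digests(node_a_output)}
    b = {digest for _, digest in parse_repo_digests(node_b_output)}
    return sorted(a & b)
```

Comparing the full set instead of one arbitrarily chosen entry would at least make a mismatch visible as an empty result, though whether Docker itself matches on any listed digest is exactly the open question above.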
{quote}
You may need to upgrade your Docker version. The output appears like this on my
system:
{quote}
Yea I'm not sure how to deal with this. Docker seems to have broken things (or
added new things). I know Docker is a fast-moving technology, but RHEL 7 is
basically stuck on 1.13.1 at this point because of licensing issues.
{quote}
No, this REST API is secured by SPNEGO authentication, same as the rest of the
node manager REST API. HttpUtil.connect handles Kerberos negotiation.
{quote}
Ok cool. Thanks for the explanation.
> Implement logic to keep docker image consistent in application that uses
> :latest tag
> ------------------------------------------------------------------------------------
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch,
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch,
> YARN-9292.006.patch, YARN-9292.007.patch, YARN-9292.008.patch
>
>
> A Docker image with the latest tag can run in a YARN cluster without any
> validation in the node managers. If an image with the latest tag is changed
> while containers are launching, it might produce inconsistent results between
> nodes. This surfaced toward the end of development for YARN-9184, to keep the
> Docker image consistent within a job. One of the ideas to keep the :latest tag
> consistent for a job is to use the docker image command to figure out the
> image ID and propagate that image ID to the rest of the container requests.
> There are some challenges to overcome:
> # The latest tag does not exist on the node where the first container starts.
> The first container will need to download the latest image and find the image
> ID. This can introduce lag time before other containers start.
> # If the image ID is used to start other containers, container-executor may
> have problems checking whether the image comes from a trusted source. Both the
> image name and ID must be supplied through the .cmd file to container-executor.
> However, a hacker can supply an incorrect image ID and defeat the
> container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the Docker
> image consistent within one application.