[
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012185#comment-17012185
]
Eric Badger commented on YARN-9292:
-----------------------------------
Hey [~eyang], thanks for the patch! It looks like this patch only applies to
native services and that any client that wants to solve this issue will have to
solve it themselves. I don't think we can get around this issue unless we want
the RM to do the image sha256 hash query. And that sounds like a bad idea. But
I think it makes sense to do this for native services at least.
{noformat}
+
+ @GET
+ @Path("/container/{id}/docker/images/{name}")
+ @Produces({ MediaType.APPLICATION_JSON + "; " + JettyUtils.UTF_8,
+ MediaType.APPLICATION_XML + "; " + JettyUtils.UTF_8 })
+ public String getImageId(@PathParam("id") String id,
+ @PathParam("name") String name) {
+ DockerImagesCommand dockerImagesCommand = new DockerImagesCommand();
+ dockerImagesCommand = dockerImagesCommand.getSingleImageStatus(name);
+ PrivilegedOperationExecutor privOpExecutor =
+ PrivilegedOperationExecutor.getInstance(this.nmContext.getConf());
+ try {
+ String output = DockerCommandExecutor.executeDockerCommand(
+ dockerImagesCommand, id, null, privOpExecutor, false, nmContext);
+ String[] ids = output.substring(1, output.length()-1).split(" ");
+ String result = name;
+ for (String image : ids) {
+ String[] parts = image.split("@");
+ if (parts[0].equals(name.substring(0, parts[0].length()))) {
+ result = image;
+ }
+ }
+ return result;
+ } catch (ContainerExecutionException e) {
+ return "latest";
+ }
+ }
}
{noformat}
Doesn't the container know what image it was started with in its environment?
Why do we need to run a docker command here? If we don't care about the
container and just want to know what the sha of the image:tag is, then I agree
with [~csingh] that we don't need to use the containerId.
And if we do need to run a docker command, the for loop will give us the last
sha256 associated with that image name. But if there are many, couldn't that
be the wrong one?
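To illustrate the concern, here is a minimal sketch that collects every
matching digest instead of keeping only the last one. It assumes the docker
output has the form [repo@sha256:... repo@sha256:...], which is what the
parsing in the patch implies. RepoDigestMatcher, stripTag and matchingDigests
are hypothetical names, not part of the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the patch. */
final class RepoDigestMatcher {
  private RepoDigestMatcher() { }

  /** Strips a trailing ":tag" from an image name, leaving a registry port alone. */
  static String stripTag(String name) {
    int slash = name.lastIndexOf('/');
    int colon = name.lastIndexOf(':');
    // A colon only marks a tag when it comes after the last path separator.
    return colon > slash ? name.substring(0, colon) : name;
  }

  /**
   * Returns every "repo@sha256:..." entry whose repository equals the
   * requested image name (tag removed), in the order docker printed them.
   * Assumes output looks like "[repo@sha256:aaa repo@sha256:bbb]".
   */
  static List<String> matchingDigests(String output, String name) {
    List<String> matches = new ArrayList<>();
    String trimmed = output.trim();
    if (trimmed.length() < 2) {
      return matches;
    }
    String repo = stripTag(name);
    // Drop the surrounding brackets, then split the space-separated entries.
    for (String image : trimmed.substring(1, trimmed.length() - 1).split(" ")) {
      String[] parts = image.split("@");
      // Compare the whole repository name rather than a prefix of it.
      if (parts.length == 2 && parts[0].equals(repo)) {
        matches.add(image);
      }
    }
    return matches;
  }
}
```

With a list in hand, the caller can tell "no digest", "exactly one digest" and
"several digests" apart and decide what to do in the ambiguous case, rather
than silently returning whichever entry came last.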
{noformat}
+ Collection<org.apache.hadoop.yarn.service.api.records.Component>
{noformat}
I think we should import this instead of using the fully qualified name.
{noformat}
+ if (compSpec.getArtifact()!=null && compSpec.getArtifact()
+ .getType()==TypeEnum.DOCKER) {
{noformat}
Spacing issues on the operators: the != and == need spaces around them.
{noformat}
+ public static final String DOCKER_IMAGE_REGEX =
"^(([a-zA-Z0-9.-]+)(:\\d+)?/)?([a-z0-9_./-]+)(:[\\w.-]+)?$";
+ private static final String DOCKER_IMAGE_DIGEST_REGEX =
+ "^(([a-zA-Z0-9.-]+)(:\\d+)?/)?([a-z0-9_./-]+)(@sha256:)([a-f0-9]{6,64})";
{noformat}
The first part of both of these regexes is identical. I think we should create
a subregex and append to it to avoid having to make changes in multiple places
in the future. One is the image followed by a tag and the other is an image
followed by a sha. Should be easy to do.
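Something along these lines is what I have in mind. This is only a sketch:
DOCKER_IMAGE_NAME_REGEX and the enclosing DockerImagePatterns class are
hypothetical names, and the pattern fragments are copied from the patch
unchanged.

```java
import java.util.regex.Pattern;

/** Hypothetical holder class; only the two existing constant names come from the patch. */
final class DockerImagePatterns {
  private DockerImagePatterns() { }

  // Shared prefix: optional "registry[:port]/" followed by the repository path.
  static final String DOCKER_IMAGE_NAME_REGEX =
      "^(([a-zA-Z0-9.-]+)(:\\d+)?/)?([a-z0-9_./-]+)";

  // Image name followed by an optional tag.
  static final String DOCKER_IMAGE_REGEX =
      DOCKER_IMAGE_NAME_REGEX + "(:[\\w.-]+)?$";

  // Image name followed by a sha256 digest.
  static final String DOCKER_IMAGE_DIGEST_REGEX =
      DOCKER_IMAGE_NAME_REGEX + "(@sha256:)([a-f0-9]{6,64})";

  static boolean isTaggedImage(String image) {
    return Pattern.matches(DOCKER_IMAGE_REGEX, image);
  }

  static boolean isDigestImage(String image) {
    return Pattern.matches(DOCKER_IMAGE_DIGEST_REGEX, image);
  }
}
```

The concatenated strings are character-for-character the same as the two
regexes in the patch, so behavior should not change; only the registry and
repository part now lives in one place.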
{noformat}
@@ -1771,11 +1779,29 @@ int get_docker_images_command(const char *command_file,
const struct configurati
if (ret != 0) {
goto free_and_exit;
}
+ ret = add_to_args(args, "-f");
+ if (ret != 0) {
+ goto free_and_exit;
+ }
+ ret = add_to_args(args, "{{.RepoDigests}}");
+ if (ret != 0) {
+ goto free_and_exit;
+ }
+ } else {
+ ret = add_to_args(args, DOCKER_IMAGES_COMMAND);
+ if (ret != 0) {
+ goto free_and_exit;
+ }
+ ret = add_to_args(args, "--format={{json .}}");
+ if (ret != 0) {
+ goto free_and_exit;
+ }
+ ret = add_to_args(args, "--filter=dangling=false");
+ if (ret != 0) {
+ goto free_and_exit;
+ }
{noformat}
{noformat}
[ebadger@foo ~]$ sudo docker images --format={{json .}} --filter=dangling=false
Template parsing error: template: :1: unclosed action
[ebadger@foo ~]$ docker --version
Docker version 1.13.1, build 4ef4b30/1.13.1
{noformat}
The else clause syntax doesn't seem to work for me. Did I do something wrong?
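One possible explanation, which is an assumption on my part and not something
verified against the patch: typed unquoted at a shell prompt, the format
argument is split at the space, so docker would receive --format={{json as the
template, while add_to_args hands the whole string over as a single argument.
The sketch below only approximates shell word splitting with a whitespace
split; FormatArgSplit and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of argument splitting; it does not run docker. */
final class FormatArgSplit {
  private FormatArgSplit() { }

  /** Rough stand-in for how an unquoted command line is split into words. */
  static List<String> shellStyleSplit(String commandLine) {
    return Arrays.asList(commandLine.trim().split("\\s+"));
  }

  /** The arguments as a program builds them when it passes each one explicitly. */
  static List<String> explicitArgs() {
    return Arrays.asList(
        "docker", "images", "--format={{json .}}", "--filter=dangling=false");
  }

  public static void main(String[] args) {
    // The template ends up in two separate words here.
    System.out.println(shellStyleSplit(
        "docker images --format={{json .}} --filter=dangling=false"));
    // The template stays in one argument here.
    System.out.println(explicitArgs());
  }
}
```

If that is what happened, quoting the argument on the command line, for
example '--format={{json .}}', should make the manual test match what the C
code passes.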
This patch assumes that the client can access the Docker Registry. I'm not
super familiar with native services, but I imagine this client runs on a
gateway node somewhere outside of the cluster itself. With that, I imagine it
is possible that the cluster itself can access the Docker Registry while the
client can't. Or the Registry could require credentials to access it. Should we
make this feature optional to get around those error cases? Another possible
solution is to have the AM get the sha256 hash of the image that it is running
in and then pass that sha to all of the containers that it starts. This
would move the query into the Hadoop cluster itself.
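As a rough sketch of the AM-side idea, assuming the AM has already obtained
its own digest somehow (how it does that is left open here): rewrite matching
image references to name@sha256:... before launching containers, behind an
opt-in flag. ImagePinning, pinToDigest, pinAll and the enabled flag are all
hypothetical, not existing YARN APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names exist in the patch or in YARN. */
final class ImagePinning {
  private ImagePinning() { }

  /** Replaces any tag or digest on the image with the given "sha256:..." digest. */
  static String pinToDigest(String image, String digest) {
    int at = image.indexOf('@');
    String repo = at >= 0 ? image.substring(0, at) : image;
    int slash = repo.lastIndexOf('/');
    int colon = repo.lastIndexOf(':');
    // A colon after the last slash is a tag; one before it is a registry port.
    if (colon > slash) {
      repo = repo.substring(0, colon);
    }
    return repo + "@" + digest;
  }

  /**
   * Pins every image equal to the AM's own image to the digest the AM
   * resolved. When the feature is disabled or no digest is known, the
   * images are returned unchanged.
   */
  static List<String> pinAll(List<String> images, String amImage,
      String digest, boolean enabled) {
    List<String> result = new ArrayList<>();
    for (String image : images) {
      boolean pin = enabled && digest != null && image.equals(amImage);
      result.add(pin ? pinToDigest(image, digest) : image);
    }
    return result;
  }
}
```

Falling back to the unmodified image name when the flag is off or the digest
is unknown is one way to cover the cases where the Registry cannot be reached
or requires credentials.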
> Implement logic to keep docker image consistent in application that uses
> :latest tag
> ------------------------------------------------------------------------------------
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch,
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch,
> YARN-9292.006.patch
>
>
> A Docker image with the latest tag can run in a YARN cluster without any
> validation in node managers. If an image with the latest tag changes during
> container launch, it might produce inconsistent results between nodes. This
> surfaced toward the end of development for YARN-9184 to keep the docker image
> consistent within a job. One idea for keeping the :latest tag consistent for
> a job is to use the docker image command to find the image id and propagate
> that image id to the rest of the container requests. There are some
> challenges to overcome:
> # The latest tag does not exist on the node where the first container starts.
> The first container will need to download the latest image and find the image
> ID. This can introduce lag time for other containers to start.
> # If the image id is used to start other containers, container-executor may
> have trouble checking whether the image comes from a trusted source. Both
> image name and ID must be supplied through the .cmd file to
> container-executor. However, an attacker can supply an incorrect image id and
> defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker
> image consistent within one application.