[
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674660#comment-16674660
]
Zhankun Tang commented on YARN-8927:
------------------------------------
[~eyang], splitting the configuration into pull and run is just a
suggestion. I agree that we should think it through and establish a good
design for point 1 so that we do not have to revisit it.
How the idea of splitting into pull and run came up might be a helpful
reference. I was thinking about how YARN-3854 would decide which repos it can
pull from and run, and the semantics of the local image setting
"docker.trusted.local.image" were unclear to me. *I now lean towards making
"docker.trusted.local.image" a white-list.*
Consider the scenario below with a boolean flag:
{code:java}
"docker.trusted.local.image" = false
"docker.trusted.registries" = "cmp1, library"
{code}
When a user requests "cmp1/img1" or "centos:latest", YARN-3854 may download
the image first if there is no local copy, because we trust "cmp1" and Docker
Hub. Then, when c-e wants to run the container, it should first check whether
"cmp1/img1" was really local.
If the image was already local before YARN-3854 pulled anything, deny it
because "docker.trusted.local.image" is false. Otherwise, allow it to run,
subject to the privilege/mount white-list check result.
This seems to require YARN to maintain a list of local images in advance in
the Java layer, because c-e is not long-running.
Passing the list to c-e and letting c-e do the check is possible, but it
seems awkward and complex. We would also need to handle NM restart and reload
the original local image names.
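To make the boolean-flag flow above concrete, here is a rough Python sketch. It
is only an illustration: the function names and the snapshot approach are
hypothetical and do not exist in YARN or c-e, and how the NM would really
obtain and persist the list of pre-existing local images is left open.

```python
# Hypothetical sketch only: none of these names exist in YARN or c-e.

def snapshot_local_images(list_local_images):
    """Record which images are local BEFORE any YARN-3854 pull happens.

    list_local_images is a caller-supplied function. Persisting the result
    across NM restarts is exactly the extra complexity discussed above.
    """
    return frozenset(list_local_images())


def may_run(image, local_before_pull, trust_local_image,
            passes_privilege_mount_check):
    """Boolean-flag semantics for "docker.trusted.local.image"."""
    if image in local_before_pull and not trust_local_image:
        # The image pre-existed locally and local images are not trusted.
        return False
    # Otherwise defer to the privilege/mount white-list check result.
    return passes_privilege_mount_check


# "cmp1/img1" was local before any pull; "centos:latest" was pulled later.
before = snapshot_local_images(lambda: ["cmp1/img1"])
print(may_run("cmp1/img1", before, False, True))      # False
print(may_run("centos:latest", before, False, True))  # True
```

The point of the sketch is that may_run cannot be answered without the
"before" snapshot, which a short-lived c-e process does not have on its own.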
So it seems "_docker.trusted.local.image_" should be a white-list to avoid
the complexity above. The names could be:
{code:java}
"docker.trusted.local.images" = "cmp1/img1,centos"
"docker.trusted.registries" = "cmp1,library"
{code}
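One possible reading of this white-list variant, sketched in Python. The
helper names are hypothetical, and two things are assumptions rather than
settled design: that a bare name such as "centos" maps to the Docker Hub
"library" namespace, and that an image is accepted when either its registry
prefix or its tag-less name is listed. Image digests ("@sha256:...") are not
handled.

```python
# Hypothetical sketch only: one possible reading of the white-list settings.

def parse_list(value):
    """Split a comma-separated config value into a set of trimmed entries."""
    return {item.strip() for item in value.split(",") if item.strip()}


def strip_tag(image):
    """Drop a trailing ":tag" from the last path component, if present."""
    head, sep, last = image.rpartition("/")
    return head + sep + last.split(":", 1)[0]


def registry_prefix(image):
    """Text before the first "/"; bare names are assumed to mean "library"."""
    prefix, sep, _ = image.partition("/")
    return prefix if sep else "library"


def is_trusted(image, trusted_local_images, trusted_registries):
    """Accept if the registry prefix or the tag-less image name is listed."""
    return (registry_prefix(image) in trusted_registries
            or strip_tag(image) in trusted_local_images)


local_images = parse_list("cmp1/img1,centos")  # "docker.trusted.local.images"
registries = parse_list("cmp1,library")        # "docker.trusted.registries"

print(is_trusted("centos:latest", local_images, registries))  # True
print(is_trusted("other/img", local_images, registries))      # False
```

With this shape the check needs no knowledge of whether the image was local
before a pull, so c-e could evaluate it from the configuration alone.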
But the above configuration still does not seem that straightforward to me,
so the configuration below came to mind:
{code:java}
"docker.pull.trusted.registries" = "cmp1,library"
"docker.run.trusted.registries" = "cmp1,library"
{code}
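A minimal Python sketch of how the pull/run split could be evaluated. The two
key names are the ones proposed above; everything else (the helper names, the
dict-based config, and treating bare image names as "library") is a
hypothetical illustration, not existing c-e behaviour. The example
deliberately uses a narrower run list than pull list to show what the split
allows.

```python
# Hypothetical sketch only: per-action trust lists for pull and run.

ACTION_KEYS = {
    "pull": "docker.pull.trusted.registries",
    "run": "docker.run.trusted.registries",
}


def registry_prefix(image):
    """Text before the first "/"; bare names are assumed to mean "library"."""
    prefix, sep, _ = image.partition("/")
    return prefix if sep else "library"


def is_allowed(config, action, image):
    """Check the image's registry prefix against the list for this action."""
    raw = config.get(ACTION_KEYS[action], "")
    trusted = {item.strip() for item in raw.split(",") if item.strip()}
    return registry_prefix(image) in trusted


# Illustrative values: "library" images may be pulled but not run.
config = {
    "docker.pull.trusted.registries": "cmp1,library",
    "docker.run.trusted.registries": "cmp1",
}

print(is_allowed(config, "pull", "centos:latest"))  # True
print(is_allowed(config, "run", "centos:latest"))   # False
print(is_allowed(config, "run", "cmp1/img1"))       # True
```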
Please correct me if I missed something important. I have no strong opinion
on either configuration. Any thoughts? [~eyang], [~ebadger]
> Better handling of "docker.trusted.registries" in container-executor's
> "trusted_image_check" function
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Labels: Docker
>
> There are some missing cases that we need to catch when handling
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works when running DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]",
> "ubuntu" or "ubuntu[:tagName]" fails with an error message like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.