[
https://issues.apache.org/jira/browse/YARN-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623970#comment-16623970
]
Wangda Tan commented on YARN-8563:
----------------------------------
Thanks [~liuxun323],
The primary purpose of this ticket is to avoid making users build a Docker
image every time.
Specifying a pre-baked image is easy, but customizing an image is hard. IMO,
data scientists need to update their programs frequently to run experiments,
which unavoidably requires changing dependencies in some cases. I spoke to
some data scientists (DS), and many of them prefer not to build Docker images
themselves.
I agree with you that in many cases they can use a pre-baked image. But
providing an option that lets them specify dependencies, instead of finding
the Dockerfile and rebuilding the Docker image, is definitely a cheaper
solution for both the DS and the underlying system. So I would view this as a
combination of base image + dependencies.
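To make the "base image + dependencies" idea concrete, here is a minimal
sketch of how a launch command could be assembled. It is not Submarine's
actual implementation: the names TrainingEnv and build_launch_command are
hypothetical, and it assumes the base image already ships pip and Python.

```python
import shlex
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingEnv:
    """Hypothetical job spec: a prebuilt base image plus extra dependencies."""
    base_image: str  # e.g. "tf-1.8.0-python3:latest"
    dependencies: List[str] = field(default_factory=list)  # pip specs or .whl paths


def build_launch_command(env: TrainingEnv, train_cmd: str) -> str:
    """Return the shell command to run inside a container of env.base_image.

    Dependencies are installed at container start, so the image itself is
    never rebuilt. Assumption: pip is available in the base image.
    """
    steps = []
    if env.dependencies:
        # Quote each entry so unusual characters cannot break the command.
        deps = " ".join(shlex.quote(d) for d in env.dependencies)
        steps.append(f"pip install --user {deps}")
    steps.append(train_cmd)
    return " && ".join(steps)
```

With this shape, changing a dependency only changes the job spec, and the
base image stays shared across users.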
I also agree that specifying the version of TF / Python may not help here,
since we can encode it in the Docker image name, such as
tf-1.8.0-python3:latest.
Another thing we haven't addressed is how to localize the user's code to the
training environment. I don't think it is a good idea to ask users to put
training code into the Docker image. Instead, they can provide a path to a zip
on HDFS/S3, and YARN can download and unpack it.
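As a rough sketch of the unpack step only (the download from HDFS/S3 is left
out, and the archive is assumed to already be on local disk), something like
the following would do. The function name unpack_training_code is
hypothetical and is not part of YARN or Submarine.

```python
import zipfile
from pathlib import Path
from typing import List


def unpack_training_code(archive: Path, work_dir: Path) -> List[str]:
    """Extract a user-supplied zip of training code into work_dir.

    Assumption: the archive has already been fetched to local disk.
    Returns the sorted list of member names found in the archive.
    """
    work_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(work_dir)
        return sorted(zf.namelist())
```

The working directory could then be mounted into the container so the
training command finds the code without it being baked into the image.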
> [Submarine] Support users to specify Python/TF package/version/dependencies
> for training job.
> ---------------------------------------------------------------------------------------------
>
> Key: YARN-8563
> URL: https://issues.apache.org/jira/browse/YARN-8563
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Priority: Major
>
> YARN-8561 assumes all Python / TensorFlow dependencies will be packed into
> the Docker image. In practice, users don't want to build Docker images.
> Instead, users can provide Python packages / dependencies (like .whl) and
> the Python and TF versions, and Submarine can localize the specified
> dependencies to prebuilt base Docker images.