[ 
https://issues.apache.org/jira/browse/YARN-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498199#comment-16498199
 ] 

Eric Yang commented on YARN-8220:
---------------------------------

[~sunilg] Thank you for the patch. A couple of suggestions:

1. Avoid using a bash-style launch command.  It works, but using ENTRYPOINT 
and CMD in the Dockerfile greatly improves security and readability.  For 
example:

{code}
WORKDIR /test/models/tutorials/image/cifar10_estimator
ENTRYPOINT ["/usr/bin/python", "cifar10_main.py"]
# Only the last CMD in a Dockerfile takes effect, so all default
# arguments go into a single CMD instruction.
CMD ["--data-dir=hdfs:///tmp/cifar-10-data", \
     "--job-dir=hdfs:///tmp/cifar-10-jobdir", \
     "--train-steps=10000", \
     "--eval-batch-size=16", \
     "--train-batch-size=16", \
     "--sync", \
     "--num-gpus=2"]
{code}

This simplifies the yarnfile and prevents the script from running in the 
wrong directory if the working directory doesn't exist.
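
To illustrate why ENTRYPOINT and CMD pair well, here is a rough Python sketch 
of how Docker assembles a container's argv from exec-form instructions: the 
last CMD in the Dockerfile supplies the default arguments, and they are 
appended to ENTRYPOINT unless arguments are given at run time.  The function 
name effective_argv is hypothetical and only models the documented behaviour; 
it is not part of Docker or YARN.

```python
def effective_argv(entrypoint, cmds, run_args=None):
    """Model Docker's exec-form rule for building a container's argv.

    entrypoint: list of strings from the ENTRYPOINT instruction.
    cmds: CMD instructions in file order, each a list of strings.
    run_args: optional arguments supplied at run time; they replace CMD.
    """
    # Only the last CMD in a Dockerfile takes effect.
    default_args = cmds[-1] if cmds else []
    args = default_args if run_args is None else run_args
    return list(entrypoint) + list(args)


entrypoint = ["/usr/bin/python", "cifar10_main.py"]
default_cmd = [
    "--data-dir=hdfs:///tmp/cifar-10-data",
    "--job-dir=hdfs:///tmp/cifar-10-jobdir",
    "--train-steps=10000",
    "--eval-batch-size=16",
    "--train-batch-size=16",
    "--sync",
    "--num-gpus=2",
]

# Defaults baked into the image.
print(effective_argv(entrypoint, [default_cmd]))

# Overriding the defaults at launch keeps the entry point fixed.
print(effective_argv(entrypoint, [default_cmd], run_args=["--train-steps=100"]))
```

Because the entry point stays fixed, only the arguments can be changed at 
launch time, which is where the security benefit comes from.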

2. It might be good to showcase some yarnfile features:

{code}
{
..
  "configuration": {
    "env": {
      "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":"/etc/hadoop/conf:/etc/hadoop/conf:ro",
      "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true"
    }
  }
..
}
{code}

This shows how to mount configuration files from the host's disks and how to 
enable ENTRYPOINT support by setting 
YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE to true.
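
To make the mount syntax concrete, the following Python sketch splits the 
value of YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS into its parts.  It assumes the 
comma-separated source:dest:mode format with all three fields present and a 
mode of "ro" or "rw"; parse_mounts is a hypothetical helper for illustration, 
not a YARN API.

```python
def parse_mounts(value):
    """Split a comma-separated list of source:dest:mode mounts into tuples.

    Assumption: every entry has exactly three colon-separated fields and
    the mode is either "ro" (read-only) or "rw" (read-write).
    """
    mounts = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        if len(parts) != 3:
            raise ValueError("expected source:dest:mode, got %r" % entry)
        source, dest, mode = parts
        if mode not in ("ro", "rw"):
            raise ValueError("mode must be 'ro' or 'rw', got %r" % mode)
        mounts.append((source, dest, mode))
    return mounts


# The env section from the yarnfile fragment, as a plain dictionary.
env = {
    "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": "/etc/hadoop/conf:/etc/hadoop/conf:ro",
    "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true",
}

mounts = parse_mounts(env["YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS"])
for source, dest, mode in mounts:
    print("host %s -> container %s (%s)" % (source, dest, mode))
```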

3. Downloading source code from individual GitHub contributors might be risky 
and prone to breakage.  If the source is small enough and has been donated to 
Apache, it would be better to host it locally.

> Running Tensorflow on YARN with GPU and Docker - Examples
> ---------------------------------------------------------
>
>                 Key: YARN-8220
>                 URL: https://issues.apache.org/jira/browse/YARN-8220
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8220.001.patch
>
>
> Tensorflow could be run on YARN and could leverage YARN's distributed 
> features.
> This spec file will help to run Tensorflow on YARN with GPU/Docker



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
