Kai Zheng created YARN-6043:
-------------------------------
Summary: [HDL] Tensorflow on YARN
Key: YARN-6043
URL: https://issues.apache.org/jira/browse/YARN-6043
Project: Hadoop YARN
Issue Type: New Feature
Reporter: Kai Zheng
As discussed in the umbrella HADOOP-13944, we'd like to support Deep Learning
on Hadoop. As a first step, we have implemented a prototype that runs
Tensorflow on YARN. So far the work provides a tool, yarn-tf, that lets users
submit and run a Tensorflow job (say mnist.py) in a YARN cluster. It allocates
and launches a Tensorflow cluster in YARN dynamically, executes the job, and
then destroys the cluster once the work is done. It doesn't require Python or
Tensorflow binaries to be installed on the YARN nodes beforehand (on the
client host, Python is required if the job is written in Python). It doesn't
take the Docker approach. Given an existing Hadoop cluster, it's pretty easy to
run a Tensorflow job using the provided yarn-tf.jar bundle (the TF core library
and our JNI wrapper) and the yarn-tf tool.
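
For illustration only, and not taken from the prototype: distributed Tensorflow
describes a cluster as a mapping from job names (conventionally "ps" and
"worker") to lists of host:port addresses. Below is a minimal sketch of how the
addresses of dynamically allocated containers might be assembled into such a
mapping. The Container record, build_cluster_spec, task_index, and the host
names and ports are all hypothetical and are not part of yarn-tf.

```python
from collections import namedtuple

# Hypothetical record for one allocated container; not a YARN or yarn-tf type.
Container = namedtuple("Container", ["host", "port", "role"])


def build_cluster_spec(containers):
    """Group "host:port" addresses by role, e.g. "ps" or "worker"."""
    spec = {}
    for c in containers:
        spec.setdefault(c.role, []).append("%s:%d" % (c.host, c.port))
    return spec


def task_index(spec, role, address):
    """Return the position of an address within its role's address list."""
    return spec[role].index(address)


# Made-up allocation: one parameter server and two workers.
containers = [
    Container("node1", 2222, "ps"),
    Container("node2", 2222, "worker"),
    Container("node3", 2222, "worker"),
]
spec = build_cluster_spec(containers)
print(spec)
```

A mapping of this shape is what a Tensorflow cluster specification expects.
How the prototype actually collects and distributes the addresses is left to
the design document.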
In this JIRA we'll post our design document describing how we did it and the
general approach. The prototype is still being polished and will be made public
here soon. Filing this as unassigned since it's a team effort. Your thoughts and
feedback are welcome.