Chen Qingcha created YARN-7481:
----------------------------------
Summary: GPU locality support for better AI scheduling
Key: YARN-7481
URL: https://issues.apache.org/jira/browse/YARN-7481
Project: Hadoop YARN
Issue Type: New Feature
Components: api, RM, yarn
Affects Versions: 2.7.2
Reporter: Chen Qingcha
Fix For: 2.7.2
We have enhanced Hadoop with GPU support for better AI job scheduling.
YARN-3926 already supports GPU scheduling, but it treats GPUs as a
countable resource.
GPU placement, however, is very important to the efficiency of deep learning
jobs.
For example, a 2-GPU job running on GPUs {0, 1} could be faster than one
running on GPUs {0, 7} if GPUs 0 and 1 are under the same PCI-E switch while
0 and 7 are not.
We added GPU support to Hadoop 2.7.2 to enable GPU locality scheduling,
which supports fine-grained GPU placement. A 64-bit bitmap is added to the YARN
Resource; it indicates both GPU usage and locality information within a node
(up to 64 GPUs per node). Each bit position corresponds to one GPU: '1' means
the GPU is available and '0' means it is not.
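
To illustrate the bitmap idea, here is a minimal Java sketch. It is not the
code from the patch: the class GpuBitmap, all of its method names, and the
representation of PCI-E switch groups as bit masks are assumptions made for
this example only. It shows how a 64-bit long could track free GPUs and how a
request could prefer GPUs that share one switch.

```java
// Hypothetical helper, not part of YARN: bit i set to 1 means GPU i is free.
final class GpuBitmap {
    private GpuBitmap() {}

    /** Bitmap with the lowest gpuCount bits set, i.e. every GPU is free. */
    public static long allFree(int gpuCount) {
        if (gpuCount < 0 || gpuCount > 64) {
            throw new IllegalArgumentException("gpuCount must be in [0, 64]");
        }
        return gpuCount == 64 ? -1L : (1L << gpuCount) - 1;
    }

    /** Builds a mask from explicit GPU indices, e.g. maskOf(0, 1). */
    public static long maskOf(int... gpuIds) {
        long mask = 0L;
        for (int id : gpuIds) {
            if (id < 0 || id > 63) {
                throw new IllegalArgumentException("GPU id out of range: " + id);
            }
            mask |= 1L << id;
        }
        return mask;
    }

    /** True if every requested GPU is currently free. */
    public static boolean canSatisfy(long available, long requested) {
        return (available & requested) == requested;
    }

    /** Clears the requested bits; fails if any requested GPU is busy. */
    public static long allocate(long available, long requested) {
        if (!canSatisfy(available, requested)) {
            throw new IllegalStateException("requested GPUs are not all free");
        }
        return available & ~requested;
    }

    /** Sets the released bits back to 1 (free). */
    public static long release(long available, long released) {
        return available | released;
    }

    /** Number of free GPUs, which is the countable view of the bitmap. */
    public static int freeCount(long available) {
        return Long.bitCount(available);
    }

    /**
     * Picks count free GPUs, preferring ones inside a single switch group.
     * switchGroups is an assumed topology description: one mask per switch.
     * Falls back to any free GPUs; returns 0 if fewer than count are free.
     */
    public static long pickLocal(long available, int count, long[] switchGroups) {
        for (long group : switchGroups) {
            long candidates = available & group;
            if (Long.bitCount(candidates) >= count) {
                return lowestBits(candidates, count);
            }
        }
        return Long.bitCount(available) >= count ? lowestBits(available, count) : 0L;
    }

    /** Returns a mask of the count lowest set bits of mask. */
    private static long lowestBits(long mask, int count) {
        long picked = 0L;
        for (int i = 0; i < count; i++) {
            long bit = Long.lowestOneBit(mask);
            picked |= bit;
            mask &= ~bit;
        }
        return picked;
    }

    public static void main(String[] args) {
        long free = allFree(8);
        // Assumed topology: GPUs 0-3 under one switch, GPUs 4-7 under another.
        long[] switches = { maskOf(0, 1, 2, 3), maskOf(4, 5, 6, 7) };
        free = allocate(free, maskOf(1, 2, 3));
        long picked = pickLocal(free, 2, switches);
        System.out.println("picked mask: " + Long.toBinaryString(picked));
    }
}
```

In this sketch a plain count of free GPUs is still available through
freeCount, so the bitmap carries the countable information plus the exact
placement.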