[ 
https://issues.apache.org/jira/browse/YARN-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-8821:
-------------------------------
    Description: 
GPU topology affects performance dramatically. This was previously discussed in 
YARN-7481, but we'd like to move the related discussion here.

Please note that YARN-8851 will provide a pluggable device framework that can 
support a plugin's custom scheduler. Based on that framework, the GPU plugin 
could have its own topology scheduler. The proposed patch implements a topology 
algorithm as follows:
 # When the plugin initializes, it parses the output of "nvidia-smi topo -m" to 
build a hash map whose keys are all pairs of GPUs and whose values are the 
communication cost between the two GPUs of each pair. The map looks like 
\{"0 - 1"=> 2, "0 - 2"=>4, ...}, which means the minimum cost between GPU 0 and 
GPU 1 is 2. The cost is set based on the connection type. CPU affinity and NUMA 
nodes are not considered yet.
 # It then constructs a cost table that caches every combination of GPUs and 
the corresponding cost. The cost table is a map structured like 
\{2=>{[0,1]=>2,..}, 3=>\{[0,1,2]=>10,..}, 4=>\{[0,1,2,3]=>18}}. Each key of 
the map is a GPU count, and each value is a map whose key is a combination of 
that many GPUs and whose value is the calculated communication cost of that 
combination. The cost of a combination is the sum of the costs of all distinct 
GPU pairs in it. For instance, the total cost of GPUs [0,1,2] is the sum of the 
costs of "0 - 1", "0 - 2" and "1 - 2", each of which is looked up in the map 
built in step 1.
 # After the cost table is built, GPUs can be allocated based on topology. We 
provide two policies, which a container can select through an environment 
variable "".
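
The cost table from step 2 can be sketched as follows. This is a minimal 
illustration in Python, not the code in the attached patch. The names 
PAIR_COST, group_cost and build_cost_table are hypothetical, tuple keys stand 
in for the "0 - 1" string keys, and only the costs for "0 - 1" and "0 - 2" 
come from the description above; the remaining pair costs are assumed values 
chosen so that the totals match the example table.

```python
from itertools import combinations

# Step 1 output (assumed): pairwise communication cost between GPUs.
# Only (0, 1) -> 2 and (0, 2) -> 4 appear in the description; the other
# values are made up so the totals reproduce the example cost table.
PAIR_COST = {
    (0, 1): 2, (0, 2): 4, (0, 3): 4,
    (1, 2): 4, (1, 3): 2, (2, 3): 2,
}


def group_cost(gpus, pair_cost):
    """Sum the cost of every distinct (unordered) GPU pair in the group."""
    return sum(pair_cost[pair] for pair in combinations(sorted(gpus), 2))


def build_cost_table(gpu_ids, pair_cost):
    """Return {gpu_count: {gpu_combination: total_cost}} for counts >= 2."""
    ordered = sorted(gpu_ids)
    table = {}
    for count in range(2, len(ordered) + 1):
        table[count] = {
            combo: group_cost(combo, pair_cost)
            for combo in combinations(ordered, count)
        }
    return table


cost_table = build_cost_table([0, 1, 2, 3], PAIR_COST)
print(cost_table[2][(0, 1)])        # cost of the pair [0,1]
print(cost_table[3][(0, 1, 2)])     # "0 - 1" + "0 - 2" + "1 - 2"
print(cost_table[4][(0, 1, 2, 3)])  # sum over all six pairs
```

Because the table enumerates every combination, its size grows 
combinatorially with the number of GPUs, which is presumably acceptable for 
the small GPU counts found on a single node.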

  was:
GPU topology affects performance dramatically. This was previously discussed in 
YARN-7481, but we'd like to move the related discussion here.

Please note that YARN-8851 will provide a pluggable device framework that can 
support a plugin's custom scheduler. Based on that framework, the GPU plugin 
could have its own topology scheduler.


> GPU hierarchy/topology scheduling support
> -----------------------------------------
>
>                 Key: YARN-8821
>                 URL: https://issues.apache.org/jira/browse/YARN-8821
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8821-trunk.001.patch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
