[ 
https://issues.apache.org/jira/browse/YARN-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258594#comment-15258594
 ] 

Daniel Templeton commented on YARN-4122:
----------------------------------------

This seems like a generally reasonable approach.  From the SLURM lists, it looks 
like the environment variable did not work correctly prior to CUDA 7:

https://devtalk.nvidia.com/default/topic/512869/cuda-accessing-all-devices-even-those-which-are-blacklisted/?offset=2

This design will probably also have to adjust for the work being done in 
YARN-4726.

In the doc you say that YARN is currently providing you GPU isolation.  How are 
you making that work?

> Add support for GPU as a resource
> ---------------------------------
>
>                 Key: YARN-4122
>                 URL: https://issues.apache.org/jira/browse/YARN-4122
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: GPUAsAResourceDesign.pdf
>
>
> Use [cgroups 
> devices|https://www.kernel.org/doc/Documentation/cgroups/devices.txt] to 
> isolate GPUs for containers. For docker containers, we could use 'docker run 
> --device=...'.
> Reference: [SLURM Resources isolation through 
> cgroups|http://slurm.schedmd.com/slurm_ug_2011/SLURM_UserGroup2011_cgroups.pdf].
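
As a rough illustration of the isolation idea in the description above, here is a minimal sketch that builds the cgroup v1 devices entries and the docker --device flags for a set of GPUs. The function names are hypothetical and not part of YARN. It assumes the GPUs appear as /dev/nvidiaN character devices with major number 195 and minor number N, which is the usual NVIDIA layout on Linux but should be verified with `ls -l /dev/nvidia*`.

```python
# Sketch only: helper names are hypothetical, not YARN or Docker APIs.
# Assumption: NVIDIA GPUs are char devices /dev/nvidiaN with major 195, minor N.
NVIDIA_MAJOR = 195


def cgroup_device_rules(allowed, all_gpus):
    """Return (control_file, entry) pairs for a cgroup v1 devices controller.

    GPUs whose minor number is in `allowed` get a devices.allow entry;
    every other GPU in `all_gpus` gets a devices.deny entry.
    """
    rules = []
    for minor in all_gpus:
        control_file = "devices.allow" if minor in allowed else "devices.deny"
        # Entry format is "<type> <major>:<minor> <access>"; rwm = read, write, mknod.
        rules.append((control_file, f"c {NVIDIA_MAJOR}:{minor} rwm"))
    return rules


def docker_device_flags(allowed):
    """Return one 'docker run' --device flag per allowed GPU, in minor order."""
    return [f"--device=/dev/nvidia{minor}" for minor in sorted(allowed)]


if __name__ == "__main__":
    # Example: a node with four GPUs where the container is granted GPU 1 only.
    for control_file, entry in cgroup_device_rules({1}, range(4)):
        print(control_file, entry)
    print(" ".join(docker_device_flags({1})))
```

Each entry would be written to the named control file in the container's devices cgroup directory. A real setup would likely also need to expose shared control devices such as /dev/nvidiactl, which this sketch leaves out.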



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
