[ 
https://issues.apache.org/jira/browse/YARN-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7224:
-----------------------------
    Attachment: YARN-7224.008.patch

Thanks [~sunilg] for the comments,

bq. In assignGpus, do we also need to update the assigned gpus to container's 
resource mapping list ?
I would prefer to keep this in NMStateStore#storeAssignedResources; otherwise every 
new resource plugin would need to implement such logic itself.
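
A minimal sketch of the idea, purely for illustration (the allocator name and the 
exact {{storeAssignedResources}} signature below are assumptions, not the patch code):

{code:java}
// Illustrative sketch only: GPU assignment is persisted once through the NM
// state store, so NM-restart recovery does not need plugin-specific logic.
List<GpuDevice> assigned = gpuAllocator.assignGpus(container); // pick free GPUs (hypothetical helper)
nmStateStore.storeAssignedResources(
    container.getContainerId(),              // container that owns the devices
    "yarn.io/gpu",                           // resource type key (assumed constant)
    new ArrayList<Serializable>(assigned));  // assumes GpuDevice is Serializable
{code}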

bq. In general dockerCommandPlugin.updateDockerRunCommand helps to update the 
docker command for volumes etc. However, is it better to have an api named 
sanitize/verifyCommand in dockerCommandPlugin so that the incoming/created command 
will be validated and logged based on system parameters?
I'm not quite sure what you mean here; could you elaborate?

bq. Once a docker volume is created, when this volume will be cleaned or 
unmounted ? in case when container crashes or force stopping container from 
external docker commands etc
bq. With container upgrades or partially using GPU device for a timeslice of 
container lifetime, how volumes could be mounted/re-mounted ?
For the GPU docker integration we don't need this: all launched containers share the 
same docker volume, so we don't need to create the docker volume again and again. I 
agree that we may need it in the future, so I added a method 
(getCleanupDockerVolumeCommand) to the DockerCommandPlugin interface, as sketched 
below.
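
To make the interface shape concrete, roughly (a sketch only; the other method names 
and signatures are approximations of what the patch is aiming at, not exact code):

{code:java}
// Rough shape of the plugin hook points discussed above; signatures are
// approximations for illustration.
public interface DockerCommandPlugin {
  // Mutate the docker run command, e.g. add --device flags / volume mounts.
  void updateDockerRunCommand(DockerRunCommand dockerRunCommand,
      Container container) throws ContainerExecutionException;

  // Return a "docker volume create" command if the plugin needs a volume
  // (e.g. the nvidia-docker-plugin driver volume), or null if none is needed.
  DockerVolumeCommand getCreateDockerVolumeCommand(Container container)
      throws ContainerExecutionException;

  // New in ver.8: cleanup counterpart. For GPU this can be a no-op today
  // (the driver volume is shared by all containers), but plugins that create
  // per-container volumes can return a cleanup command here.
  DockerVolumeCommand getCleanupDockerVolumeCommand(Container container)
      throws ContainerExecutionException;
}
{code}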

bq. In GpuDevice, do we also need to add make (like nvidia with version etc ? )
We don't need it for now; we can easily add it later when required.

bq. In initializeWhenGpuRequested, we do a lazy initialization. However if 
docker end point is down(default port), this could cause delay in container 
launch. Do we need a health mechanism to get this data updated ?
To me this is the same situation as the docker daemon being down. Since containers 
will fail fast, the admin should be able to notice and fix the issue.

bq. Once docker volume is created, its better to dump the docker volume inspect 
o/p on created volume. Could help for debugging later.
I like this idea, but considering the size of this patch, can we do it in a 
follow-up JIRA?

Attached ver.8 patch.

> Support GPU isolation for docker container
> ------------------------------------------
>
>                 Key: YARN-7224
>                 URL: https://issues.apache.org/jira/browse/YARN-7224
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-7224.001.patch, YARN-7224.002-wip.patch, 
> YARN-7224.003.patch, YARN-7224.004.patch, YARN-7224.005.patch, 
> YARN-7224.006.patch, YARN-7224.007.patch, YARN-7224.008.patch
>
>
> This patch addresses the following issues when a docker container is used:
> 1. GPU driver and nvidia libraries: if GPU drivers and NVIDIA libraries are 
> pre-packaged inside the docker image, they can conflict with the driver and 
> nvidia libraries installed on the host OS. An alternative is to detect the 
> drivers and devices installed on the host OS and mount them when the docker 
> container is launched. Please refer to \[1\] for more details. 
> 2. Image detection: 
> From \[2\], the challenge is: 
> bq. Mounting user-level driver libraries and device files clobbers the 
> environment of the container, it should be done only when the container is 
> running a GPU application. The challenge here is to determine if a given 
> image will be using the GPU or not. We should also prevent launching 
> containers based on a Docker image that is incompatible with the host NVIDIA 
> driver version, you can find more details on this wiki page.
> 3. GPU isolation.
> *Proposed solution*:
> a. Use nvidia-docker-plugin \[3\] to address issue #1; this is the same 
> solution used by K8S \[4\]. Issue #2 can be addressed in a separate JIRA.
> We won't ship nvidia-docker-plugin with our releases; we require the cluster 
> admin to preinstall nvidia-docker-plugin to use GPU+docker support on YARN. 
> "nvidia-docker" is a wrapper around the docker binary which could address #3 
> as well; however, it doesn't provide the same semantics as docker and needs 
> additional environment setup (such as PATH/LD_LIBRARY_PATH) to use. To avoid 
> introducing additional issues, we plan to use the nvidia-docker-plugin + 
> docker binary approach.
> b. To address the GPU driver and nvidia libraries, we use nvidia-docker-plugin 
> \[3\] to create a volume which includes the GPU-related libraries and mount it 
> when the docker container is launched. Changes include (see the sketch after 
> this list): 
> - Instead of using {{volume-driver}}, this patch adds a {{docker volume 
> create}} command to c-e and the NM Java side. The reason is that 
> {{volume-driver}} can only use a single volume driver per launched docker 
> container.
> - Updated {{c-e}} and the Java side so that if a mounted volume is a named 
> docker volume, file-existence checking is skipped. (Named volumes still need 
> to be added to the permitted list in container-executor.cfg.)
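> A hedged sketch of what the NM-side volume handling roughly looks like (the 
> class and method names here are assumptions for illustration, not the exact 
> patch code):
> {code:java}
> // Illustrative only: ask the plugin for a "docker volume create" command and
> // run it before the container is launched; the named volume is created by the
> // nvidia-docker-plugin volume driver and shared by all GPU containers.
> DockerVolumeCommand createVolume = new DockerVolumeCommand("create"); // assumed class
> createVolume.setVolumeName("nvidia_driver_<version>"); // volume exposed by nvidia-docker-plugin
> createVolume.setDriverName("nvidia-docker");           // volume driver registered by the plugin
> runDockerVolumeCommand(createVolume, container);       // executed via c-e (hypothetical helper)
> // The named volume is later mounted read-only into the container, e.g.
> //   -v nvidia_driver_<version>:/usr/local/nvidia:ro
> {code}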
> c. To address the isolation issue:
> We found that cgroups + docker doesn't work under newer docker versions which 
> use {{runc}} as the default runtime: setting {{--cgroup-parent}} to a cgroup 
> which includes any {{devices.deny}} rule causes the docker container launch to 
> fail. Instead, this patch passes the allowed GPU devices to the docker launch 
> command via {{--device}}, as sketched below.
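> A minimal sketch of that step (the {{addDevice}} method name and the 
> device-path construction are assumptions for illustration):
> {code:java}
> // Illustrative only: each GPU assigned to the container is whitelisted on the
> // docker run command via --device, instead of relying on cgroup devices.deny.
> for (GpuDevice gpu : allowedGpus) {
>   String devicePath = "/dev/nvidia" + gpu.getMinorNumber(); // e.g. /dev/nvidia0
>   dockerRunCommand.addDevice(devicePath, devicePath);       // --device=/dev/nvidia0:/dev/nvidia0
> }
> {code}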
> References:
> \[1\] https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
> \[2\] https://github.com/NVIDIA/nvidia-docker/wiki/Image-inspection
> \[3\] https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker-plugin
> \[4\] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/


