Software Defined Infrastructure, IBM Systems
----- Original message -----
From: "Huette, Antoine" <antoine.hue...@bechtle.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] nvidia driver on stateless cluster
Date: Wed, Nov 21, 2018 6:45 PM
Hello Bin,
I don't understand your question.
Besides, I'm talking about the driver, not CUDA.
Best Regards
Antoine Huette

-------- Original message --------
From: Bin XA Xu <bx...@cn.ibm.com>
Date: 20/11/2018 02:46 (GMT+01:00)
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] nvidia driver on stateless cluster

xCAT only officially supports the package-based installation for CUDA. Could you tell us what made you choose the runfile to install the NVIDIA driver?

Bin Xu
HPC Software Development
Software Defined Infrastructure, IBM Systems
Phone: 86-010-82454067
E-mail: bx...@cn.ibm.com
----- Original message -----
From: "Huette, Antoine" <antoine.hue...@bechtle.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] nvidia driver on stateless cluster
Date: Mon, Nov 19, 2018 10:57 PM
Hello Frank,
I was able to get the driver installed and working with the second option :)
Thank you!
Best Regards
Antoine Huette

-------- Original message --------
From: "Heckes Frank (CI/OSB4)" <frank.hec...@de.bosch.com>
Date: 16/11/2018 14:29 (GMT+01:00)
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] nvidia driver on stateless cluster

Hello,
I suppose you mean a node equipped with NVIDIA GPU(s).
There's one option I currently use to install the driver in the image of a RHEL/CentOS node.
On a node with the kernel-devel RPM of the target node installed (might be MN or a build host of sorts),
run the downloaded driver:
./NVIDIA-Linux-x86_64-390.87.run --add-this-kernel
The node doesn't have to be the target node.
This will create a self-extracting file customized for the kernel running on your target node: ./NVIDIA-Linux-x86_64-390.87-custom.run
In case that kernel isn't running on the 'build' node, you can specify the kernel version and source directory via command-line options (see the --advanced-options output).
You can now run this version from a postscript. The file can live on a network filesystem share, or inside the image and be deleted afterwards. Run:
./NVIDIA-Linux-x86_64-390.87-custom.run -x; cd NVIDIA-Linux-x86_64-390.87-custom; ./nvidia-installer -s
You need to blacklist the nouveau driver in the diskless boot beforehand.
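The blacklisting step could be sketched as below. This is a minimal sketch, not something from the thread: the function name blacklist_nouveau and the file name blacklist-nouveau.conf are made up for illustration, and the image root passed in is an assumption (for an xCAT stateless image it is usually a directory such as /install/netboot/<osver>/<arch>/<profile>/rootimg). Depending on how the initrd is built, nouveau may also have to be kept out of it.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe blacklist entry for nouveau
# into an image root directory given as the first argument.
blacklist_nouveau() {
    rootimg="$1"    # assumed path to the image root, e.g. .../rootimg
    mkdir -p "$rootimg/etc/modprobe.d" || return 1
    # Prevent nouveau from loading and disable its kernel mode setting.
    cat > "$rootimg/etc/modprobe.d/blacklist-nouveau.conf" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
}
```

It would be called once per image before repacking it, for example: blacklist_nouveau /path/to/rootimg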
Another possibility is to use DKMS with the NVIDIA installer. You'd need to chroot into the image manually (bind-mounting /dev, /proc and /sys) and run the installer with the --dkms option.
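A rough sketch of that chroot approach follows. The function names, the DRY_RUN switch and both path arguments are assumptions added for illustration; only the bind mounts, the -s flag and the --dkms flag come from the description above. By default the script only prints the commands it would run; set DRY_RUN=0 and run it as root to execute them. Because the installer inside a chroot sees the build host's running kernel, the kernel version will probably have to be given explicitly (see the --advanced-options output).

```shell
#!/bin/sh
# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: bind-mount /dev, /proc and /sys into the image,
# run the installer silently with DKMS inside the chroot, then unmount.
install_in_chroot() {
    rootimg="$1"      # assumed path to the image root
    installer="$2"    # path to the .run file as seen from inside the chroot
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$rootimg/$fs" || return 1
    done
    run chroot "$rootimg" sh "$installer" -s --dkms
    rc=$?
    # Unmount in reverse order, even if the installer failed.
    for fs in sys proc dev; do
        run umount "$rootimg/$fs"
    done
    return $rc
}
```

In dry-run mode, install_in_chroot /path/to/rootimg /tmp/NVIDIA-Linux-x86_64-390.87.run lists the mount, chroot and umount commands so they can be reviewed first.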
Best regards
Frank Heckes
CI Operations - Server Services Sun Solaris, Linux (CI/OSB4)
frank.hec...@de.bosch.com
From: Huette, Antoine <antoine.hue...@bechtle.com>
Sent: Friday, November 16, 2018 12:36
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: [xcat-user] nvidia driver on stateless cluster
Hello,
On a stateless CentOS 7.5 cluster with Quadro GPUs, I need to install the NVIDIA driver. I'm using the runfile downloaded from the NVIDIA website.
What is the suggested procedure? Is it better to install the driver in the osimage, or should I make the installer run when the nodes start?
The problem I see with the first option is that the driver installer checks whether a GPU is present in the system, so I'm not sure this method can work.
The problem with the second method is that, having tried it, I find it very difficult to get a working X server with a GNOME desktop. The driver installer needs the node to be in runlevel 3 (multi-user.target), but once the driver is installed I need to switch to runlevel 5 (graphical.target), which almost never works. So far the only way I've found is to install the driver manually on a freshly booted node, run nvidia-xconfig to populate the xorg.conf file, and then restart the GNOME services.
Any help on this subject would be much appreciated! 😊
Best regards,
Antoine Huette
HPC Engineer
antoine.hue...@bechtle.com | 03.67.07.97.37/07.72.31.82.12 | bechtle.fr |
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user