Journey to Deep Learning: CUDA GPU passthrough to an LXC container

Step 2 – Host GPU configuration

Let’s go to the Proxmox command line.

There are two ways: either click one of the Shell buttons in the Proxmox web UI, or SSH into the server (SSH is enabled by default).
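If you go the SSH route, a minimal session looks like the following; 192.0.2.10 is a placeholder, replace it with your server’s address (Proxmox permits root login over SSH out of the box):

$ ssh root@192.0.2.10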

We will follow the instructions on the Debian wiki for the NVIDIA drivers and adapt them to our case.

Open /etc/apt/sources.list with your favorite editor, nano or vi. If you have never used vi, now is the wrong time to start.

$ nano /etc/apt/sources.list

Add the free PVE (Proxmox) repository, which is needed to get the kernel headers used to compile the NVIDIA module, and add jessie-backports to get the latest NVIDIA driver:

# security updates
deb http://security.debian.org jessie/updates main contrib

# PVE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian jessie pve-no-subscription

# jessie-backports
deb http://httpredir.debian.org/debian jessie-backports main contrib non-free

Save the file.

Refresh the package lists
Upgrade every package to its latest version (especially the kernel)

$ apt-get update
$ apt-get dist-upgrade

If the kernel was updated, reboot:

$ shutdown -r now

Reopen a console
Check your running kernel version
List the available PVE header packages

$ uname -r
$ apt-cache search pve-header

Install the matching headers package, which should be the latest one since you just dist-upgraded and rebooted:

$ apt-get install pve-headers-4.4.35-2-pve
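If you would rather not copy the version by hand, the header package name follows the running kernel’s release string, so (assuming this naming convention holds on your Proxmox version) the shell can fill it in for you:

$ apt-get install pve-headers-$(uname -r)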

Install the NVIDIA driver from jessie-backports:

$ apt-get install -t jessie-backports nvidia-driver
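On Debian this package builds the kernel module through DKMS; a quick way to check that the build succeeded for your kernel (assuming the DKMS tooling was pulled in as a dependency, which it normally is) is:

$ dkms status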

All the mandatory packages are now in place. I also suggest installing monitoring tools, namely nvidia-smi, i7z, htop, iotop and lm-sensors:
– i7z is a monitoring tool for Intel CPUs (temperature, frequency)
– nvidia-smi monitors the NVIDIA GPU; it is also what we will use to confirm the GPU setup
– htop is an ncurses top (CPU and RAM monitoring)
– iotop monitors disk access speed and potential contention
– lm-sensors reads the motherboard temperature and fan sensors

$ apt-get install i7z nvidia-smi htop iotop lm-sensors
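Once installed, a quick smoke test is to list the GPUs the driver can see; if the module loads correctly it prints one line per GPU:

$ nvidia-smi -L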

One last thing before the last reboot of your server’s life:
The NVIDIA driver will create special files in /dev/* called:
– /dev/nvidia0 : corresponds to the GPU (a second GPU would get /dev/nvidia1)
– /dev/nvidiactl : the control device, opened by every process that talks to the driver
– /dev/nvidia-uvm : the unified memory device used by CUDA

and sometimes you will also get:
– /dev/nvidia-uvm-tools
– /dev/nvidia-modeset

These files are created when the NVIDIA module is loaded (that is, when an application accesses the GPU).
The NVIDIA module must be loaded on the host before it can be used in a container.
So to make sure it is loaded at boot, edit /etc/modules-load.d/modules.conf:

$ nano /etc/modules-load.d/modules.conf

Add nvidia and nvidia_uvm

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
nvidia
nvidia_uvm
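If you prefer a non-interactive approach, the same two lines can be appended from the shell (a minimal sketch, equivalent to the manual edit above):

$ printf 'nvidia\nnvidia_uvm\n' >> /etc/modules-load.d/modules.conf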

Then update the initramfs so it takes the new modules into account:

$ update-initramfs -u

For some reason, nvidia and nvidia_uvm do not automatically create their nodes in /dev/*.
They only appear when an X server or nvidia-smi is run, so add the following to /etc/udev/rules.d/70-nvidia.rules:

# /etc/udev/rules.d/70-nvidia.rules

# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"

# Create the CUDA node when the nvidia_uvm CUDA module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
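After the next reboot you can verify the whole chain at once: the modules load, the udev rules fire, and the device nodes show up with read/write permissions for everyone:

$ ls -l /dev/nvidia*
$ nvidia-smi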
