
Journey to Deep Learning: CUDA GPU passthrough to an LXC container

Reboot the server.

Open a new command line and run:

$ nvidia-smi

It should report your GPU and create the device nodes /dev/nvidia0 and /dev/nvidiactl.

Note the major number 195 for those nodes; we will need to grant the container read-write access to them through the cgroup device rules.
(Remember what I said about nogroup:nobody being nonsense; it is just a cgroup parameter issue.)
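
If you prefer to confirm the major number without reading it off the ls output, the kernel also lists it in /proc/devices (the exact entry names vary by driver version, but the 195 line should be there):

$ grep nvidia /proc/devices
195 nvidia-frontend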

Lastly, we need the cgroup device number for CUDA; we can get it by loading the nvidia-uvm module:

$ modprobe nvidia-uvm
$ ls /dev/nvidia* -l
crw-rw-rw- 1 root root 243, 0 Jan 16 02:20 /dev/nvidia-uvm
crw-rw-rw- 1 root root 195, 0 Jan 16 02:16 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jan 16 02:16 /dev/nvidiactl

The cgroup major number here is 243.
Please note that it occasionally changed after I installed some packages.
If CUDA stops working at some point, that is the first place to check.
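
Since that number can move, it may be worth loading the module at boot and re-checking it after updates; a minimal sketch, assuming a systemd-based host such as Proxmox:

$ echo nvidia-uvm > /etc/modules-load.d/nvidia-uvm.conf
$ grep nvidia-uvm /proc/devices
243 nvidia-uvm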

Last step – configuring the LXC container

Your container config will be in /etc/pve/lxc. If your container ID is 100, open the 100.conf file.

Add the lines that appear after the GPU Passthrough comment in the example below. They bind-mount the card's device nodes from the host into the container and grant access to them through the cgroup rules.

# Deep Learning Container (CUDA, cuDNN, OpenCL support)

arch: amd64
cpulimit: 8
cpuunits: 1024
hostname: MachineLearning
memory: 16384
net0: bridge=vmbr0,gw=192.168.1.1,hwaddr=36:39:64:66:36:66,ip=192.168.1.200/24,name=eth0,type=veth
onboot: 0
ostype: archlinux
rootfs: local-lvm:vm-400-disk-1,size=192G
swap: 16384
unprivileged: 1

# GPU Passthrough config
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 243:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

Restart your container.
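
On Proxmox you can do this from the host with the pct tool; a quick sketch, assuming container ID 100:

$ pct stop 100 && pct start 100
$ pct exec 100 -- sh -c 'ls -l /dev/nvidia*'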

You should now have the following on the host:

$ ls /dev/nvidia* -l
crw-rw-rw- 1 root root 243, 0 Jan 16 21:05 /dev/nvidia-uvm
crw-rw-rw- 1 root root 195, 0 Jan 16 21:05 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jan 16 21:05 /dev/nvidiactl

And on the container :

$ ls /dev/nvidia* -l
-rw-r--r-- 1 root root 0 16.01.2017 20:11 /dev/nvidia-modeset
crw-rw-rw- 1 nobody nobody 243, 0 16.01.2017 20:05 /dev/nvidia-uvm
-rw-r--r-- 1 root root 0 16.01.2017 20:11 /dev/nvidia-uvm-tools
crw-rw-rw- 1 nobody nobody 195, 0 16.01.2017 20:05 /dev/nvidia0
crw-rw-rw- 1 nobody nobody 195, 255 16.01.2017 20:05 /dev/nvidiactl

Install the following on your container:
– CUDA
– cuDNN
– nvidia-smi / nvidia-utils
– cnmem
You do not need to install the NVIDIA driver unless your distro forces you to.
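
On an Arch container, for example, this comes down to a single pacman call; a sketch only, since some of these packages (libcudnn and cnmem in particular) may live in the AUR or have been renamed since this was written:

$ pacman -S cuda libcudnn nvidia-utils opencl-nvidia cnmem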

Here are the NVIDIA / CUDA packages on my Arch container:

$ pacman -Qs nvidia
local/cuda 8.0.44-3
NVIDIA’s GPU programming toolkit
local/libcudnn 5.1.5-1
NVIDIA CUDA Deep Neural Network library
local/libvdpau 1.1.1-2
Nvidia VDPAU library
local/libxnvctrl 375.26-1
NVIDIA NV-CONTROL X extension
local/nvidia-settings 375.26-1
Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 375.26-2
NVIDIA drivers utilities
local/opencl-nvidia 375.26-2
OpenCL implemention for NVIDIA
local/pycuda-headers 2016.1.2-6
Python wrapper for Nvidia CUDA
local/python-pycuda 2016.1.2-6
Python wrapper for Nvidia CUDA
$ pacman -Qs cuda
local/cnmem 1.0.0-1
A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory
local/cuda 8.0.44-3
NVIDIA’s GPU programming toolkit
local/libcudnn 5.1.5-1
NVIDIA CUDA Deep Neural Network library
local/pycuda-headers 2016.1.2-6
Python wrapper for Nvidia CUDA
local/python-pycuda 2016.1.2-6
Python wrapper for Nvidia CUDA

You can now check with nvidia-smi that the GPU is properly passed through.
nvidia-smi / nvidia-utils must match the version of the host NVIDIA driver (375.26 in my case).

I recommend blocking updates of the NVIDIA/CUDA related packages on both the host and the container to keep them in sync.
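
On the Arch container, pacman's IgnorePkg option is one simple way to do that; a sketch, to be adapted to the packages you actually installed:

# in the [options] section of /etc/pacman.conf
IgnorePkg = nvidia-utils opencl-nvidia cuda libcudnn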

Now you can configure your environment (Numpy, Pandas, Openblas, MKL, Scikit-learn, Jupyter, Theano, Keras, Lasagne, Torch, TensorFlow, YouNameIt).
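
As a rough sketch of the base stack on Arch (package names as of the time of writing; several of the frameworks may only be available through pip or the AUR):

$ pacman -S python-numpy python-pandas openblas python-scikit-learn jupyter-notebook
$ pip install theano keras lasagne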

For example, using the GPU test script from the Theano documentation, you can confirm GPU acceleration with cuDNN:

$ python gpu_test.py
Using gpu device 0: GeForce GTX 1070 (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5105)
/usr/lib/python3.6/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
warnings.warn(warn)
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.294226 seconds
Result is [ 1.23178029 1.61879349 1.52278066 …, 2.20771813 2.29967761
1.62323296]
Used the gpu
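
For reference, the device, float type and CNMeM pool shown in that output can be selected per run through THEANO_FLAGS (or persisted in ~/.theanorc); a sketch matching the settings above, assuming the old device=gpu CUDA backend used here:

$ THEANO_FLAGS='device=gpu0,floatX=float32,lib.cnmem=0.95' python gpu_test.py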

 

Now you're set! Happy deep learning.
