Installing the NVIDIA driver on your DataCrunch.io server

If you need only the driver without CUDA, you can follow these steps.

Step 1:

Here we will be installing driver version 450.51.06 (the release that accompanies CUDA 11.0) on Ubuntu 20.04. We start by downloading the installer:

wget https://us.download.nvidia.com/tesla/450.51.06/NVIDIA-Linux-x86_64-450.51.06.run
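The version number appears in every command of this guide, so it can help to keep it in one variable. The following is a minimal sketch, not part of the official procedure: the helper name driver_url and the variables DRIVER_VERSION and DRIVER_FILE are hypothetical, and it assumes that other driver versions follow the same URL layout as the one above.

```shell
# Hypothetical helper: build the download URL for a given driver version.
# Assumption: other versions use the same path layout as the URL in this guide.
driver_url() {
    version=$1
    echo "https://us.download.nvidia.com/tesla/${version}/NVIDIA-Linux-x86_64-${version}.run"
}

DRIVER_VERSION=450.51.06    # the version used in this guide
DRIVER_FILE="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run"

echo "URL:  $(driver_url "$DRIVER_VERSION")"
echo "File: $DRIVER_FILE"
# To download: wget "$(driver_url "$DRIVER_VERSION")"
```

With these set, the later steps could refer to "$DRIVER_FILE" instead of repeating the full file name.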

Before installing, we will need to install some dependencies:

sudo apt update

sudo apt install build-essential gcc-multilib dkms

Step 2a:

Next, we make the file executable and run it:

sudo chmod +x NVIDIA-Linux-x86_64-450.51.06.run

sudo ./NVIDIA-Linux-x86_64-450.51.06.run

Follow the instructions given by the installer. You can choose to use DKMS. The 32-bit compatibility files are not needed.

If the installer stops with an error about the Nouveau driver and shows the following prompt:

"For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory.  Would you like nvidia-installer to attempt to create this modprobe file for you?"

Select 'Yes' and continue with step 2b. If the installer finished without this error, proceed with step 3.

Step 2b: (only needed if the installer failed)

The installer fails if the nouveau driver is active. This is expected and easy to fix: blacklist nouveau, rebuild the initramfs and reboot:

sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

sudo update-initramfs -u

sudo reboot
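Once the server is back up, it may be worth confirming that nouveau is really gone before starting the installer again. A minimal sketch, assuming lsmod is available on the server; the function name nouveau_loaded is hypothetical and simply scans lsmod-style output for a module named exactly "nouveau".

```shell
# Hypothetical helper: reads `lsmod`-style output on stdin and
# succeeds (exit status 0) only if a module named exactly "nouveau" is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on the server (skipped quietly if lsmod is not installed)
if command -v lsmod >/dev/null 2>&1 && lsmod | nouveau_loaded; then
    echo "nouveau is still loaded; check the blacklist file and the initramfs"
else
    echo "nouveau is not loaded"
fi
```

If nouveau still shows up, the blacklist file or the update-initramfs step most likely did not take effect.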

After rebooting we start the installer again:

sudo ./NVIDIA-Linux-x86_64-450.51.06.run

As before, you can choose to use DKMS. The 32-bit compatibility files are not needed.

Step 3:

After installation, check the output of “nvidia-smi”. You should see your GPUs, the driver version and the CUDA version. If everything looks good, we will modify our startup script:

sudo nano /etc/rc.local

Paste the following:

#!/bin/bash
nvidia-smi -pm 1
nvidia-smi -e 0
exit 0

If you are wondering what the script does:

“#!/bin/bash”: Required to tell the system to run the script with bash. (This is not an ordinary comment, and the line is not optional.)

“nvidia-smi -pm 1”: This enables persistence mode, which keeps the driver loaded even when no application is using the GPU, so GPU programs start faster.

“nvidia-smi -e 0”: This disables ECC (error-correcting code) on the GPU memory. This is safe for most applications and makes more GPU memory available.

“exit 0”: Ends the script with exit status 0, which signals success.

Let’s make the file executable and run it once (the ECC change is applied at the next reboot):

sudo chmod +x /etc/rc.local

sudo /etc/rc.local
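Because a missing shebang line or a missing executable bit can keep the script from running at boot, a quick sanity check may be useful. This is a minimal sketch under those assumptions; the function name rc_local_ok is hypothetical, and it only checks the two requirements described above, not the nvidia-smi lines themselves.

```shell
# Hypothetical helper: check that a startup script exists, is executable
# and starts with the bash shebang line described in this guide.
rc_local_ok() {
    f=$1
    [ -f "$f" ] || { echo "missing: $f"; return 1; }
    [ -x "$f" ] || { echo "not executable: $f"; return 1; }
    first_line=$(head -n 1 "$f")
    [ "$first_line" = '#!/bin/bash' ] || { echo "first line is not #!/bin/bash: $f"; return 1; }
    echo "ok: $f"
}

# Usage: report on /etc/rc.local without aborting if the check fails
rc_local_ok /etc/rc.local || true
```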

And that’s it, you are ready to use your GPUs! You can confirm the status of persistence mode and ECC by running 'nvidia-smi'.
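For a more compact view than the full nvidia-smi table, the query interface can print just the relevant fields. The following is a minimal sketch: the function name summarize_gpu_modes is hypothetical, and it assumes the comma-separated layout that nvidia-smi produces with --format=csv,noheader. It also shows the pending ECC mode when it differs from the current one, which is the case until the next reboot.

```shell
# Hypothetical helper: reads lines of the form
#   name, persistence_mode, ecc_current, ecc_pending
# and prints a one-line summary per GPU.
summarize_gpu_modes() {
    awk -F', *' '{
        pm  = ($2 == "Enabled") ? "persistence on" : "persistence off"
        ecc = ($3 == $4) ? ("ECC " $3) : ("ECC " $3 " (" $4 " after reboot)")
        print $1 ": " pm ", " ecc
    }'
}

# Usage on the server (skipped quietly if nvidia-smi is not installed)
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,persistence_mode,ecc.mode.current,ecc.mode.pending \
               --format=csv,noheader | summarize_gpu_modes
fi
```

On GPUs without ECC support, nvidia-smi reports a placeholder value in the ECC fields, and the summary passes it through unchanged.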
