NVIDIA* CUDA Toolkit

NVIDIA is a manufacturer of graphics processing units (GPUs), also known as graphics cards.

CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit for general-purpose processing.

These instructions show how to install the CUDA Toolkit on Clear Linux OS after the proprietary NVIDIA drivers have been installed.

Note

Software installed outside of swupd is not updated with Clear Linux OS updates and must be updated and maintained manually.

Prerequisites

Compatibility

Check compatibility of NVIDIA components

To install the appropriate NVIDIA CUDA Toolkit version, it is important to understand the compute capability and compatible driver versions of your NVIDIA hardware.

Information about NVIDIA compute capability, driver, and toolkit compatibility can be found at: https://developer.nvidia.com/cuda-gpus and https://docs.nvidia.com/deploy/cuda-compatibility/
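
As a rough illustration of the driver side of this check, the sketch below compares the installed driver version with a minimum version. The threshold 418.39 is only an example value and should be replaced with the figure from NVIDIA's compatibility table for your toolkit release. The helper name version_ge is introduced here for illustration, and the comparison assumes GNU sort with the -V option.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example threshold only; take the real value from the compatibility table.
required_driver="418.39"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the driver for its version; use the first GPU's answer.
    installed_driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed_driver" "$required_driver"; then
        echo "driver $installed_driver satisfies >= $required_driver"
    else
        echo "driver $installed_driver is older than $required_driver"
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

The compute capability of the GPU still has to be looked up by model name on the pages linked above.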

Check GCC compatibility

Note

This is only required for the development or compilation of CUDA applications. It is not required to run pre-built applications that have a dependency on CUDA.

From the NVIDIA documentation:

The CUDA development environment relies on tight integration with the host development environment, including the host compiler and C runtime libraries, and is therefore only supported on distribution versions that have been qualified for this CUDA Toolkit release.

Refer to the NVIDIA documentation on CUDA system requirements for the latest kernel and compiler compatibility.

For example, CUDA 10.1 on a system with the latest Linux kernel requires GCC 7, which is older than the default GCC version for Clear Linux OS.

Install the compatible version of GCC, if required:

  1. Install the bundle with the appropriate GCC version.

    sudo swupd bundle-add c-extras-gcc7
    
  2. Create the directory /usr/local/cuda/bin:

    sudo mkdir -p /usr/local/cuda/bin
    
  3. Add symlinks to the older GCC version in the /usr/local/cuda/bin directory. This will cause the older version of GCC to be used when /usr/local/cuda/bin is in the $PATH environment variable.

    sudo ln -s /usr/bin/gcc7 /usr/local/cuda/bin/gcc
    sudo ln -s /usr/bin/g++7 /usr/local/cuda/bin/g++
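
To confirm which GCC versions are in play, a sketch along the following lines reports the major version of the default compiler and of the symlinked one. The limit of 7 mirrors the CUDA 10.1 example above and is a placeholder for whatever NVIDIA's system requirements state. gcc_major is a helper name introduced here.

```shell
# gcc_major VERSION: print the major component of a GCC version string.
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Placeholder limit; take the real value from NVIDIA's system requirements.
max_gcc=7

for compiler in gcc /usr/local/cuda/bin/gcc; do
    if ! command -v "$compiler" >/dev/null 2>&1; then
        echo "$compiler: not found"
        continue
    fi
    major="$(gcc_major "$("$compiler" -dumpversion)")"
    case "$major" in
        ''|*[!0-9]*)
            echo "$compiler: could not parse version"
            continue
            ;;
    esac
    if [ "$major" -le "$max_gcc" ]; then
        echo "$compiler: GCC $major is within the limit ($max_gcc)"
    else
        echo "$compiler: GCC $major is newer than the limit ($max_gcc)"
    fi
done
```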
    

Downloading and Installation

Download the NVIDIA CUDA Toolkit

  1. Go to the NVIDIA CUDA downloads website to get the latest CUDA Toolkit. If an older version of the CUDA Toolkit is required, go to the CUDA Toolkit Archive.

    Choose the following settings and click Download.

    • Operating System: Linux
    • Architecture: x86_64
    • Distribution: any
    • Version: any
    • Installer Type: runfile (local)
  2. Open a terminal and navigate to where the cuda_<VERSION>_linux.run file was saved. In this example, it was saved in the Downloads folder.

    cd ~/Downloads/
    
  3. Make the cuda_<VERSION>_linux.run file executable:

    chmod +x cuda_<VERSION>_linux.run
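
Before running the installer, it can be worth checking the download against the MD5 checksum that NVIDIA publishes alongside the installers. The following is a minimal sketch. verify_md5 is a helper name introduced here, and the file name and checksum in the usage comment are placeholders.

```shell
# verify_md5 FILE EXPECTED: compare the MD5 digest of FILE with EXPECTED.
verify_md5() {
    actual="$(md5sum "$1" | cut -d ' ' -f 1)"
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)"
        return 1
    fi
}

# Usage (placeholders, not real values):
#   verify_md5 ~/Downloads/cuda_<VERSION>_linux.run <MD5-from-NVIDIA>
```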
    

Install the NVIDIA CUDA Toolkit

The NVIDIA CUDA installer will be directed to install files under /opt/cuda as much as possible to keep its contents isolated from the rest of the Clear Linux OS files under /usr.

The CUDA installer automatically creates a symbolic link that allows the CUDA Toolkit to be accessed from /usr/local/cuda regardless of where it was installed.

  1. Configure the dynamic linker to look for and cache shared libraries under /opt/cuda/lib64 where the NVIDIA installer will place libraries.

    sudo mkdir -p /etc/ld.so.conf.d
    grep -qxF "include /etc/ld.so.conf.d/*.conf" /etc/ld.so.conf 2>/dev/null || \
        echo "include /etc/ld.so.conf.d/*.conf" | sudo tee --append /etc/ld.so.conf
    

    The CUDA installer will automatically create a file /etc/ld.so.conf.d/cuda-<VERSION>.conf

  2. Navigate into the directory where the NVIDIA installer was downloaded:

    cd ~/Downloads/
    
  3. Run the installer with the advanced options below:

    sudo ./cuda_<VERSION>_linux.run \
    --toolkit \
    --samples \
    --installpath=/opt/cuda \
    --no-man-page \
    --override \
    --silent
    
  4. Verify that the CUDA Toolkit was installed by checking the NVIDIA CUDA compiler version:

    /usr/local/cuda/bin/nvcc --version
    

The CUDA Toolkit is now installed and can be used to compile and run CUDA applications.
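
Beyond the nvcc version check, the result of the steps above can be inspected with a short sketch like the one below. It shows where the /usr/local/cuda link points and whether the dynamic linker cache knows about the CUDA runtime library. resolve_link is a helper name introduced here, and the sketch assumes GNU readlink.

```shell
# resolve_link PATH: print the real location behind a symlink (GNU readlink -f).
resolve_link() {
    readlink -f "$1"
}

cuda_link="/usr/local/cuda"
if [ -e "$cuda_link" ]; then
    echo "$cuda_link resolves to $(resolve_link "$cuda_link")"
else
    echo "$cuda_link does not exist; the installer may not have completed"
fi

# ldconfig -p prints the dynamic linker cache; look for the CUDA runtime library.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | grep libcudart || echo "libcudart is not in the linker cache"
else
    echo "ldconfig not found in PATH"
fi
```

If libcudart is missing from the cache even though the library files exist, running sudo ldconfig rebuilds the cache.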

Using the NVIDIA CUDA Toolkit

  1. Verify that the NVIDIA character device files /dev/nvidia* exist and have the correct (0666) file permissions. The character devices are created automatically on systems where the NVIDIA driver is loaded through the X server, but not on systems that do not load the NVIDIA driver automatically.

    ls -l /dev/nvidia*
    
  2. If your system does not create the NVIDIA character devices automatically, run the script from the NVIDIA documentation with root privileges.

    Alternatively, a setuid utility, nvidia-modprobe, can be compiled and installed to create the character device files automatically on demand.

    wget https://download.nvidia.com/XFree86/nvidia-modprobe/nvidia-modprobe-<VERSION>.tar.bz2
    tar -xvf nvidia-modprobe-<VERSION>.tar.bz2
    cd nvidia-modprobe-<VERSION>/
    make
    sudo make install PREFIX=/usr/local/cuda/
    
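  3. When the CUDA Toolkit is needed, export the PATH and LD_LIBRARY_PATH variables so that they point to the CUDA directories. This temporarily adds the CUDA binaries and libraries to the search paths for the current terminal session and, if symlinks to an older GCC version were created in /usr/local/cuda/bin, makes that GCC version take precedence.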

    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
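
If the exports are repeated in the same session, the CUDA directories pile up in the variables. The sketch below is one way to avoid that, and it also reports which gcc and nvcc the shell resolves afterwards. prepend_once is a helper name introduced here, and the sketch assumes a POSIX-style shell such as bash.

```shell
# prepend_once DIR LIST: print the colon-separated LIST with DIR moved to the
# front, dropping empty entries and any earlier copies of DIR.
prepend_once() {
    po_dir="$1"
    po_rest=""
    po_ifs="$IFS"
    IFS=":"
    for po_entry in $2; do
        if [ -n "$po_entry" ] && [ "$po_entry" != "$po_dir" ]; then
            po_rest="${po_rest:+$po_rest:}$po_entry"
        fi
    done
    IFS="$po_ifs"
    printf '%s\n' "$po_dir${po_rest:+:$po_rest}"
}

PATH="$(prepend_once /usr/local/cuda/bin "$PATH")"
export PATH
LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH

# Show which compilers the shell now picks up.
echo "gcc resolves to: $(command -v gcc || echo 'not found')"
echo "nvcc resolves to: $(command -v nvcc || echo 'not found')"
```

Because the helper removes earlier copies before prepending, running it several times leaves a single CUDA entry at the front of each variable.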
    

The source code for the CUDA samples is located at /usr/local/cuda/NVIDIA_CUDA-<VERSION>_Samples. See the CUDA documentation on compiling samples to learn more.
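
As a hedged sketch of a first build, the following copies one sample into a writable scratch directory and compiles it. It assumes the samples follow the CUDA 10.x layout, with a common directory and the deviceQuery sample under 1_Utilities, and that the sample Makefile honors the CUDA_PATH variable. Adjust the paths if your toolkit version differs. find_samples_dir is a helper name introduced here.

```shell
# find_samples_dir ROOT: print the first NVIDIA_CUDA-*_Samples directory under ROOT.
find_samples_dir() {
    for candidate in "$1"/NVIDIA_CUDA-*_Samples; do
        if [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if samples_dir="$(find_samples_dir /usr/local/cuda)"; then
    # Build in a scratch copy because the installed samples are owned by root.
    work_dir="$(mktemp -d)"
    mkdir -p "$work_dir/1_Utilities"
    cp -r "$samples_dir/common" "$work_dir/"
    cp -r "$samples_dir/1_Utilities/deviceQuery" "$work_dir/1_Utilities/"
    # Assumed layout: the Makefile leaves the deviceQuery binary in its own directory.
    make -C "$work_dir/1_Utilities/deviceQuery" CUDA_PATH=/usr/local/cuda \
        && "$work_dir/1_Utilities/deviceQuery/deviceQuery"
else
    echo "no samples directory found under /usr/local/cuda"
fi
```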

Uninstalling

The NVIDIA CUDA Toolkit can be uninstalled as follows:

  1. Run sudo /usr/local/cuda/bin/cuda-uninstaller.
  2. Follow the prompts on the screen and reboot the system.

Debugging

  • The NVIDIA CUDA installer writes its log to /tmp/cuda-installer.log.
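
When an installation fails silently, a quick scan of that log is often the fastest starting point. The sketch below prints lines that mention errors or failures, with line numbers. scan_log is a helper name introduced here, and the search pattern is a guess at useful keywords, not a documented log format.

```shell
# scan_log FILE: print lines mentioning errors or failures, with line numbers.
scan_log() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1"
        return 1
    fi
    grep -inE 'error|fail' "$1" || echo "no error or failure lines in $1"
}

scan_log /tmp/cuda-installer.log || true
```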