Software Setup

This episode describes how to set up the programming environment for the CUDA exercises, including connecting to the training cluster, loading the necessary modules, and compiling and running GPU programs.

Objectives

  • Connect to the training cluster and set up the CUDA environment

  • Know how to compile CUDA C/C++ and CUDA Fortran programs

  • Know how to start and manage interactive GPU jobs

Instructor note

  • 15 min teaching

  • 15 min hands-on setup

Connecting to the training cluster

Connect to the cluster via SSH, replacing sca12345 with your assigned username:

$ ssh sca12345@training.hlrs.de

You should end up at the login node. Your terminal prompt will look similar to:

sca12345@cl7fr1:~$
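Optionally, you can add a host alias to your SSH client configuration so a short name suffices for later connections. A minimal sketch (the alias name cuda-training is an arbitrary choice; replace the User value with your assigned username):

```shell
# ~/.ssh/config -- optional convenience entry (alias name is hypothetical)
Host cuda-training
    HostName training.hlrs.de
    User sca12345
```

With this entry in place, ssh cuda-training is equivalent to the full command above.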

Initial setup

On the cluster, initialise your account once by sourcing the setup script:

$ source /shared/akad-cuda/cuda_setup

This copies the exercise files to your home directory and sets up the environment.

Starting an interactive GPU job

Start an interactive job that uses one GPU. Run only one job at a time; if you already have a job running, the system may refuse to start another and print an error.

$ ~/cuda_job

Once your job starts, load the compiler module:

$ module load compiler/nvidia

You are now ready to compile and run GPU programs. Do not submit a new job (i.e., do not run ~/cuda_job again) while your current job is still active.
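To confirm that the job actually has a GPU attached and the compiler is available, you can run two quick checks (the devices listed depend on the cluster's hardware):

```shell
$ nvidia-smi       # lists the visible GPU(s), driver and CUDA versions
$ nvcc --version   # confirms the compiler module is loaded
```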

Compiling CUDA programs

Compile CUDA C/C++ source files with nvcc:
$ nvcc -o program program.cu
$ nvcc -O3 -o program program.cu          # with optimisation
$ nvcc -g -G -o program program.cu        # with debug symbols
$ nvcc -lineinfo -o program program.cu    # for profiling
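To check the toolchain end to end, you can compile and run a minimal kernel. A sketch (the file name hello.cu and the launch configuration are arbitrary, not part of the course exercises):

```shell
# Write a minimal CUDA C program that launches 8 threads.
cat > hello.cu <<'EOF'
#include <cstdio>

__global__ void hello() {
    // Each thread prints its global index.
    printf("Hello from thread %d\n", blockIdx.x * blockDim.x + threadIdx.x);
}

int main() {
    hello<<<2, 4>>>();          // 2 blocks of 4 threads
    cudaDeviceSynchronize();    // wait for the kernel (and flush printf output)
    return 0;
}
EOF
nvcc -o hello hello.cu && ./hello
```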

To link CUDA libraries (e.g., cuBLAS):

$ nvcc -lcublas -o program program.cu
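The keypoints below mention nvfortran for CUDA Fortran. Assuming a CUDA Fortran source file (conventionally with a .cuf extension), compile lines analogous to the nvcc examples would be:

```shell
$ nvfortran -o program program.cuf        # CUDA Fortran
$ nvfortran -O3 -o program program.cuf    # with optimisation
```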

Managing jobs

To check your running or queued jobs:

$ qstat
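On a busy system, plain qstat may also list other users' jobs. Most PBS variants let you restrict the listing to your own jobs:

```shell
$ qstat -u $USER   # show only your own jobs
```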

To stop/delete a job (replace the job ID with yours):

$ qdel 23794.cl7intern

When you are finished with your work, stop your interactive job:

$ exit

Non-interactive (batch) jobs

Some exercises (e.g., multi-GPU) require a full GPU node. For these, submit a batch job:

$ qsub job-multigpu.pbs

Output will be written to multi-gpu-job.o* (stdout) and multi-gpu-job.e* (stderr) after the job finishes.
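The provided job-multigpu.pbs is part of the course material, so its exact contents may differ, but a PBS batch script generally has this shape (all directives here are illustrative assumptions, except the job name, which matches the output file names above):

```shell
#!/bin/bash
#PBS -N multi-gpu-job          # job name; yields multi-gpu-job.o* / multi-gpu-job.e*
#PBS -l walltime=00:10:00      # assumed time limit; the real script may differ
#PBS -l select=1               # assumed resource request (one full node)

module load compiler/nvidia    # same module as in interactive jobs
cd "$PBS_O_WORKDIR"            # run from the directory the job was submitted in
./program                      # placeholder for the multi-GPU exercise binary
```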

Keypoints

  • Connect via SSH and run source /shared/akad-cuda/cuda_setup once to initialise

  • Start a GPU job with ~/cuda_job and load compiler/nvidia before compiling

  • Use nvcc for C/C++ CUDA code and nvfortran for CUDA Fortran code

  • Exit your interactive job with exit when finished