Attention

The Bessemer HPC service was decommissioned on 2025-10-31 and can no longer be accessed by users. Removal of Bessemer references in our documentation is ongoing.

Using GPUs on Stanage

There are three types of GPU node in Stanage, which differ in GPU architecture (NVIDIA A100, H100, and H100 NVL), the number of GPUs per node, and GPU interconnect technology, including bandwidth (see Stanage hardware specifications). At present you need to decide which node type to target when submitting a batch job or starting an interactive session on a worker node.

Before proceeding, ensure you’ve worked through our introductory GPU tutorial.

Interactive use of the GPUs

Note

See requesting an interactive session on Slurm if you’re not already familiar with the concept.

Attention

Interactive use of GPUs is strongly discouraged, as they are a valuable and limited resource. Please use interactive GPU sessions only for short debugging, essential visualisation, or compiling GPU-enabled software. All other GPU workloads must be submitted as batch jobs.

To start an interactive session with access to one GPU on a GPU node (Stanage hardware specifications):

srun --partition=gpu --qos=gpu --gres=gpu:1 --mem=82G --pty bash

Note: you can now request GPUs using --gpus=N on Stanage (as an alternative to --gres=gpu:N), following a recent Slurm upgrade.

Interactive sessions default to just 4016 MB of CPU RAM, which is far less than the GPU RAM available: 80 GB on each NVIDIA A100 or H100 GPU, and 94 GB on each NVIDIA H100 NVL. This mismatch can cause problems, for instance failing to transfer data between CPU and GPU due to insufficient CPU-side memory.

The example above deliberately requests slightly more CPU memory (82G) than the total memory of the requested GPU (80 GB).
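The memory guidance above can be turned into a small calculation. The following is a minimal sketch, not part of Stanage or Slurm: `suggest_mem` is a hypothetical helper name, the per-GPU figures are taken from the text above, and the 2 GB of headroom per GPU is an assumption that simply mirrors the 82G request for one 80 GB GPU.

```shell
#!/bin/bash
# Hypothetical helper: suggest a --mem value slightly above the total GPU memory.
# Per-GPU memory (from the text above): 80 GB for A100/H100, 94 GB for H100 NVL.
# Assumption: 2 GB of headroom per GPU, mirroring the 82G example for one 80 GB GPU.
suggest_mem() {
    local gpu_type="$1"
    local n_gpus="$2"
    local per_gpu_gb
    case "$gpu_type" in
        a100|h100) per_gpu_gb=80 ;;
        h100-nvl)  per_gpu_gb=94 ;;
        *) echo "unknown GPU type: $gpu_type" >&2; return 1 ;;
    esac
    echo "$(( n_gpus * (per_gpu_gb + 2) ))G"
}

suggest_mem a100 1       # prints 82G
suggest_mem h100-nvl 2   # prints 192G
```

The result is only a starting point: check it against the total RAM of the target node type (Stanage hardware specifications) before using it in a request.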

Please also carefully consider your --cpus-per-task and --time requests: shorter sessions tend to start sooner.

Warning

  • To use the H100 GPUs, set the --partition=gpu-h100 and --gres=gpu:1 arguments in your submission scripts.

  • To use the H100 NVL GPUs, set the --partition=gpu-h100-nvl and --gres=gpu:1 arguments in your submission scripts.

This ensures that use of these GPUs is “opt in”: their architecture differs slightly from that of the existing A100 GPUs, which may necessitate changes to batch submission scripts and selected software versions.
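To keep the node-type choice in one place, the flags named in this section can be looked up from a short label. This is a sketch only: `gpu_flags` and the labels `a100`, `h100` and `h100-nvl` are hypothetical names, and it prints only the flags this page states for each node type. Whether --qos=gpu is also needed on the H100 partitions is not stated here, so treat its absence below as an open question rather than guidance.

```shell
#!/bin/bash
# Hypothetical helper: print the GPU-related flags this page names for one GPU
# on each node type. Labels (a100, h100, h100-nvl) are illustrative only.
gpu_flags() {
    case "$1" in
        a100)     echo "--partition=gpu --qos=gpu --gres=gpu:1" ;;
        h100)     echo "--partition=gpu-h100 --gres=gpu:1" ;;
        h100-nvl) echo "--partition=gpu-h100-nvl --gres=gpu:1" ;;
        *) echo "unknown GPU type: $1" >&2; return 1 ;;
    esac
}

# Intended use (left unquoted on purpose so the flags split into separate words):
# srun $(gpu_flags h100) --mem=82G --pty bash
gpu_flags h100
```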

Submitting GPU batch jobs

Note

See submitting jobs on Slurm if you’re not already familiar with the concept.
Each user can use at most 12 GPUs concurrently (A100 + H100 combined); further jobs will wait until prior ones release GPUs. (Reduced from 16 in Aug 2025.)

To run batch jobs on GPU nodes, ensure your job submission script includes a request for GPUs, e.g. for two GPUs use --gres=gpu:2:

#!/bin/bash
#SBATCH --partition=gpu    # A100 GPU nodes
#SBATCH --qos=gpu
#SBATCH --gres=gpu:2       # 2 GPUs for job
#SBATCH --mem=82G          # CPU RAM; see the memory guidance above when using several GPUs

# Your code below...
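It can be useful to confirm, at the top of the job's own commands, how many GPUs the job can actually see. The sketch below assumes Slurm exposes the allocated GPUs through the CUDA_VISIBLE_DEVICES environment variable, which is its usual behaviour for GPU resources but is not confirmed on this page for Stanage; `count_visible_gpus` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: count the GPUs listed in CUDA_VISIBLE_DEVICES.
# Assumption: Slurm sets this variable to a comma-separated list of device indices.
count_visible_gpus() {
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo 0
    else
        # One line per list entry, then count the lines.
        echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l | tr -d ' '
    fi
}

echo "GPUs visible to this job: $(count_visible_gpus)"
```

If the printed count does not match the --gres request, the job is probably not running where or how you intended.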

Requesting GPUs and multiple CPU cores from the scheduler

To request four Slurm tasks within a job, each with eight CPU cores, and four (A100) GPUs available to the job as a whole (shared between the tasks):

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --qos=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:4       # 4 GPUs for job

Note that:

  • The GPUs are (unintuitively) shared between the Slurm tasks.

  • It’s not possible to request --gpus-per-node, --gpus-per-task or --gpus-per-socket on Stanage at this time (unlike on Bessemer).

  • Not all nodes have four GPUs (Stanage hardware specifications).
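Because the GPUs are shared between tasks, each task may need to pick its own device. One common approach, sketched here under assumptions, is to select an entry from the job-wide GPU list using the task's node-local index. `gpu_for_task` is a hypothetical helper; the sketch assumes Slurm has set CUDA_VISIBLE_DEVICES and SLURM_LOCALID inside each task, and it has not been verified against Stanage's configuration.

```shell
#!/bin/bash
# Hypothetical helper: choose one GPU for a task from the job-wide list.
# $1: comma-separated GPU list (e.g. the job's CUDA_VISIBLE_DEVICES)
# $2: zero-based node-local task index (e.g. SLURM_LOCALID)
gpu_for_task() {
    local gpu_list="$1"
    local task_id="$2"
    local n_gpus field
    n_gpus=$(echo "$gpu_list" | tr ',' '\n' | wc -l | tr -d ' ')
    # cut fields are 1-based, so shift the zero-based index by one;
    # the modulo wraps around when there are more tasks than GPUs.
    field=$(( task_id % n_gpus + 1 ))
    echo "$gpu_list" | cut -d, -f"$field"
}

# Intended use inside each task launched by srun (assumes both variables are set):
# export CUDA_VISIBLE_DEVICES=$(gpu_for_task "$CUDA_VISIBLE_DEVICES" "$SLURM_LOCALID")
gpu_for_task 0,1,2,3 2   # prints 2
```

With four tasks and four GPUs this gives each task a distinct device; with more tasks than GPUs, several tasks end up on the same device.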

Architecture differences

CPU Architecture

Stanage GPU nodes contain a mixture of CPU architectures. The A100 and H100 GPU nodes use AMD-based CPUs while the H100 NVL GPU nodes use Intel-based CPUs. In some cases your software may need to be re-compiled depending on which GPU node it is run on.
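One way to handle the mixed CPU architectures is to keep a separate build directory per CPU vendor and choose between them at run time. The following is a minimal sketch: `cpu_tag` and the `build-...` directory naming are hypothetical, and reading `vendor_id` from /proc/cpuinfo assumes a Linux x86 node.

```shell
#!/bin/bash
# Hypothetical helper: turn a CPU vendor_id string into a short tag.
cpu_tag() {
    case "$1" in
        AuthenticAMD) echo "amd" ;;
        GenuineIntel) echo "intel" ;;
        *)            echo "unknown" ;;
    esac
}

# Read the vendor string of the current node; left empty if it cannot be read.
vendor=$(awk -F': ' '/^vendor_id/ {print $2; exit}' /proc/cpuinfo 2>/dev/null || true)

# Hypothetical directory naming, e.g. build-amd or build-intel.
echo "build-$(cpu_tag "$vendor")"
```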

GPU Architecture

The A100 GPUs are based on the Ampere architecture (sm_80), while the H100 and H100 NVL GPUs are based on the slightly newer Hopper architecture (sm_90). In some cases your software may need to be re-compiled depending on which GPU it is run on.
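When compiling CUDA code yourself, the target architecture is usually selected with nvcc's -gencode option. The sketch below maps the GPU types on this page to those flags: `nvcc_gencode` and the file names are hypothetical, and it assumes a CUDA toolkit recent enough to support Hopper targets.

```shell
#!/bin/bash
# Hypothetical helper: map a GPU type to the matching nvcc -gencode flag.
# Ampere (A100) is compute capability 8.0; Hopper (H100, H100 NVL) is 9.0.
nvcc_gencode() {
    case "$1" in
        a100)          echo "-gencode arch=compute_80,code=sm_80" ;;
        h100|h100-nvl) echo "-gencode arch=compute_90,code=sm_90" ;;
        *) echo "unknown GPU type: $1" >&2; return 1 ;;
    esac
}

# A single binary covering both architectures (my_kernel.cu is a placeholder name):
# nvcc $(nvcc_gencode a100) $(nvcc_gencode h100) -o my_app my_kernel.cu
nvcc_gencode a100
```

Building for both targets in one binary avoids re-compiling per node type, at the cost of a larger executable and a longer build.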

Note

While both the H100-NVL and H100 GPU nodes contain “H100” GPUs, the H100 NVL variant in the H100-NVL nodes is substantially different from the H100 GPUs in the H100 nodes. The H100 NVL is a newer GPU featuring 15% more CUDA and tensor cores alongside 95% more memory bandwidth than the older H100 GPU. In some workloads the H100 NVL may be up to 50% faster in practice.

Stanage GPU Resources

GPU-enabled Software

Training materials