GPU nodes (Computer Science)

These GPU nodes were purchased for Bessemer by the Department of Computer Science (DCS) for use by DCS research staff, their collaborators and their research students.

Hardware specifications

Eight nodes (bessemer-node030 to bessemer-node037) each have:

  • Processors: 2x Intel Xeon Gold 6138 (2.00GHz; 20 cores per CPU, 40 cores per node)

  • RAM: 192GB (DDR4 @ 2666 MHz)

  • NUMA nodes: 2x

  • GPUs: 4x NVIDIA Tesla V100 SXM2 (32GB RAM each; NVLINK interconnects between GPUs)

  • Networking: 25 Gbps Ethernet

  • Local storage: 140 GB of temporary storage under /scratch (2x SSD RAID1)
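
Once you have a session on one of these nodes, the NVLINK topology and NUMA affinity of the four GPUs can be inspected with the standard NVIDIA tool nvidia-smi (the exact output depends on the installed driver version):

    # Show the GPU interconnect matrix (NVLink links appear as NV* entries)
    # and each GPU's CPU/NUMA affinity.
    nvidia-smi topo -m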

Requesting Access

Access to these nodes is managed by the RSE team. Access policy:

  • PhD students, researchers and staff in Computer Science can all request access to the nodes.

  • Access for others who are collaborating on projects with some Computer Science / RSE involvement can be granted on a case-by-case basis.

  • Access for Computer Science MSc students can be granted on a case-by-case basis.

A number of other users were granted access before this policy was developed.

To request access, complete this Google Form; someone within the RSE team will then respond with further information.

Using the nodes

There are several ways to access these nodes. The type of access granted for a job depends on which Slurm Account and Partition are requested at job submission time.
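
If you are unsure which Slurm Accounts you have been granted, or want to check the state of these partitions, the standard Slurm client tools can report this. The commands below are a minimal sketch using the account and partition names listed later in this section:

    # List the Slurm account/partition/QoS associations for your user
    sacctmgr show associations where user=$USER format=account,partition,qos

    # Show the current state of the DCS GPU partitions
    sinfo --partition=dcs-gpu,dcs-gpu-test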

1. DCS test/debugging access

E.g. for short test batch jobs or for interactive debugging.

To submit a job via this route, you need to specify a Partition and an Account when submitting a batch job or starting an interactive session (see the example after this list):

  • Partition: dcs-gpu-test

  • Account: dcs-res (members of DCS) or dcs-collab (collaborators of DCS)

  • QoS: do not specify one (i.e. do not use the --qos parameter).
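
For instance, an interactive debugging session on one of these nodes might be started as follows. This is a minimal sketch: the GPU, memory and time requests are illustrative, --gres=gpu:N is the generic Slurm syntax for requesting GPUs, and collaborators should substitute dcs-collab for dcs-res:

    # Request an interactive session with one V100 on the test partition
    srun --partition=dcs-gpu-test --account=dcs-res \
         --gres=gpu:1 --mem=16G --time=00:30:00 \
         --pty bash -i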

Resource limits per job:

Each user can run a maximum of two of these jobs concurrently.

2. DCS access for larger jobs

If you want to run a longer job that uses up to all the resources available in one of these nodes then you can specify a different Partition when submitting a batch job or starting an interactive session:

  • Partition: dcs-gpu

  • Account: dcs-res (members of DCS) or dcs-collab (collaborators of DCS)

  • QoS: do not specify one (i.e. do not use the --qos parameter).

Please only run batch jobs via this route: long-running interactive sessions associated with large resource requests are often an inefficient use of cluster resources.
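
As a rough sketch, a batch script for this route might look like the following. The resource amounts are illustrative, --gres=gpu:N is the generic Slurm GPU request syntax, and my_job.sh / my_training_script.py are placeholder names for your own files:

    #!/bin/bash
    #SBATCH --partition=dcs-gpu
    #SBATCH --account=dcs-res          # or dcs-collab for DCS collaborators
    #SBATCH --gres=gpu:2               # up to 4x V100 per node
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=64G
    #SBATCH --time=12:00:00
    #SBATCH --job-name=dcs-gpu-example

    # Load any modules / activate any environments your software needs here.

    nvidia-smi                         # record which GPUs were allocated
    srun python my_training_script.py  # placeholder for your own workload

The script can then be submitted with sbatch my_job.sh and monitored with squeue -u $USER.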

Resource limits per job: