Cluster Specifications
Operating system and software
OS: CentOS 7.x (binary compatible with Red Hat Enterprise Linux 7.x) on all nodes
Interactive and batch job scheduling software: Slurm
Many applications, compilers, libraries and parallel processing tools. See Software on Stanage
Networking
Intel OmniPath Architecture (OPA) (100 Gb/s) to all public nodes
Gigabit Ethernet
General CPU node specifications
152 nodes are publicly available (not exclusive to research groups).
Machine: Dell PowerEdge R650
CPUs: 2 x 32 core Intel Xeon Platinum 8358
Ice Lake processor microarchitecture;
Base clock 2.60 GHz; Boost clock 3.40 GHz;
Support for AVX-512 vectorisation instructions (simultaneously apply the same operation to multiple values in hardware);
Support for Fused Multiply-Add instructions (expedites operations involving the accumulation of products e.g. matrix multiplication).
Hyperthreading is disabled on all nodes.
RAM: 256 GB (i.e. 4.0 GiB / core)
3200 MT/s;
DDR4.
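As a rough illustration of the two instruction-set features listed above, the sketch below works out how many values fit in one 512-bit AVX-512 register and writes a dot product as a chain of multiply-then-add steps, which is the pattern a Fused Multiply-Add instruction performs in one operation. The function names `lanes` and `dot` are hypothetical, and plain Python does not itself issue AVX-512 or FMA instructions; this only shows the shape of the arithmetic.

```python
# Width of one AVX-512 vector register, in bits.
AVX512_REGISTER_BITS = 512


def lanes(element_bits: int) -> int:
    """Number of values of the given bit width that fit in one AVX-512 register."""
    return AVX512_REGISTER_BITS // element_bits


def dot(xs, ys):
    """Accumulate products of paired elements.

    Each loop step has the form a * b + c, which is the operation
    that a hardware FMA instruction fuses into a single step.
    """
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = x * y + acc
    return acc


print(lanes(64))  # double-precision values per register
print(lanes(32))  # single-precision values per register
print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

In practice these instructions are reached through compiled code (for example optimised numerical libraries), not through interpreted loops like the one shown here.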
Large memory node specifications
To complement the standard nodes with 256 GB of memory per node (4 GB/core), there are 12 large memory nodes with 1 TB (16 GB/core), and a further 12 very large memory nodes with 2 TB (32 GB/core).
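The per-core figures quoted above follow from dividing each node's RAM by its 64 cores (2 x 32). A minimal sketch of that arithmetic, in which the dictionary labels and the function name `ram_per_core` are hypothetical and chosen only for illustration:

```python
# Each of these node types has two 32-core CPUs.
CORES_PER_NODE = 2 * 32

# RAM per node in GB, as quoted in the text; the labels are illustrative.
NODE_RAM_GB = {
    "standard": 256,
    "large memory": 1024,
    "very large memory": 2048,
}


def ram_per_core(ram_gb: int, cores: int = CORES_PER_NODE) -> float:
    """RAM available per core if memory is shared evenly across all cores."""
    return ram_gb / cores


for label, ram_gb in NODE_RAM_GB.items():
    print(f"{label}: {ram_per_core(ram_gb):.1f} GB per core")
```

These are even-share figures only; a job that needs more memory per core than a standard node offers is a likely candidate for the larger memory node types.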
Large memory nodes
12 nodes are publicly available (not exclusive to research groups).
Machine: Dell PowerEdge R650
CPUs: 2 x 32 core Intel Xeon Platinum 8358
Ice Lake processor microarchitecture;
Base clock 2.60 GHz; Boost clock 3.40 GHz;
Support for AVX-512 vectorisation instructions (simultaneously apply the same operation to multiple values in hardware);
Support for Fused Multiply-Add instructions (expedites operations involving the accumulation of products e.g. matrix multiplication).
Hyperthreading is disabled on all nodes.
RAM: 1024 GB (i.e. 16.0 GiB / core)
3200 MT/s;
DDR4.
Very large memory nodes
12 nodes are publicly available (not exclusive to research groups).
Machine: Dell PowerEdge R650
CPUs: 2 x 32 core Intel Xeon Platinum 8358
Ice Lake processor microarchitecture;
Base clock 2.60 GHz; Boost clock 3.40 GHz;
Support for AVX-512 vectorisation instructions (simultaneously apply the same operation to multiple values in hardware);
Support for Fused Multiply-Add instructions (expedites operations involving the accumulation of products e.g. matrix multiplication).
Hyperthreading is disabled on all nodes.
RAM: 2048 GB (i.e. 32.0 GiB / core)
3200 MT/s;
DDR4.
GPU node specifications
16 nodes will be publicly available (not exclusive to research groups).
Tip
To be added to Stanage in the near future.
Machine: Dell XE8545
CPUs: 2 x 24 core AMD EPYC 7413
Zen 3 processor microarchitecture;
Base clock 2.65 GHz; Boost clock 3.60 GHz;
Hyperthreading is disabled on all nodes.
RAM: 512 GB (i.e. approximately 10.7 GiB / core)
3200 MT/s;
DDR4.
Local storage: 460 GB boot device (SSD) plus 2.88 TB ‘/scratch’ temporary storage (RAID 0 on SSDs)
GPUs: 4x NVIDIA A100, each with
High-bandwidth, low-latency NVLink GPU interconnects
80 GB memory (HBM2e)
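The GPU figures above can be combined into per-node and cluster-wide GPU memory capacity. A small sketch, using only the numbers listed in this section; the constant and function names are hypothetical:

```python
GPU_NODES = 16          # GPU nodes listed above
GPUS_PER_NODE = 4       # NVIDIA A100 GPUs in each node
MEMORY_PER_GPU_GB = 80  # HBM2e memory on each GPU


def gpu_memory_gb(nodes: int = 1) -> int:
    """Combined GPU memory, in GB, across the given number of GPU nodes."""
    return nodes * GPUS_PER_NODE * MEMORY_PER_GPU_GB


print(gpu_memory_gb())           # one node
print(gpu_memory_gb(GPU_NODES))  # all GPU nodes
```

These sums are capacity figures: each GPU has its own 80 GB of memory, so the total is not a single pool that one process sees by default.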
Non-worker nodes
Two login nodes (for resilience)
Other nodes to provide:
Lustre parallel filesystem
Slurm scheduler ‘head’ nodes
Total capacity
With all workers including GPU and large memory nodes:
Worker nodes: 192
CPU cores: 12032
Intel cores: 11264
AMD cores: 768
Total memory: 83968 GiB
GPUs: 64
Fast network filesystem (Lustre): 2 PiB
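The totals above can be reproduced from the per-node specifications earlier in this section. The following is a sketch of that calculation; the `NodeType` class and the node labels are hypothetical, and the RAM values are summed as quoted per node without converting between GB and GiB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str      # illustrative label, not an official node or partition name
    count: int     # number of nodes of this type
    cores: int     # CPU cores per node
    ram: int       # RAM per node, as quoted in the specifications
    gpus: int = 0  # GPUs per node


NODE_TYPES = [
    NodeType("standard", 152, 2 * 32, 256),
    NodeType("large memory", 12, 2 * 32, 1024),
    NodeType("very large memory", 12, 2 * 32, 2048),
    NodeType("gpu", 16, 2 * 24, 512, gpus=4),
]


def totals(node_types):
    """Sum node counts, cores, RAM and GPUs over all node types."""
    return {
        "nodes": sum(n.count for n in node_types),
        "cores": sum(n.count * n.cores for n in node_types),
        "ram": sum(n.count * n.ram for n in node_types),
        "gpus": sum(n.count * n.gpus for n in node_types),
    }


for key, value in totals(NODE_TYPES).items():
    print(f"{key}: {value}")
```

Splitting the core count by vendor works the same way: the three Intel node types contribute 176 x 64 cores and the GPU nodes contribute 16 x 48 AMD cores.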