Bede (Tier 2 GPU cluster)¶
Bede is an EPSRC-funded ‘Tier 2’ (regional) GPU-equipped HPC cluster. The system is available for use by researchers from N8 Research Partnership institutions (Durham, Lancaster, Leeds, Liverpool, Manchester, Newcastle, Sheffield and York).
NB: the system was previously known as NICE-19.
This system is particularly well suited to supporting:
Jobs that benefit from distributing work between multiple GPUs and possibly multiple nodes.
Jobs that require significant movement of data between CPU and GPU memory.
In particular, deep learning and machine learning workflows that meet either of the above criteria.
Academics/researchers can apply for access to the system (see Further Information) but note that some aspects of the system plus the registration and support mechanisms are still being refined.
Noteworthy features of the system¶
32x GPU nodes (IBM AC922)
4x ‘inference’ nodes (IBM IC922)
High-bandwidth, low-latency networking between nodes (100 Gb/s EDR Infiniband)
High-performance parallel file system (Lustre)
Slurm job scheduler
IBM Watson Machine Learning Community Edition
Includes Conda packages that help transparently distribute deep learning training and inference tasks over multiple GPUs and/or nodes when using e.g. TensorFlow, IBM Caffe and PyTorch.
Includes Conda packages for accelerating the training of generalized linear models (e.g. in scikit-learn and Apache Spark) using GPUs and multiple nodes.
Standard GNU toolkit via the IBM Advanced Toolchain for Linux
Including IBM-optimised GNU compilers, BLAS/LAPACK, glibc, gdb, valgrind, itrace, Boost, Python, Go and more
NVIDIA profilers and debuggers
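As an illustration of how the Slurm scheduler and GPU nodes listed above are typically used, a job might be submitted with a batch script along the following lines. This is a minimal sketch: the partition name, account code, module name and Conda environment name are placeholders, not Bede's actual values; consult the system documentation for the real ones.

```shell
#!/bin/bash
# Hypothetical Slurm batch script for a single-node GPU job.
# Partition, account, module and environment names below are
# illustrative placeholders, not Bede's actual values.
#SBATCH --job-name=train-model
#SBATCH --partition=gpu          # placeholder partition name
#SBATCH --account=my-project     # placeholder project/account code
#SBATCH --nodes=1
#SBATCH --gres=gpu:4             # request 4 GPUs on the node
#SBATCH --time=01:00:00          # wall-clock limit (hh:mm:ss)

# Load a module providing the machine learning software stack
module load wmlce                # placeholder module name
conda activate wmlce_env         # placeholder environment name

python train.py                  # the user's own training script
```

The script would be submitted with `sbatch script.sh`, and `squeue -u $USER` then shows its position in the queue.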
See the N8 CIR’s Bede site for:
Documentation on how to use the system
Information on per-institution RSE support (including the contact for Sheffield)
How to register a project
How to acknowledge Bede and the N8 CIR in publications
Bede and N8 CIR logos
Email email@example.com if you have any questions regarding Bede in general or the application process.