Intel MPI

“Intel MPI Library is a multifabric message-passing library that implements the open-source MPICH specification. Use the library to create, maintain, and test advanced, complex applications that perform better on HPC clusters based on Intel® processors.”

Versions

You can load a specific version using one of the following:

module load impi/2019.9.304-iccifort-2020.4.304  # subset of intel 2020b EasyBuild toolchain
module load impi/2019.7.217-iccifort-2020.1.217  # subset of intel 2020a EasyBuild toolchain
module load impi/2018.5.288-iccifort-2019.5.281  # subset of intel 2019b EasyBuild toolchain
module load impi/2018.4.274-iccifort-2019.1.144-GCC-8.2.0-2.31.1  # subset of intel 2019a EasyBuild toolchain
module load impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30 # subset of intel 2018b EasyBuild toolchain

Each of these also implicitly loads matching versions of icc and ifort (and, where shown, GCC).
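
A quick way to confirm what a given module provides is to load it and query the MPI library and compiler wrappers it puts on your PATH. A minimal sketch using one of the versions above (any of the listed modules can be substituted):

module load impi/2018.5.288-iccifort-2019.5.281
module list          # lists the impi module plus the implicitly loaded compiler modules
mpirun --version     # reports the Intel MPI Library version
which mpif90 mpicc   # MPI compiler wrappers used in the examples below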

Warning

Multi-node jobs are (for the most part) not permitted by Slurm on Bessemer; the system has been configured this way because Bessemer, unlike Stanage, does not have a high-bandwidth, low-latency network connecting all of its worker nodes.

Examples

Two examples are given below: the first assesses MPI performance and the second demonstrates use of the Intel MPI compilers.

Example: MPI Performance testing

A simple test of these modules can be performed by running the built-in performance benchmarks supplied by Intel (the Intel MPI Benchmarks, IMB). An example using 2 cores is given below:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2

module load impi/2018.5.288-iccifort-2019.5.281

# Show which nodes you have been allocated CPU cores on
echo -e "\nShow node core allocation:\n"

NODELIST=nodelist.$SLURM_JOB_ID
srun -l bash -c 'hostname' | sort | awk '{print $2}' > $NODELIST
cat $NODELIST


echo -e "\nBegin running application:\n"
srun --export=ALL IMB-MPI1
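
To run the benchmark, save the script above to a file (the name imb_test.sh below is just an example) and submit it to Slurm with sbatch; the job can then be monitored with squeue:

sbatch imb_test.sh
squeue -u $USER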

This will generate output of the form:

Show node core allocation:

bessemer-node006.shef.ac.uk
bessemer-node006.shef.ac.uk

Begin running application:

#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date                  : Mon Sep 27 09:48:58 2021
# Machine               : x86_64
# System                : Linux
# Release               : 3.10.0-1160.36.2.el7.x86_64
# Version               : #1 SMP Wed Jul 21 11:57:15 UTC 2021
# MPI Version           : 3.1
# MPI Thread Environment:

This is followed by a series of benchmark results, one set for each of the many tests run by IMB-MPI1.

Example: Using the Intel MPI compilers

Another simple test of these modules can be performed by compiling and running the example program supplied with Intel MPI. An example using 2 cores is given below:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2

module load impi/2018.5.288-iccifort-2019.5.281

# Show which nodes you have been allocated CPU cores on
echo -e "\nShow node core allocation:\n"

NODELIST=nodelist.$SLURM_JOB_ID
srun -l bash -c 'hostname' | sort | awk '{print $2}' > $NODELIST
cat $NODELIST

# Work from your /fastdata area and take a copy of Intel's example source code
cd /fastdata/$USER
cp -R $I_MPI_ROOT/test ./ && chmod -R 700 test && cd test/
# Compile the Fortran example
mpif90 test.f90
# Alternatively you can compile the C example instead
#mpicc test.c

echo -e "\nBegin running application:\n"
srun --export=ALL /fastdata/$USER/test/a.out
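
As with the previous example, a sketch of submitting this script (the filename mpi_test.sh is arbitrary) and viewing the results; by default Slurm writes the job's output to slurm-<jobid>.out in the submission directory:

sbatch mpi_test.sh
cat slurm-*.out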

This will generate output of the form:

Show node core allocation:

bessemer-node006.shef.ac.uk
bessemer-node006.shef.ac.uk

Begin running application:

Hello world: rank            0  of            2  running on bessemer-node006.shef.ac.uk
Hello world: rank            1  of            2  running on bessemer-node006.shef.ac.uk