Intel MPI
“Intel MPI Library is a multifabric message-passing library that implements the open-source MPICH specification. Use the library to create, maintain, and test advanced, complex applications that perform better on HPC clusters based on Intel® processors.”
Versions
You can load a specific version using one of the following:
module load impi/2019.9.304-iccifort-2020.4.304 # subset of intel 2020b EasyBuild toolchain
module load impi/2019.7.217-iccifort-2020.1.217 # subset of intel 2020a EasyBuild toolchain
module load impi/2018.5.288-iccifort-2019.5.281 # subset of intel 2019b EasyBuild toolchain
module load impi/2018.4.274-iccifort-2019.1.144-GCC-8.2.0-2.31.1 # subset of intel 2019a EasyBuild toolchain
module load impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30 # subset of intel 2018b EasyBuild toolchain
Loading any of these modules also implicitly loads the corresponding versions of icc and ifort (and GCC).
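The module names above encode both the Intel MPI version and the compiler toolchain it was built against. As a minimal sketch (the helper name `parse_impi_module` is hypothetical and not part of the module system), the two parts can be separated with shell parameter expansion:

```shell
# parse_impi_module is a hypothetical helper for illustration only.
# It splits a module name of the form impi/<mpi-version>-<compiler-suffix>.
parse_impi_module() {
    local name="${1#impi/}"          # drop the leading "impi/"
    local mpi_version="${name%%-*}"  # text before the first "-"
    local compilers="${name#*-}"     # everything after the first "-"
    printf 'mpi=%s compilers=%s\n' "$mpi_version" "$compilers"
}

parse_impi_module "impi/2018.5.288-iccifort-2019.5.281"
# prints: mpi=2018.5.288 compilers=iccifort-2019.5.281
```

On the cluster itself, running `module list` after `module load` shows what was actually loaded.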
Warning
Multi-node jobs are (for the most part) not permitted by Slurm on Bessemer. The system has been configured this way because Bessemer, unlike Stanage, doesn’t have a very high-bandwidth, low-latency network connecting all worker nodes.
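A job script can fail early if it ever finds itself spread over more than one node. The following is only a sketch: `require_single_node` is a hypothetical helper, and it assumes Slurm has set `SLURM_JOB_NUM_NODES` inside the job (the value defaults to 1 here when the variable is unset, e.g. outside a job).

```shell
# require_single_node is a hypothetical helper, not a Slurm command.
# SLURM_JOB_NUM_NODES is set by Slurm inside a job; assume 1 when it is unset.
require_single_node() {
    local nodes="${SLURM_JOB_NUM_NODES:-1}"
    if [ "$nodes" -ne 1 ]; then
        echo "This script expects 1 node but was allocated $nodes" >&2
        return 1
    fi
    echo "Running on a single node"
}

require_single_node
```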
Examples
Two examples are given below, the first assessing the MPI performance and the second demonstrating the use of the Intel MPI compilers.
Example: MPI Performance testing
A simple test of these modules is to run the built-in performance benchmarks supplied by Intel (IMB-MPI1). An example using 2 cores is given below:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
module load impi/2018.5.288-iccifort-2019.5.281
# Show which nodes you have been allocated CPU cores on
echo -e "\nShow node core allocation:\n"
NODELIST=nodelist.$SLURM_JOB_ID
srun -l bash -c 'hostname' | sort | awk '{print $2}' > $NODELIST
cat $NODELIST
echo -e "\nBegin running application:\n"
srun --export=ALL IMB-MPI1
This will generate output of the form:
Show node core allocation:
bessemer-node006.shef.ac.uk
bessemer-node006.shef.ac.uk
Begin running application:
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Mon Sep 27 09:48:58 2021
# Machine : x86_64
# System : Linux
# Release : 3.10.0-1160.36.2.el7.x86_64
# Version : #1 SMP Wed Jul 21 11:57:15 UTC 2021
# MPI Version : 3.1
# MPI Thread Environment:
This header is followed by the results for each benchmark in the suite.
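The node list file written by the script above holds one hostname per MPI task, so it can be summarised to confirm how many tasks landed on each node. This is a minimal sketch: `count_tasks_per_node` is a hypothetical helper, and the temporary file stands in for the real `nodelist.$SLURM_JOB_ID`.

```shell
# count_tasks_per_node is a hypothetical helper for illustration only.
# Input: a file with one hostname per line (one line per MPI task).
count_tasks_per_node() {
    sort "$1" | uniq -c | awk '{print $2 ": " $1 " task(s)"}'
}

# Stand-in node list mirroring the sample output above
nodelist=$(mktemp)
printf '%s\n' bessemer-node006.shef.ac.uk bessemer-node006.shef.ac.uk > "$nodelist"
count_tasks_per_node "$nodelist"
rm -f "$nodelist"
```

In a real job the call would be `count_tasks_per_node "$NODELIST"`.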
Example: Using the Intel MPI compilers
Another simple test of these modules is to compile and run the example program supplied by Intel. An example using 2 cores is given below:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
module load impi/2018.5.288-iccifort-2019.5.281
# Show which nodes you have been allocated CPU cores on
echo -e "\nShow node core allocation:\n"
NODELIST=nodelist.$SLURM_JOB_ID
srun -l bash -c 'hostname' | sort | awk '{print $2}' > $NODELIST
cat $NODELIST
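# Work in your fastdata area; stop if it is not available
cd "/fastdata/$USER" || exit 1
# Copy Intel's example sources and make the copy private to you
cp -R "$I_MPI_ROOT/test" ./ && chmod -R 700 test && cd test/ || exit 1
# Compile the Fortran example (mpif90 wraps gfortran; use mpiifort for Intel ifort)
mpif90 test.f90
# Alternatively, compile the C example instead (mpicc wraps gcc; use mpiicc for Intel icc)
#mpicc test.c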
echo -e "\nBegin running application:\n"
srun --export=ALL /fastdata/$USER/test/a.out
This will generate output of the form:
Show node core allocation:
bessemer-node006.shef.ac.uk
bessemer-node006.shef.ac.uk
Begin running application:
Hello world: rank 0 of 2 running on bessemer-node006.shef.ac.uk
Hello world: rank 1 of 2 running on bessemer-node006.shef.ac.uk
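To confirm that every rank started, the output can be checked against the rank count that each line reports. The sketch below assumes the "Hello world: rank R of N ..." format shown above; `check_ranks` is a hypothetical helper, and the `printf` lines stand in for real program output.

```shell
# check_ranks is a hypothetical helper for illustration only.
# It reads "Hello world: rank R of N ..." lines on stdin and succeeds
# only if exactly N distinct ranks reported.
check_ranks() {
    awk '/^Hello world: rank/ { seen[$4] = 1; total = $6 }
         END {
             n = 0
             for (r in seen) n++
             if (total > 0 && n == total) {
                 print "all " total " ranks reported"
                 exit 0
             }
             print "expected " (total + 0) " ranks, saw " n
             exit 1
         }'
}

# Stand-in for: srun --export=ALL /fastdata/$USER/test/a.out | check_ranks
printf '%s\n' \
    "Hello world: rank 0 of 2 running on bessemer-node006.shef.ac.uk" \
    "Hello world: rank 1 of 2 running on bessemer-node006.shef.ac.uk" | check_ranks
```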