OpenMPI
The OpenMPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. OpenMPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. OpenMPI offers advantages for system and software vendors, application developers and computer science researchers.
Versions
You can load a specific version using one of the following:
module load OpenMPI/3.1.4-GCC-8.3.0 # part of the foss-2019b toolchain
module load OpenMPI/4.0.3-GCC-9.3.0 # part of the foss-2020a toolchain
module load OpenMPI/4.0.5-GCC-9.3.0 # part of the foss-2020a toolchain
module load OpenMPI/4.0.5-GCC-10.2.0 # part of the foss-2020b toolchain
module load OpenMPI/4.1.4-GCC-11.3.0 # part of the foss-2022a toolchain
module load OpenMPI/4.1.4-GCC-12.2.0 # part of the foss-2022b toolchain
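To see which OpenMPI modules are currently installed on the system you can run, for example:
module avail OpenMPI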
Warning
The current installation of OpenMPI 4.1.1 has poor performance over Omni-Path; this is still under investigation.
See here for a brief guide to the new features in OpenMPI 4.x and here for a detailed view of the changes between OpenMPI versions.
Example
Consider the following source code, hello.c:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
MPI_COMM_WORLD (which is constructed for us by MPI) encloses all of the processes in the job, so this call should return the number of processes that were requested for the job.
Compile your source code using the following command:
mpicc hello.c -o hello
Note
In this example we used the MPI C compiler wrapper. We could also choose to compile C++ code with either of the MPI C++ compiler wrappers, mpicxx or mpiCC.
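For example, assuming an equivalent C++ source file named hello.cpp (the filename here is purely illustrative), it could be compiled with either of:
mpicxx hello.cpp -o hello
mpiCC hello.cpp -o hello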
Interactive job submission
You can run your job interactively (from a login node):
srun hello
Your output would be something like:
Hello world from processor node003.pri.stanage.alces.network, rank 0 out of 1 processors
This is expected behaviour, since by default interactive jobs are allocated one single-CPU-core task running on one node. You can request an interactive job with multiple concurrent single-CPU-core tasks (4 in this example) by using this command (from a login node):
srun --ntasks=4 hello
Your output would be something like:
Hello world from processor node003.pri.stanage.alces.network, rank 3 out of 4 processors
Hello world from processor node003.pri.stanage.alces.network, rank 1 out of 4 processors
Hello world from processor node003.pri.stanage.alces.network, rank 0 out of 4 processors
Hello world from processor node003.pri.stanage.alces.network, rank 2 out of 4 processors
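Depending on availability, you can also spread the tasks over more than one node; for example (a sketch, subject to the cluster's interactive job limits):
srun --nodes=2 --ntasks-per-node=2 hello
In that case the processor names printed would differ between ranks running on different nodes.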
Please note that requesting multiple cores for an interactive job depends on availability. During peak times it is unlikely that you will be able to obtain a large number of CPU cores interactively, so it is usually sensible to run MPI workloads as batch jobs.
Non-interactive job submission
Write a shell script (minimal example). Here we name the script test.sh:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
module load OpenMPI/4.1.4-GCC-12.2.0
srun --export=ALL hello
Submit your script by using the command:
sbatch test.sh
Your output would be something like:
Hello world from processor node003.pri.stanage.alces.network, rank 6 out of 8 processors
Hello world from processor node003.pri.stanage.alces.network, rank 5 out of 8 processors
...
Hello world from processor node003.pri.stanage.alces.network, rank 1 out of 8 processors
Hello world from processor node003.pri.stanage.alces.network, rank 4 out of 8 processors
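Assuming default Slurm settings, this output is written to a file named slurm-<jobid>.out in the directory the job was submitted from (this can be overridden with #SBATCH --output), and you can check on a queued or running job with:
squeue -u $USER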
Installation notes
This section is primarily for administrators of the system. OpenMPI has been installed using the default EasyBuild config files, but with the following tweaks made via EasyBuild hooks:
- Compile-time options are set so that:
  - All versions are compiled with Slurm and PMIx support enabled.
  - Versions older than 4.1.0 are compiled with support for the PSM2 library for efficient inter-process communication, including via Omni-Path (but OpenMPI only actually uses PSM2 at runtime for versions <= 4.0.0).
- Module files are patched so that at runtime:
  - When OpenMPI is loaded, it instructs Slurm (via the environment variable SLURM_MPI_TYPE=pmix_v4) that PMIx version 4 is to be used for launching remote processes using srun.
  - Versions greater than 4.0.0 are configured at runtime to use LibFabric (OFI) for inter-process communications, which in turn is configured via environment variables to use the PSM2 OFI provider for efficient Omni-Path support. OFI is used instead of PSM2 because the older PSM2 library on EL7 is incompatible with newer versions of OpenMPI, and OFI is now the preferred way of doing communications over Omni-Path fabrics with MPI implementations.
- Key variables set in OpenMPI module files (these can be checked as shown below):
  OMPI_MCA_pml=cm
  OMPI_MCA_mtl=ofi
  OMPI_MCA_mtl_ofi_provider_include=psm2
  PMIX_MCA_psec=native
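As a quick sanity check (a sketch, using one of the modules listed above), you can confirm these variables are set after loading a module:
module load OpenMPI/4.1.4-GCC-12.2.0
env | grep -E 'OMPI_MCA|PMIX_MCA|SLURM_MPI_TYPE'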
Build logs and test reports can be found in $EBROOTOPENMPI/easybuild with a given module loaded.