Embarrassingly Parallel
Overview
Slurm job arrays enable you to submit jobs that run multiple times with the same Slurm parameters. Use the --array Slurm argument to define the array indices, e.g. --array=1-10,12-15. The $SLURM_ARRAY_TASK_ID environment variable provides each job with its corresponding array index.
Example:
#!/bin/bash
#SBATCH --array=1-10
# Each job processes a different input file
srun python my_script.py input_${SLURM_ARRAY_TASK_ID}
Below are different templates that you can modify to suit your tasks.
If you are uncertain about scaling up, contact the IT Services’ Research and Innovation team at an early stage.
In scientific computing, it is often necessary to run the same program multiple times with varying datasets or parameters.
When these runs do not depend on or communicate with each other, they can be executed in parallel as separate Slurm jobs. This type of parallelism is referred to as embarrassingly parallel.
Slurm provides a feature called job arrays, which allows users to efficiently submit and manage multiple independent instances of the same job script.
Array jobs enable you to manage large-scale workloads on the cluster. In Parallel Computing, we explore alternatives.
Note
Although the examples here are drawn from scientific computing, job arrays are equally useful in many other fields. You might use them to analyse batches of text files, run different models on survey data, or automate large-scale media processing — anywhere you need to repeat a task across multiple inputs or parameters.
Introduction
Array jobs facilitate parallel computations. They are useful when you need to run a job multiple times with only minor variations. For example, you may need to execute 1000 jobs, each with a different random seed, or apply the same operation across multiple datasets. This can be accomplished with a single array job.
A Slurm job array consists of multiple jobs that share the same batch submission script. The --array directive specifies how many times the script should be executed, for instance:
#SBATCH --array=0-4
This directive creates an array of five jobs (tasks) indexed from 0 to 4. Each task is a duplicate of the submitted batch script, automatically queued in Slurm. The SLURM_ARRAY_TASK_ID environment variable assigns a unique identifier to each task, which can be used for handling input and output files.

--array via the command line
The --array option can also be specified as a command-line argument to sbatch. This is useful for managing job arrays without modifying the script.
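For instance, the same script (here called my_job.sh, a hypothetical name) can be submitted with different index ranges without editing it; a range given on the command line overrides any #SBATCH --array directive in the script:

```shell
# Submit the script as a 5-task array; the command-line --array
# overrides any #SBATCH --array line inside my_job.sh
sbatch --array=0-4 my_job.sh

# Re-run only selected (e.g. failed) indices
sbatch --array=2,7 my_job.sh
```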
Important
Since array jobs create multiple identical job instances, it is crucial to understand their impact on the file system:
Does the script rely on libraries or environments stored in the working directory?
How much input data does each task require?
How much output data does each job generate?
For example, launching an array job with hundreds of tasks that depend on a Python environment stored on shared storage may cause significant file system load due to repeated access to thousands of files.
If you are unsure how your job will behave, seek guidance from the IT Services’ Research and Innovation team.
Your First Array Job
Note
The necessary scripts for the upcoming exercises are located in our hpc-examples repository. This repository is accessible on our Stanage HPC cluster. To utilise it, load the module:
module load hpc-examples
After loading, you can access the example scripts via the $HPC_EXAMPLES environment variable. For example, you can then run slurm/pi.py in the following way:
python $HPC_EXAMPLES/slurm/pi.py
Let’s see an array job in practice. Look at the script ${HPC_EXAMPLES}/array/array_example.sh:
#!/bin/bash
#SBATCH --time=00:15:00
#SBATCH --mem=200M
#SBATCH --output=array_example_%A_%a.out
#SBATCH --array=0-15
# Put your commands below:
# Job step
srun echo "I am array task number" $SLURM_ARRAY_TASK_ID
Submitting the job script with sbatch ${HPC_EXAMPLES}/array/array_example.sh will return a message such as:
Submitted batch job 5825026
This job ID belongs to the primary array job, which encompasses all individual tasks in the array. Each task is also assigned a unique array task ID.
As multiple jobs run simultaneously, each requires a unique output file to prevent overwriting. By default, Slurm names the output files slurm-${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}.out. You can override this with --output=FILENAME, using the placeholders %A for the job ID and %a for the array task ID.
Once the jobs complete, output files will appear in your working directory:
$ ls
array_example_5825026_0.out array_example_5825026_12.out array_example_5825026_15.out
array_example_5825026_3.out array_example_5825026_6.out array_example_5825026_9.out
array_example_5825026_10.out array_example_5825026_13.out array_example_5825026_1.out
array_example_5825026_4.out array_example_5825026_7.out array_example.sh
array_example_5825026_11.out array_example_5825026_14.out array_example_5825026_2.out
array_example_5825026_5.out array_example_5825026_8.out
You can inspect any output file using cat:
$ cat array_example_5825026_11.out
I am array task number 11
Important
Array indices do not need to be sequential. If specific tasks fail, you can re-run only those, for example with --array=1,4. The --array argument can also be supplied directly to sbatch on the command line.
More Examples
The following examples demonstrate how to use job arrays effectively and how to leverage the $SLURM_ARRAY_TASK_ID environment variable. You need a clear mapping between job indices and configurations, which could be filenames, pre-defined parameter sets, or external configuration files. Keep this mapping consistent across job runs, otherwise results from different runs will not match up.
Processing Multiple Input Files
Often, computations require processing different input files. The $SLURM_ARRAY_TASK_ID variable can dynamically assign a file to each job:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=1G
#SBATCH --array=0-29
srun ./my_application -input input_data_${SLURM_ARRAY_TASK_ID}
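If your input files use zero-padded names (e.g. input_data_007, a hypothetical naming scheme not used in the script above), printf can map the task ID onto the padded filename. A minimal sketch, with the task ID hardcoded so it runs standalone:

```shell
# $SLURM_ARRAY_TASK_ID is normally set by Slurm for each task;
# hardcoded here for illustration only
SLURM_ARRAY_TASK_ID=7

# Zero-pad the index to three digits to match names like input_data_007
FILE=$(printf "input_data_%03d" "$SLURM_ARRAY_TASK_ID")
echo "$FILE"
```

In a real batch script you would drop the hardcoded assignment and pass $FILE to your application.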
Hardcoding Arguments in the Batch Script
Suppose you want to run a π estimation simulation with five different seed values, each executing 2.5 million iterations. The following two examples show different ways to embed these arguments in the batch script.
Case-Based Argument Selection
The script ${HPC_EXAMPLES}/array/pi_array_hardcoded_case.sh
uses a case statement to choose arguments:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=500M
#SBATCH --job-name=pi-array-hardcoded
#SBATCH --output=pi-array-hardcoded_%a.out
#SBATCH --array=0-4
module load hpc-examples
case $SLURM_ARRAY_TASK_ID in
    0) SEED=234 ;;
    1) SEED=18 ;;
    2) SEED=23 ;;
    3) SEED=50 ;;
    4) SEED=432 ;;
esac
python3 ${HPC_EXAMPLES}/slurm/pi.py 2500000 --seed=$SEED > pi_$SEED.json
Submit the script with:
$ module load hpc-examples
$ sbatch ${HPC_EXAMPLES}/array/pi_array_hardcoded_case.sh
Submitted batch job 5825718
Each task produces its own output, such as:
$ cat pi_18.json
{"pi_estimate": 3.1411456, "iterations": 2500000, "successes": 1963216}
Using Bash Arrays for Parameter Selection
An alternative approach using Bash arrays is demonstrated in the script ${HPC_EXAMPLES}/array/pi_array_hardcoded_array.sh:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=500M
#SBATCH --job-name=pi-array-hardcoded-array
#SBATCH --output=pi-array-hardcoded-array_%a.out
#SBATCH --array=0-4
SEED_ARRAY=(
234
18
23
50
432
)
SEED=${SEED_ARRAY[$SLURM_ARRAY_TASK_ID]}
srun python3 ${HPC_EXAMPLES}/slurm/pi.py 2500000 --seed=$SEED > pi_$SEED.json
Submit the job with:
$ module load hpc-examples
$ sbatch ${HPC_EXAMPLES}/array/pi_array_hardcoded_array.sh
Reading Parameters from a File
Rather than hardcoding values, you can store them in a file and read them dynamically. For example, to run pi.py with different iteration counts, create a file named iterations.txt containing:
100
1000
50000
1000000
We can modify the previous script to read values using sed (see man sed), as done in ${HPC_EXAMPLES}/array/pi_array_parameter.sh:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=500M
#SBATCH --job-name=pi-array-parameter
#SBATCH --output=pi-array-parameter_%a.out
#SBATCH --array=1-4
ml hpc-examples
n=$SLURM_ARRAY_TASK_ID
iteration=$(sed -n "${n}p" iterations.txt)  # Get the n-th line (1-indexed) of the file
srun python3 ${HPC_EXAMPLES}/slurm/pi.py ${iteration} > pi_iter_${iteration}.json
This approach can be extended to read multiple parameters from CSV files or similar structured data formats.
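As a sketch of that CSV variant (params.csv and its two columns are hypothetical), each line can hold several comma-separated parameters that read splits into separate variables:

```shell
# Create a hypothetical parameter file: one "iterations,seed" pair per line
printf '100,234\n1000,18\n50000,23\n' > params.csv

# Normally set by Slurm; hardcoded here so the snippet runs standalone
SLURM_ARRAY_TASK_ID=2

line=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.csv)  # n-th line (1-indexed)
IFS=',' read -r iterations seed <<< "$line"          # split the CSV fields
echo "iterations=$iterations seed=$seed"
```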
Grouping Multiple Runs per Array Task (Advanced)
If your tasks are very short (a few minutes), launching numerous individual jobs can lead to scheduling inefficiencies and an overwhelming number of output files. In such cases, it is beneficial to group multiple tasks within a single array job.
Important
Ideally, each array job should run for at least 30 minutes. If your tasks are shorter than this, consider combining multiple runs into a single job to reduce scheduling overhead and improve efficiency.
A simple way to achieve this is by introducing a loop inside your Slurm script. For example, if you need to run a simulation with 50 different seed values, you can process them in groups of 10, reducing the number of array jobs to just 5. This significantly decreases the load on the scheduler.
An example implementation is provided in the script ${HPC_EXAMPLES}/array/pi_array_grouped.sh
:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=500M
#SBATCH --job-name=pi-array-grouped
#SBATCH --output=pi-array-grouped_%a.out
#SBATCH --array=0-4
ml hpc-examples
# Let's create a new folder for our output files
mkdir -p json_files
CHUNKSIZE=10
n=$SLURM_ARRAY_TASK_ID
indexes=$(seq $((n*CHUNKSIZE)) $(((n + 1)*CHUNKSIZE - 1)))
for i in $indexes
do
srun python3 ${HPC_EXAMPLES}/slurm/pi.py 1500000 --seed=$i > json_files/pi_$i.json
done
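The arithmetic above can be checked in isolation: task n covers seeds n*CHUNKSIZE through (n+1)*CHUNKSIZE - 1. A standalone sketch with the task ID hardcoded for illustration:

```shell
CHUNKSIZE=10
n=2   # stands in for $SLURM_ARRAY_TASK_ID

# Same arithmetic as the script above: task 2 covers seeds 20..29
indexes=$(seq $((n*CHUNKSIZE)) $(((n + 1)*CHUNKSIZE - 1)))
echo $indexes
```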
Exercises
Array Job Exercises
Array-1: Compute n-grams with Array Jobs
Computing n-grams across the Gutenberg-Fiction dataset can take considerable time. Using array jobs is an efficient way to parallelise the process. Follow along with this example:
The following batch script ${HPC_EXAMPLES}/ngrams/array.sh calculates 3-grams in 20 batches, saving each result to a separate file:
#!/bin/bash
#SBATCH --mem=50G
#SBATCH --array=1-20
#SBATCH --time=00:15:00
#SBATCH --job-name=words-array
module load hpc-examples
mkdir -p ${TMPDIR}/ngrams-output/
python3 ${HPC_EXAMPLES}/ngrams/count.py ${HPC_EX_DATA}/Gutenberg-Fiction.zip \
-n 3 --words \
--start=$SLURM_ARRAY_TASK_ID --step=20 \
-o ${TMPDIR}/ngrams-output/ngrams3-words-all-array_$SLURM_ARRAY_TASK_ID.out
# Next we merge the 20 individual output files into a single dataset and output
# into the current working directory (since once the job completes $TMPDIR will be deleted).
python3 ${HPC_EXAMPLES}/ngrams/combine-counts.py ${TMPDIR}/ngrams-output/ngrams3-words-all-array_* -o ngrams3-words-all.out
The final output now contains all computed n-grams:
$ head -5 ngrams3-words-all.out
30224 ["i", "don", "t"]
18737 ["one", "of", "the"]
15954 ["out", "of", "the"]
14749 ["there", "was", "a"]
13122 ["it", "was", "a"]
Further Exercises
Array-2: Array Jobs and Random Seeds
Create an array job that runs ${HPC_EXAMPLES}/slurm/pi.py with different combinations of iteration counts and seed values. Save the results to separate files and keep the standard output (#SBATCH --output=FILE) distinct from the standard error (#SBATCH --error=FILE).
Array-3: Merging Outputs
Use the script ${HPC_EXAMPLES}/slurm/pi_aggregation.py to aggregate results from multiple output files. This will compute a more precise estimate of Pi.
(Advanced) Array-5: Using Advanced Indexing
Create a job array that runs only every alternate index (1, 3, 5, and so on). The sbatch manual page provides helpful details.
Solution
You can specify a step size for the job array by appending a colon and a number to the range. For example: --array=1-X:2
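As a concrete illustration (the upper bound 7 here is arbitrary):

```shell
#SBATCH --array=1-7:2   # runs array task IDs 1, 3, 5 and 7
```

The same step syntax also works when --array is passed on the sbatch command line.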
See Also
Our reference material on Job or task arrays
If you are uncertain about scaling up, contact the IT Services’ Research and Innovation team at an early stage.
What’s Next?
The next tutorial covers shared memory parallelism.
This material is adapted from the Aalto Scientific Computing Documentation, licensed under CC BY 4.0. Changes were made to the original content.