Attention

WARNING: From 9am on 19th August until 5pm on 2nd September there will be no access to the Stanage HPC cluster.

We will send an email to notify you when Stanage is back online and available for job submission.

Need help?

If you need help, the Research and Innovation team in IT Services provides a range of resources. Please read through the resources below to find the ones most relevant to your issue.


HPC Documentation

The HPC documentation we provide (this website) is regularly updated and contains a large amount of information about using our HPC clusters. At a minimum, users should read the Using the HPC Systems section, which gives a concise summary of the information needed to start using our HPC clusters efficiently.


General HPC Problems

If you are struggling to get something working on the HPC clusters, first look through our documentation on the technology (e.g. Message Passing Interface (MPI)), software (e.g. Software on Stanage) or cluster (e.g. Bessemer) you are using to see whether we have already installed the software or addressed your problem.

You should also look through our Frequently Asked Questions section, as it covers many common mistakes. If your jobs are failing, please read the section on job debugging on the Stanage and Bessemer clusters.

Note

Please make sure you check through our resources prior to contacting us as we are a small team with limited resources.

If you are still having issues or need specific advice (e.g. how best to parallelise your workflow), please contact IT Services’ Research and Innovation team. If you have more specific queries about programming for HPC clusters (e.g. CUDA programming), please contact the Research Software Engineering team.

Use the Email support template for guidance on getting the best help efficiently. If possible, please include the steps, screenshots, scripts and job log files that the IT Services support teams need to reproduce the issue.

If you encounter issues with the link above, use the template provided in the code block below. Copy the content and paste it into your email body, replacing placeholders (e.g., [module1], [software1], [jobid1]) with specific details about your issue.

Subject: HPC Help Request - [CLUSTER_USERNAME] - [SHORT DESCRIPTION]


Hi there,

I confirm that I have checked the FAQs at https://docs.hpc.shef.ac.uk/en/latest/FAQs.html#frequently-asked-questions for a possible solution to my issue.


Issue Summary
Please provide a clear and concise description of the issue.


System Details
Cluster Name: Stanage / Bessemer

Environment modules loaded:
1. [module1]
2. [module2]
3. [module3]

List of Software / Libraries used (with the versions):
1. [software1] [version]
2. [software2] [version]
3. [software3] [version]


Specific Details about the issue
Job IDs:
1. [jobid1]
2. [jobid2]
3. [jobid3]

Error messages and the command leading to the error:
Please place screenshots of your error messages here.

Steps leading to the problem:
1. [step1]
2. [step2]
3. [step3]

Relevant scripts and job log files:
Please attach your scripts and job log files.
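
If you prefer to fill in the template programmatically, the placeholder replacement described above can be sketched in Python as follows. This is only an illustrative sketch and not a tool we provide: the function names `fill_template` and `unfilled`, the shortened template and all example values (username, module name, job ID) are hypothetical and should be replaced with your own details.

```python
import re

# Shortened copy of the email template above; placeholders are in [brackets].
TEMPLATE = """\
Subject: HPC Help Request - [CLUSTER_USERNAME] - [SHORT DESCRIPTION]

Environment modules loaded:
1. [module1]

Job IDs:
1. [jobid1]
2. [jobid2]
"""

PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template, values):
    """Replace each [placeholder] that has a value; leave the others untouched."""
    def substitute(match):
        # group(1) is the name inside the brackets, group(0) the whole placeholder
        return values.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def unfilled(text):
    """Return the names of any placeholders still present in the text."""
    return PLACEHOLDER.findall(text)


# Hypothetical example values, for illustration only.
values = {
    "CLUSTER_USERNAME": "ab1cde",
    "SHORT DESCRIPTION": "MPI job fails at startup",
    "module1": "ExampleModule/1.0",
    "jobid1": "1234567",
}

email = fill_template(TEMPLATE, values)
print(email)
print("Still to fill in:", unfilled(email))
```

Listing the placeholders that are still unfilled is a quick check that nothing has been left in its bracketed form before you send the email.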

HPC training?

IT Services’ Research and Innovation courses

If you are new to the cluster or have never used Linux or HPC before, you should attend the RIT 101 (Introduction to Linux) and RIT 102 (High-Performance Computing) courses.

These courses are very popular and run in both semesters. Details and registration information are available at: https://sites.google.com/sheffield.ac.uk/research-training/ (only accessible with the VPN turned on).

IT Services’ Research and Innovation training index

The Research and Innovation team’s training index allows you to search for internal (to TUoS) and external training resources covering categories including HPC, Data Analysis / Visualisation and containerisation, as well as domain-specific resources such as FEA, CFD, Chemistry and more.

This site is currently in beta and more links and resources are being added.