Attention
The Bessemer HPC service was decommissioned on 2025-10-31 and can no longer be accessed by users. Removal of Bessemer references from our documentation is ongoing.
Filestores
Every HPC user has access to up to five different storage areas:
Home directories: per-user storage
Fastdata areas: high-performance shared filesystem for temporary data - optimised for reading/writing large files from multiple nodes and threads simultaneously
Shared (project) directories: per-PI shared storage areas (snapshotted and backed-up) for project data - can be accessed from non-HPC machines too
Scratch directories: per-node temporary storage - useful for reading/writing lots of small files within one job
Community Software Areas: cluster-wide storage areas to allow users to share software.
The storage areas differ in terms of:
the amount of space available;
whether they are available from multiple nodes;
whether they are shared between clusters;
whether the underlying storage system is performant when reading/writing large files;
whether the underlying storage system is performant when reading/writing small files;
the frequency of storage snapshotting, whether storage is mirrored, and the maximum duration data can be retained for;
whether they handle permissions like a typical Linux filesystem.
At present none provide encryption at rest.
Choosing the correct filestore
To make a quick assessment of what storage area is likely to best fulfil your needs, please take a look at the provided decision tree below:
Warning
This decision tree only provides a quick assessment, please check the full details of each filestore before committing to using them for your work.
Home directories
All users have a home directory on each system:
Home filestore area details (Stanage)
| Path | Type | Quota per user | Shared between system login and worker nodes? | Shared between systems? |
|---|---|---|---|---|
| /users/$USER | NFS | 50 GB or 300000 files | Yes | No |
Where $USER is the user’s username.
See also: How to check your quota usage and If you exceed your filesystem quota.
Home filestore backups and snapshots details (Stanage)
Warning
Snapshotting is not enabled for home areas and these areas are not backed up.
Home filestore area details (Bessemer)
| Path | Type | Quota per user | Shared between system login and worker nodes? | Shared between systems? |
|---|---|---|---|---|
| /home/$USER | NFS | 100 GB | Yes | No |
Where $USER is the user’s username.
See also: How to check your quota usage and If you exceed your filesystem quota.
Home filestore backups and snapshots details (Bessemer)
| Frequency of snapshotting | Snapshots retained |
|---|---|
| Every 4 hours | 10 most recent |
| Every night | Last 7 days |

| Frequency of mirrored backups | Backups retained |
|---|---|
| Every 4 hours | 6 most recent |
| Every night | 28 most recent |
See also: Recovering files from snapshots.
Note
As you can see above, the full path to your home directory differs depending on the cluster you are on:
| Cluster | Path |
|---|---|
| Stanage | /users/$USER |
| Bessemer | /home/$USER |
To ensure that your code is compatible with both clusters, we suggest using “~” or “$HOME” to refer to the home directory. The correct path is then used regardless of which cluster you are working on, making your code more portable.
On Stanage:
$ echo $HOME
/users/te1st
$ echo ~
/users/te1st
On Bessemer:
$ echo $HOME
/home/te1st
$ echo ~
/home/te1st
Fastdata areas
Fastdata areas are optimised for large file operations. These areas are Lustre filesystems.
They are faster than Home directories and Shared (project) directories when dealing with larger files but are not performant when reading/writing lots of small files (Scratch directories are ideal for reading/writing lots of small temporary files within jobs). An example of how slow it can be for large numbers of small files is detailed here.
There are separate fastdata areas on each cluster:
Fastdata filestore area details (Stanage)
| Path | Type | Quota per user | Filesystem capacity | Shared between systems? | Network bandwidth per link |
|---|---|---|---|---|---|
| /mnt/parscratch | Lustre | No limits | 2 PiB | No | 100 Gb/s (Omni-Path) |
Managing your files in fastdata areas
We recommend users create their own personal folder in the /mnt/parscratch area. As this doesn’t exist by default, you can create it with safe permissions by running the commands:
mkdir /mnt/parscratch/users/$USER
chmod 700 /mnt/parscratch/users/$USER
By running the commands above, your area will only be accessible to you. If desired, you could have a more sophisticated sharing scheme with private and fully public directories:
mkdir /mnt/parscratch/users/$USER
mkdir /mnt/parscratch/users/$USER/public
mkdir /mnt/parscratch/users/$USER/private
chmod 755 /mnt/parscratch/users/$USER
chmod 755 /mnt/parscratch/users/$USER/public
chmod 700 /mnt/parscratch/users/$USER/private
Note however that the public folder in this instance will be readable to all users!
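If you want to double-check the resulting permissions, one quick (purely illustrative) way is:
ls -ld /mnt/parscratch/users/$USER /mnt/parscratch/users/$USER/public /mnt/parscratch/users/$USER/private
You should see drwx------ for the private directory and drwxr-xr-x for the other two.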
Fastdata filestore backups and snapshots details
Warning
Snapshotting is not enabled for fastdata areas and these areas are not backed up.
File locking
As of September 2020 POSIX file locking is enabled on all Lustre filesystems. Prior to this the lack of file locking support on the University’s Lustre filesystems caused problems for certain workflows/applications (e.g. for programs that create/use SQLite databases).
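For example, as a minimal check (assuming the sqlite3 command-line client is available on the node; test.db is a hypothetical filename), the following should now succeed on a fastdata path such as the personal folder described above:
sqlite3 /mnt/parscratch/users/$USER/test.db "CREATE TABLE t(x INTEGER); INSERT INTO t VALUES(1); SELECT * FROM t;"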
User Quota management
Warning
There are no automated quota controls in the Stanage fastdata areas, and these areas currently have no automatic file deletion process.
We reserve the right to prevent unfair use of this area and will manually assess users’ usage, establishing a dialogue with users who regularly use unfair amounts of this area.
We also reserve the right to take measures to ensure the continuing functionality of this area, which could include scheduled removal of users’ files (after informing the user of the scheduled removal).
Fastdata filestore area details (Bessemer)
| Path | Type | Quota per user | Filesystem capacity | Shared between systems? | Network bandwidth per link |
|---|---|---|---|---|---|
| /fastdata | Lustre | No limits | 460 TB | No | 25 Gb/s Ethernet |
Managing your files in fastdata areas
We recommend users create their own personal folder in the /fastdata area. As this doesn’t exist by default, you can create it with safe permissions by running the commands:
mkdir /fastdata/$USER
chmod 700 /fastdata/$USER
By running the commands above, your area will only be accessible to you. If desired, you could have a more sophisticated sharing scheme with private and fully public directories:
mkdir /fastdata/$USER
mkdir /fastdata/$USER/public
mkdir /fastdata/$USER/private
chmod 755 /fastdata/$USER
chmod 755 /fastdata/$USER/public
chmod 700 /fastdata/$USER/private
Note however that the public folder in this instance will be readable to all users!
Fastdata filestore backups and snapshots details
Warning
Snapshotting is not enabled for fastdata areas and these areas are not backed up.
Automatic file deletion
Warning
There are no quota controls in fastdata areas, but older files are automatically deleted: a report of files older than 60 days is regularly generated, the owners of these files are notified by email, and one week after the email(s) are sent the identified files are deleted.
We reserve the right to change this policy without warning in order to ensure efficient running of the service.
It is therefore important not to use fastdata areas for long-term storage; copy important data from these areas to areas suitable for longer-term storage (Home directories or Shared (project) directories).
You can use the lfs command to find out which files in a fastdata directory are older than a certain number of days and hence approaching the time of deletion.
For example, if your username is te1st then you can find files 50 or more days old using:
lfs find -ctime +50 /fastdata/te1st
File locking
As of September 2020 POSIX file locking is enabled on all Lustre filesystems. Prior to this the lack of file locking support on the University’s Lustre filesystems caused problems for certain workflows/applications (e.g. for programs that create/use SQLite databases).
Scratch directories
For jobs that need to read/write lots of small files the most performant storage will be the temporary storage on each node.
This is because with Home directories, Fastdata areas and Shared (project) directories, each time a file is accessed the filesystem needs to request ownership/permissions information from another server, and for small files these overheads are proportionally high.
For the local temporary store, such ownership/permissions metadata is available on the local machine, thus it is faster when dealing with small files.
As the local temporary storage areas are node-local storage and files/folders are deleted when jobs end:
any data used by the job must be copied to the local temporary store when the job starts.
any output data stored in the local temporary store must also be copied off to another area (e.g. to Home directories) before the job finishes; see the example batch script below.
Further conditions also apply:
Anything in the local temporary store area may be deleted periodically when the worker-node is idle.
The local temporary store area is not backed up.
There are no quotas for local temporary store storage.
The local temporary store area uses the ext4 filesystem.
Danger
The local temporary store areas are temporary and have no backups. If you forget to copy your output data out of the local temporary store area before your job finishes, your data cannot be recovered!
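As a minimal sketch of this copy-in/copy-out pattern (a SLURM batch script; the input and results directory names are hypothetical and the resource requests are examples only):
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=4G

# Copy input data from permanent storage into the node-local temporary store
cp -r "$HOME/input" "$TMPDIR/"

# Work inside the node-local store (fast for lots of small files)
cd "$TMPDIR"
mkdir -p results

# ... run your program here, reading from input/ and writing to results/ ...

# Copy output back to permanent storage BEFORE the job finishes;
# $TMPDIR is deleted automatically when the job ends
cp -r results "$HOME/"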
Specifics for each Cluster
Stanage
The scheduler will automatically create a per-job directory for you under /tmp.
The name of this directory is stored in the $TMPDIR environment variable e.g.
[te1st@login1 [stanage] ~]$ srun -c 1 --mem=4G --pty bash -i
[te1st@node001 [stanage] ~]$ cd $TMPDIR
[te1st@node001 [stanage] ~]$ pwd
/tmp/job.2660172
The scheduler will then clean up (delete) $TMPDIR at the end of your job,
ensuring that the space can be used by other users.
Bessemer
The scheduler will automatically create a per-job directory for you under /scratch.
The name of this directory is stored in the $TMPDIR environment variable e.g.
[te1st@bessemer-login1 ~]$ srun -c 1 --mem=4G --pty bash -i
[te1st@bessemer-node001 ~]$ cd $TMPDIR
[te1st@bessemer-node001 2660172]$ pwd
/scratch/2660172
The scheduler will then clean up (delete) $TMPDIR at the end of your job,
ensuring that the space can be used by other users.
Community Software Areas
Most data that researchers want to share with their collaborators at the University should reside in Shared (project) directories.
However, as mentioned in Permissions behaviour, these areas may not be ideal for storing executable software/scripts
due to the way permissions are handled beneath /shared. Further, shared (project) directories on Stanage are only available on login nodes, meaning that software installs would need to be copied to a location available to worker nodes.
Also, users may want to install software on the clusters that they want to be accessible by all cluster users.
To address these two needs we provide Community Software areas. Community Software areas are central repositories for software installations maintained by users of the system but accessible to all users of the cluster.
PIs may request the creation of a new directory in a Community Software area for software installations to be shared across multiple users. These requests should include justification for such an area including what sorts of software packages are to be installed by the research group and their value to the group or broader HPC community. When a request for a new Community Software area is granted, the requester will be granted write access to their new directory to install their software. The requester is responsible for maintaining their software and ensuring it is updated as needed.
Current locations of Community Software areas:
| System | Path | Type |
|---|---|---|
| Stanage |  | NFS |
| Bessemer |  | NFS |
Note that:
Software installation should follow our installation guidelines where provided.
Software installations must be maintained by a responsible owner.
Software which is not actively maintained may be removed.
The default space allocation is 100 GB.
On Stanage, the community area is read-write on login nodes, read-only on worker nodes.
How to check your quota usage
To find out your storage quota usage for your home directory
you can use the quota command:
On Stanage:
[te1st@login1 [stanage] ~]$ quota -u -s
Filesystem space quota limit grace files quota limit grace
storage:/export/users
3289M 51200M 76800M 321k* 300k 350k none
An asterisk (*) after your space or files usage indicates that you’ve exceeded a ‘soft quota’. You’re then given a grace period of several days to reduce your usage below this limit. Failing to do so will prevent you from using additional space or creating new files. Additionally, there is a hard limit for space and files that can never be exceeded, even temporarily (i.e. it has no grace period).
In the above example we can see that the user has exceeded their soft quota for files (‘*’) but not their hard limit for files. However, the grace period field reads ‘none’, which means the grace period for exceeding the soft quota has already expired. The user must remove/move some files from their home directory before they can create/add any more files. Also, the user is a long way from exceeding their space soft quota.
Tip
To assess what is using up your quota within a given directory, you can make use of the ncdu module on Stanage. The ncdu utility will give you an interactive display of what files/folders are taking up storage in a given directory tree.
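For example (a sketch only: the exact module name/version may differ, so check module avail ncdu first):
module load ncdu
ncdu $HOME
Within ncdu you can browse directories with the arrow keys and quit with q.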
On Bessemer the quota command output looks like this:
[te1st@bessemer-node004 binary]$ quota
Size Used Avail Use% Mounted on
100G 100G 0G 100% /home/te1st
In the above, you can see that the quota was set to 100 gigabytes and all of this is in use, which is likely to cause jobs to fail.
To determine usage in a particular Shared (project) directory you can use the df command like so:
[te1st@bessemer-node004 binary]$ df -h /shared/myproject1
Filesystem Size Used Avail Use% Mounted on
172.X.X.X:/myproject1/myproject1 10T 9.1T 985G 91% /shared/myproject1
Tip
To assess what is using up your quota within a given directory, you can make use of the ncdu module on Bessemer. The ncdu utility will give you an interactive display of what files/folders are taking up storage in a given directory tree.
If you exceed your filesystem quota
If you reach your quota for your home directory then many common programs/commands may cease to work as expected (or at all) and you may not be able to log in.
In addition, jobs may fail if you exceed your quota with a job making use of a Shared (project) directory.
In order to avoid this situation it is strongly recommended that you:
Check your quota usage regularly.
Copy files that do not need to be backed up to a Fastdata area, or remove them from the clusters completely; a sketch of moving data to a fastdata area is shown below.
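As a minimal sketch (assuming a personal Stanage fastdata folder created as described in the Fastdata section above; large_results is a hypothetical directory name), moving data out of your home directory might look like:
mkdir -p /mnt/parscratch/users/$USER
mv ~/large_results /mnt/parscratch/users/$USER/
Remember that fastdata areas are not backed up, so only move data there that you can afford to lose or regenerate.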
Recovering files from snapshots
Recovery of files and folders on Stanage is not possible as the Stanage cluster does not currently have snapshots or backups.
If you need help, please contact research-it@sheffield.ac.uk.
Home directories and Shared (project) directories are regularly snapshotted. See above for details of the snapshot schedules per area. A subset of snapshots can be accessed by HPC users from the HPC systems themselves by explicitly browsing to hidden directories e.g.
| Storage area | Parent directory of snapshots |
|---|---|
| Home directories |  |
| Shared (project) directories |  |
From within per-snapshot directories you can access (read-only) copies of files/directories. This allows you to attempt to recover any files you might have accidentally modified or deleted recently.
Note that .snapshot directories are not visible when listing all hidden items within their parent directories
(e.g. using ls -a $HOME):
you need to explicitly cd into a .snapshot directory to see/access its contents.
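As a minimal sketch (assuming the area in question has snapshots enabled and that its snapshot parent directory is $HOME/.snapshot; the snapshot directory and file names shown are hypothetical), recovering a file might look like:
cd $HOME/.snapshot
ls                                   # list the available snapshots
cp <snapshot-name>/myfile.txt ~/myfile.txt.recovered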
If you need help, please contact research-it@sheffield.ac.uk.