As with all connections to the clusters, if you are not using a wired ethernet connection in a University campus building then you will need to turn on the VPN.
To transfer files to/from the clusters you can:
Use a program that supports one or both of the SCP and SFTP protocols to copy/move files to/from your own machine or from a remote machine to the cluster.
Use a Research Storage fileshare as common storage directly accessible from your own machine and from the clusters.
Use a program like wget to download files directly to the clusters.
Use a qsh-vis session to open firefox and interactively download directly to the clusters.
Downloading directly to the cluster may be 10x to 100x faster than doing a transfer from your local desktop or laptop (particularly if connecting remotely via VPN) as this will avoid using your local device’s internet connection which is likely a bottleneck. If you are able, you should make direct downloads to the cluster.
Transfers with SCP/SFTP¶
Secure copy protocol (SCP) is a protocol for securely transferring computer files between a local host and a remote host or between two remote hosts. It is based on the Secure Shell (SSH) protocol and the acronym typically refers to both the protocol and the command itself.
Secure File Transfer Protocol (SFTP), more formally the SSH File Transfer Protocol, is also a file transfer protocol. Despite its name it is not based on FTP: it is a separate protocol that runs over an SSH connection.
If you need to move large files (e.g. larger than a gigabyte) from a remote machine to the cluster, you should SSH into the computer hosting the files and use scp or rsync to transfer them directly to the cluster, as this will usually be quicker and more reliable.
If you cannot SSH into the remote machine, consider an alternative direct transfer method listed below.
Using SCP in the terminal¶
If your local machine has a terminal and the scp (“secure copy”) command is available you can use it to make transfers of files or folders.
In the examples below, substitute $CLUSTER_NAME with bessemer or sharc and $USER with your cluster username. You should be prompted for your Duo MFA credentials after entering your password. Request a push notification or enter your passcode.
To upload, you transfer from your local machine to the remote cluster:
scp /home/user/file.txt $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/
To download, you transfer from the remote cluster to your local machine:
scp $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/file.txt /home/user/
To copy a whole directory, we add the -r flag, for “recursive”:
scp -r $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/my_results /home/user/
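One thing to watch when typing these commands literally: in most shells $USER already holds your local login name, which may differ from your cluster username. A minimal sketch of one way around this follows. The variable CLUSTER_USER, the username te1st and the helper remote_path are illustrative assumptions, not part of the cluster setup. The script only prints the scp command so that it can be checked before it is run.

```shell
# Assumed values: replace with your own cluster and cluster username.
CLUSTER_NAME="bessemer"      # or "sharc"
CLUSTER_USER="te1st"         # hypothetical cluster username

# Hypothetical helper: build the user@host:path argument that scp expects.
remote_path() {
    printf '%s@%s.shef.ac.uk:%s\n' "$CLUSTER_USER" "$CLUSTER_NAME" "$1"
}

# Print the upload command instead of running it.
echo "scp -r ./my_results $(remote_path "/home/$CLUSTER_USER/")"
```

Once the printed command looks right, it can be copied to the prompt and run as usual.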
Using FileZilla¶
FileZilla is a cross-platform client available for Windows, macOS and Linux for downloading and uploading files to and from a remote computer.
Download and install the FileZilla client from https://filezilla-project.org. After installing and opening the program, you will see a window with a file browser of your local system on the left hand side of the screen. Once you have connected to a cluster, your cluster files will appear on the right hand side.
To connect to the cluster, you just need to create a new site and enter your credentials in the General tab:
By default FileZilla will save profiles in plaintext on your machine. You must ensure you use a master password to encrypt these credentials by changing the settings as shown in these instructions.
Host: sftp://$CLUSTER_NAME.shef.ac.uk (replace $CLUSTER_NAME with bessemer or sharc.)
User: Your cluster username
Password: Your cluster password (leave blank and fill this interactively if on a shared machine.)
Port: (leave blank to use the default port)
Logon Type: Interactive
In the Transfer Settings tab, limit the number of simultaneous connections to 1.
Save these details as a profile and then connect. You should be prompted for your Duo MFA credentials. Request a push notification or enter your passcode. You will now see your remote files appear on the right hand side of the screen. This process can be repeated to save a profile for each cluster.
You can drag-and-drop files between the left (local) and right (remote) sides of the screen to transfer files.
Using rsync¶
As you become more familiar with transferring files, you may find that scp is limited. The rsync utility provides advanced features for file transfer and is typically faster compared to both scp and sftp. It is a utility for efficiently transferring and synchronizing files between storage locations, including networked computers, by comparing the modification times and sizes of files. The utility is particularly useful as it can also resume failed or partial file transfers.
Many users find rsync especially useful for transferring large and/or many files as well as for keeping synced copies of them.
It is easy to make mistakes with rsync and accidentally transfer files to the wrong location, sync in the wrong direction or otherwise accidentally overwrite files. To help you avoid this, you can first use the --dry-run flag for rsync to show you the changes it will make for a given command.
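The effect of --dry-run can be tried safely on your own machine before touching the cluster. The sketch below is an illustration only: it assumes rsync is installed locally, and both paths are throwaway temporary directories rather than real cluster locations.

```shell
# Local demonstration only: both paths are temporary directories on this machine.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "example data" > "$src/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # -a (archive) and -v (verbose); --dry-run lists what would be
    # transferred without copying anything.
    rsync -av --dry-run "$src/" "$dest/"
    ls -A "$dest"     # prints nothing: the destination is still empty

    # The same command without --dry-run performs the transfer.
    rsync -av "$src/" "$dest/"
    ls -A "$dest"     # now lists file.txt
fi
```

The same pattern applies to a remote destination: run the command once with --dry-run, check the listed files, then run it again without the flag.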
rsync syntax is very similar to scp. To transfer to another computer with commonly used options, substitute $CLUSTER_NAME with bessemer or sharc and $USER with your cluster username in the command below. You should be prompted for your Duo MFA credentials after entering your password. Request a push notification or enter your passcode:
rsync -avzP /home/user/file.iso $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/
a (archive) option preserves file timestamps and permissions among other things;
v (verbose) option gives verbose output to help monitor the transfer;
z (compression) option compresses the file during transit to reduce size and transfer time;
P (partial/progress) option preserves partially transferred files in case of an interruption and also displays the progress of the transfer.
To recursively copy a directory, we can use the same options:
rsync -avzP /home/user/isos $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/
This will copy the local directory and its contents under the specified directory on the remote system. If a trailing slash is added to the source path (/home/user/isos/), a new directory corresponding to the transferred directory (isos in the example) will not be created, and the contents of the source directory will be copied directly into the destination directory.
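In rsync it is the trailing slash on the source path that decides whether the directory itself or only its contents are copied. The sketch below shows the difference using temporary local directories. It assumes rsync is installed locally; the directory names with_slash and without_slash are made up for the illustration.

```shell
# Local demonstration only: no cluster connection is made.
demo=$(mktemp -d)
mkdir -p "$demo/isos" "$demo/with_slash" "$demo/without_slash"
touch "$demo/isos/a.iso"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source: the isos directory itself is copied.
    rsync -a "$demo/isos" "$demo/without_slash/"
    # Trailing slash on the source: only the contents of isos are copied.
    rsync -a "$demo/isos/" "$demo/with_slash/"

    ls "$demo/without_slash"   # lists the directory: isos
    ls "$demo/with_slash"      # lists the file: a.iso
fi
```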
As before with scp, to download from the cluster rather than upload simply reverse the source and destination:
rsync -avzP $USER@$CLUSTER_NAME.shef.ac.uk:/home/$USER/isos /home/user/
How to download files directly to the cluster¶
Downloading files directly to the cluster is usually the quickest and most efficient way of getting files onto the clusters. Using your home connection will be a significant speed bottleneck compared to the large amount of download bandwidth available on the clusters. Directly downloading to the cluster avoids this bottleneck!
Using a qsh-vis session¶
Users can request a qsh-vis session on ShARC and connect to a GUI session in order to open a firefox browser window on the ShARC cluster. This will allow you to interactively navigate the web, log in to websites and download files as you would do locally.
The details for starting a qsh-vis session can be found on the qsh-vis page.
Note that a GPU accelerated session is only possible on the ShARC cluster. A similar, less graphically performant session can be started on Bessemer by starting an interactive session with the srun --pty bash -i command and then opening firefox by running the firefox command. For this to function correctly you must ensure that X11/GUI forwarding is enabled when connecting with SSH.
Using wget / curl¶
One of the most efficient ways to download files to the clusters is to use either the curl or wget commands to download directly.
The syntax for these commands is as below:
Downloading with wget¶
wget https://software.github.io/program/files/myprogram.tar.gz
Downloading with curl¶
curl -O https://software.github.io/program/files/myprogram.tar.gz
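Since either tool may be the one available in a given environment, a small wrapper can pick whichever is installed. This is a sketch only: the function name fetch is a hypothetical helper, and the -L option (follow redirects) is an addition that the example above does not use.

```shell
# Hypothetical helper: download a URL into the current directory using
# whichever of wget or curl is available.
fetch() {
    url=$1
    if command -v wget >/dev/null 2>&1; then
        wget "$url"
    elif command -v curl >/dev/null 2>&1; then
        # -O saves under the remote file name; -L follows redirects.
        curl -L -O "$url"
    else
        echo "neither wget nor curl is available" >&2
        return 1
    fi
}

# Usage (URL taken from the curl example above):
# fetch https://software.github.io/program/files/myprogram.tar.gz
```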
Using Git¶
Git (the git command) can be used to download or synchronise a remote Git repository onto the clusters. This can be achieved by setting up Git and/or simply cloning the repository you need.
For example, cloning the source of the make utility:
[user@sharc-login4 make-git]$ git clone https://git.savannah.gnu.org/git/make.git
Cloning into 'make'...
remote: Counting objects: 16331, done.
remote: Compressing objects: 100% (3434/3434), done.
remote: Total 16331 (delta 12822), reused 16331 (delta 12822)
Receiving objects: 100% (16331/16331), 5.07 MiB | 2.79 MiB/s, done.
Resolving deltas: 100% (12822/12822), done.
Git is installed on the clusters and can be used on any node. All commands, such as push and pull, are supported.
Using lftp¶
It is recommended that you use a method other than lftp if possible. Using lftp in the command line interface should be a last resort as it can be difficult and confusing to use.
lftp is a command-line client for FTP, FTPS, FXP, HTTP, HTTPS, FISH, SFTP, BitTorrent, and FTP over HTTP proxy.
If you need to log in to an FTP server to make a direct download to a cluster, you can use the lftp command.
Connecting with lftp¶
Where possible please connect with the ftps protocol as this protects your username and password from hackers performing man-in-the-middle or sniffing attacks!
Connecting to an FTP server can be achieved by passing the server address to the lftp command. When this connection is successful an lftp prompt will appear. At this stage you can log in with the login command, after which you will be prompted for your password:
lftp ftp.remotehost.com:~> login username
Password:
At this stage directory listing and changing directory can be achieved using the ls and cd commands. By default these commands run on the remote server. To work on the local machine instead, use lcd to change the local directory, or prefix a command with ! to run it in a local shell (for example !ls). The get (download) and put (upload) commands can also be used.
Downloading with lftp¶
To download a file use the get command as follows:
lftp username@ftp.remotehost.com:/> get myfile.txt -o mydownloadedfile.txt
Uploading with lftp¶
To upload a file use the put command as follows:
lftp username@ftp.remotehost.com:/> put myfile.txt -o myuploadedfile.txt
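For a single file, the interactive steps above can also be collapsed into one command using lftp's -u (user) and -e (commands to run) options. The sketch below is an illustration under assumptions: the function name lftp_get is hypothetical, and the server address, username and file names are placeholders. Because only a username is passed to -u, lftp should still prompt for the password.

```shell
# Hypothetical wrapper: fetch one file without an interactive lftp session.
# Arguments: server URL, FTP username, remote file name, local file name.
lftp_get() {
    ftp_url=$1
    ftp_user=$2
    remote_file=$3
    local_file=$4
    # -u gives the username; -e runs the listed commands, then "bye" exits.
    lftp -u "$ftp_user" -e "get $remote_file -o $local_file; bye" "$ftp_url"
}

# Usage (placeholder server and names):
# lftp_get ftps://ftp.remotehost.com username myfile.txt mydownloadedfile.txt
```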