Installing software on the clusters
As Stanage and Bessemer are general purpose HPC clusters, we provide and maintain only the most essential and most popular applications on them.
We are aware of our users' need to run applications that are specific to their own subject areas of research, and as such we permit the installation of software within users' personal directories and within special shared areas on the clusters for public use.
This option should be seen as a service without support, as we expect such users to be able to tackle any problems encountered during installation on their own. We will however help make such software available to other Stanage and Bessemer users by copying/installing scripts to shared locations.
Policy on user-installed software on University of Sheffield HPC systems
- Users should endeavour to download source code or software binaries produced by trusted developers/vendors and acquired from trusted repositories/locations.
- Users should keep software up to date where reproducibility is not a concern.
- Users should remove any software that is no longer needed.
To discuss these requirements or to get support regarding them, please contact research-it@sheffield.ac.uk
General background prerequisites
Tip
If you are not familiar with basic computer architecture we highly recommend reading our General Computer Architecture Quick Start page before continuing.
What is source code?
Source code is the collection of code written in a human readable programming language for a given software package. Source code is transformed by a compiler into machine code that can be executed by a computer.
What is a compiler or compiling?
In short, a compiler is a program that takes human-written source code and turns it into machine code that will run on a computer. Machine code has to be specific to a given processor's architecture and the instruction sets it supports (i.e. the instructions/operations that a CPU can perform), which is why you may need to compile your code for a specific instruction set (different processor manufacturers design different processors, sometimes with different instruction sets).
For example, you are probably aware that mobile phones use ARM processors rather than the Intel or AMD processors you will typically find in a desktop or laptop computer. This difference in processors and their instruction sets is one of the reasons why applications that run on phones cannot typically run on desktop computers.
Within research, you may find certain clusters using different processor architectures which have been designed for optimal performance at certain tasks using different instruction sets, e.g. the POWER9 architecture on the Bede cluster.
This also means that to run software on machines with different architectures you may need to recompile the software from source code if no binaries for that architecture are provided!
You may be wondering why you need to compile some software but not others; this is due to the differences between compiled and interpreted languages, but that falls outside the scope of this page.
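As a minimal sketch of what compiling looks like in practice (you do not need to do this for any centrally provided software), the commands below compile a one-line C source file into a machine-code executable using GCC. The file name hello.c, the module version loaded and the output name hello are illustrative assumptions only.
# Write a trivial C program to a file (illustrative only).
$ echo 'int main(void){ return 0; }' > hello.c
# Load a compiler from the modules system (the version shown is an assumption), then compile the source into machine code.
$ module load GCC/12.2.0
$ gcc hello.c -o hello
# The resulting file 'hello' is machine code specific to this processor architecture.
$ ./hello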
What are binaries?
When referring to software, software binaries, binary installations or binary downloads are software packages supplied to you pre-compiled by the developer for a specific processor / instruction set. This means that if you wish to use a binary software build you must check that you download and install the correct version that matches your machine’s processor / architecture.
What about software dependencies?
Many software packages have numerous libraries or other software packages on which they are dependent in order to function.
This means that installing one software package may first require installing and loading several other packages, or loading existing software modules provided on the cluster, before the software will install or function correctly.
What is a Linux shell?
A shell is a program that takes commands typed from the keyboard and gives them to the computer to run. Historically the shell was the only user interface available on a Unix-like system such as Linux. In the present day, graphical user interfaces (GUIs) are available in addition to interfaces such as the shell.
Most Linux operating systems use a program called bash (the Bourne Again SHell, an enhanced version of the original Unix shell program, sh, written by Steve Bourne) as the shell program. There are other shell programs available for Linux systems if desired by a user. Examples include: ash, dash, csh, tcsh, ksh and zsh.
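If you are unsure which shell you are using, the hedged check below prints your login shell; the output shown is just an example and may differ on your account.
$ echo $SHELL
/bin/bash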
What are environment variables?
In Linux based operating systems, environment variables are dynamic named values stored within the system which are used by shells or subshells (your terminal) to facilitate functionality. Simply put, they are variables with a name and a value which influence how the operating system and applications behave.
These variables have a simple format:
KEY=value
KEY="Some other value"
KEY=value1:value2
Important
The variable names are case sensitive and by convention they are UPPER CASE.
If a variable has multiple values they should be separated by a colon (:).
Variables do not have spaces around the equals (=) sign.
Note that environment variables are available system-wide and are inherited by all spawned child processes and shells, whereas shell variables apply only to the current shell instance. Each shell, such as bash (the default on the clusters), has its own set of internal shell variables.
Listing environment variables
env – This command allows you to run another program in a custom environment without modifying the current one. When used without an argument it will print a list of the current environment variables.
printenv – This command prints all or the specified environment variables.
echo $MYVARIABLE – The command echo when supplied with a variable name prefixed with $ will print that variable. An alternative syntax would be echo ${MYVARIABLE}. Variables can also be utilized in bash scripts in this manner.
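A brief hedged example of each approach is shown below; the values printed will differ for your own account and the path shown is an assumption.
$ printenv HOME
/users/my_username
$ echo $HOME
/users/my_username
# env prints every environment variable; grep filters the output to the one of interest.
$ env | grep ^HOME=
HOME=/users/my_username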
Setting environment variables
Manually setting environment variables is trivial and can be accomplished with the commands below.
set – The command sets or unsets shell variables. When used without an argument it will print a list of all variables including environment and shell variables, and shell functions.
unset – The command deletes shell and environment variables.
export – The command sets environment variables.
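A minimal hedged example using a made-up variable name MYVARIABLE is shown below.
# Set (export) an environment variable so it is inherited by child processes.
$ export MYVARIABLE="some value"
$ echo $MYVARIABLE
some value
# Remove the variable again.
$ unset MYVARIABLE
# Echoing it now prints an empty line as the variable no longer exists.
$ echo $MYVARIABLE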
Caution
Setting or changing environment variables can lead to a corrupted shell environment which can leave you unable to log in or run programs. Manually changing values should be avoided in favour of using the modules system.
If your shell environment is behaving oddly, programs are no longer available and you suspect you may have corrupted your current shell environment by changing environment variables in the terminal, you can simply log out and log back in to clear the problem.
How do environment variables relate to installing software?
Environment variables are critical not only to installing software where you want it but also to making the resulting executables available to use in your shell.
A few of the most important variables are listed below, with the HOME, USER and LANG variables useful during installation (e.g. setting directories in which to install) and the PATH and LD_LIBRARY_PATH variables used to add libraries or executables to your shell.
- The HOME environment variable contains the path of your user's home directory.
- The USER environment variable contains the username of your current user.
- The PATH environment variable is a list of directories where your executables are located; adding a directory to this list makes any of the executables in that directory available from the terminal via their name.
- The LD_LIBRARY_PATH variable functions similarly, but is a list of directories where your libraries are located; adding a directory to this list makes any of the libraries in that directory available to programs.
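As a hedged illustration of the effect of PATH, the example below assumes a hypothetical executable called myprogram installed under $HOME/software/installs/mysoftware/bin; the program name and path are assumptions for this sketch.
# Before the directory is on PATH the shell cannot find the executable.
$ myprogram
-bash: myprogram: command not found
# Prepend the installation's bin directory to PATH for the current shell session.
$ export PATH=$HOME/software/installs/mysoftware/bin:$PATH
# The executable can now be run by name from any directory.
$ myprogram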
Installing software from binaries
Caution
Installing from pre-compiled binaries does not remove the need to supply correctly versioned dependencies (e.g. shared libraries).
Using incorrectly versioned dependencies may allow a program to function but this could lead to instability and software errors.
Downloading your binaries
The first step of completing an installation from binaries on the clusters is to download the binaries. In general there are a few methods for downloading your binaries, which are detailed below in the preferred order.
1. Downloading binaries for the cluster using Yumdownloader
Yumdownloader is an application installed on the cluster which will allow you to download RPM packaged applications directly from the cluster operating system’s repositories.
This is the best method as it natively ensures both that you get a version compatible with the operating system and that the package is downloaded from a trusted location.
As an example, the following command will download the GNU Make RPM to your current folder, indicating where the RPM is being downloaded from as well as the full name of the downloaded file.
Important
GNU Make is already available on our clusters! Any further examples of installing or compiling GNU Make are examples only, you do not need to download or install Make.
[user@node004 [stanage] yumpackages]$ yumdownloader make
Loaded plugins: fastestmirror, priorities
Loading mirror speeds from cached hostfile
* epel: ftp.nluug.nl
make-3.82-24.el7.x86_64.rpm | 421 kB 00:00:00
[user@node004 [stanage] yumpackages]$
This method will automatically check the package integrity and that the package has valid signatures.
2. Downloading binaries from pkgs.org
pkgs.org is a website which allows a user to search for and download binary packages for numerous Linux and Unix operating systems. Using this website you will be able to query for CentOS 7 x86_64 compatible packages and then download them.
Caution
It is possible to download and use packages for different versions of CentOS (or RHEL as both operating systems are binary compatible) but this is not recommended and may lead to application instability or errors.
Using GNU Make again as an example, the required page can be found by searching as:
https://centos.pkgs.org/7/centos-x86_64/make-3.82-24.el7.x86_64.rpm.html
Looking at the Download section, the binary package download URL can be seen as:
http://mirror.centos.org/centos/7/os/x86_64/Packages/make-3.82-24.el7.x86_64.rpm
This RPM can now be downloaded using the wget command on the cluster:
[user@node004 [stanage] yumpackages]$ wget http://mirror.centos.org/centos/7/os/x86_64/Packages/make-3.82-24.el7.x86_64.rpm
--2021-07-15 12:19:18-- http://mirror.centos.org/centos/7/os/x86_64/Packages/make-3.82-24.el7.x86_64.rpm
Resolving mirror.centos.org (mirror.centos.org)... 85.236.43.108, 2604:1380:2001:d00::3
Connecting to mirror.centos.org (mirror.centos.org)|85.236.43.108|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 430712 (421K) [application/x-rpm]
Saving to: ‘make-3.82-24.el7.x86_64.rpm’
100%[==================================================================================================>] 430,712 --.-K/s in 0.1s
2021-07-15 12:19:18 (3.74 MB/s) - ‘make-3.82-24.el7.x86_64.rpm’ saved [430712/430712]
Because we have downloaded this manually we should now verify both the package integrity and that the package has been signed as trusted. We can do this with the rpm --checksig command.
[user@node004 [stanage] yumpackages]$ rpm --checksig make-3.82-24.el7.x86_64.rpm
make-3.82-24.el7.x86_64.rpm: rsa sha1 (md5) pgp md5 OK
Hint
The pkgs.org website will also show the dependencies of a package in the Requires section. This can be very useful for resolving package / library dependencies.
3. Downloading binaries from a vendor / package maintainer
If you have software from a vendor who does not supply source code, or a package maintainer has provided binaries that are not supplied as part of the normal package repositories for the operating system, you will typically be supplied with an RPM file (package.rpm) or a compressed tarball (package.tar.gz) from their website, via email or similar.
You may be able to use the wget command to download this directly to the cluster or may have to transfer it manually using SCP or similar. Once downloaded you should verify the software download's integrity and validity.
Verifying software package download integrity
Typically any downloaded software packages will be supplied with a checksum value (usually MD5 or SHA256) and you should check that this checksum is correct after upload to the cluster to verify the integrity of the uploaded files.
An example of checking the integrity of the Make RPM is shown below using the md5sum and sha256sum commands:
[user@node004 [stanage] yumpackages]$ md5sum make-3.82-24.el7.x86_64.rpm
c678cfe499cd64bae54a09b43f600231 make-3.82-24.el7.x86_64.rpm
[user@node004 [stanage] yumpackages]$ sha256sum make-3.82-24.el7.x86_64.rpm
d4829aff887b450f0f3bd307f782e062d1067ca4f95fcad5511148679c14a668 make-3.82-24.el7.x86_64.rpm
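If the maintainer publishes the expected checksum, a hedged way to compare it automatically rather than by eye is the -c option of sha256sum; the example below simply reuses the checksum value printed above.
# Feed a "checksum  filename" line into sha256sum -c; it reports OK when the file matches.
$ echo "d4829aff887b450f0f3bd307f782e062d1067ca4f95fcad5511148679c14a668  make-3.82-24.el7.x86_64.rpm" | sha256sum -c
make-3.82-24.el7.x86_64.rpm: OK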
At this stage, if being thorough, you should check that any vendor or package maintainer signatures on the downloaded binary packages are valid.
If the vendor or maintainer has supplied a tarball or similar with an associated signature file (typically packagename.tar.gz.asc or packagename.tar.gz.sig) then you can use gpg to check if it is valid, as demonstrated below with the GNU Make project's source tarball:
[user@node004 [stanage] make]$ gpg --verify make-4.3.tar.gz.sig make-4.3.tar.gz
gpg: Signature made Sun 19 Jan 2020 22:24:43 GMT using RSA key ID DB78137A
gpg: Good signature from "Paul D. Smith <paul@mad-scientist.net>"
gpg: aka "Paul D. Smith <psmith@gnu.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6D4E EB02 AD83 4703 510B 1176 80CB 727A 20C7 9BB2
Subkey fingerprint: B250 8A90 102F 8AE3 B12A 0090 DEAC CAAE DB78 137A
The GNU Make project documentation shows Paul D. Smith as the project maintainer, and navigating to his personal site, http://make.mad-scientist.net/switching-gpg-keys/, shows the matching primary key fingerprint as expected, so you can proceed with installing from the package.
Warning
Because we have not set Paul's public key as trusted and signed his public key with our own private key, GPG will warn you that while the package is signed with Paul's key, his key is not trusted, nor can GPG verify that it was indeed Paul's key in the first place.
As it is possible for anyone to generate a signing key with anyone's name or email, you must verify that the public key that signed the package is Paul's, e.g. by confirming with an alternate source such as his website, which publishes the expected fingerprint.
In order for GPG to trust the signature, several further commands are needed to add the key to your keychain, mark it as trusted and then sign it with your own private key, establishing a chain of trust and confirming to GPG that you trust the key and have verified that it was indeed generated by Paul.
As this has several steps and is not required for the verification above, it is out of the scope of this page.
If you know that the vendor or maintainer already signs their other releases into the CentOS repository and has supplied you with an RPM, then you can alternatively check signatures as detailed previously.
Unpacking your binaries
Unpacking binaries is typically an easy process but will depend on how they have been packaged; examples of unpacking an RPM and a tarball are given below.
Unpacking an RPM
Unpacking an RPM is achieved by using the rpm2cpio and cpio commands in concert as shown below.
This will unpackage the RPM into the current directory following a localised structure which would otherwise be where this package would be installed conventionally, i.e. ./usr/bin/gmake rather than /usr/bin/gmake.
[user@node004 [stanage] yumpackages]$ rpm2cpio make-3.82-24.el7.x86_64.rpm | cpio -idmv
./usr/bin/gmake
./usr/bin/make
./usr/share/doc/make-3.82
./usr/share/doc/make-3.82/AUTHORS
./usr/share/doc/make-3.82/COPYING
./usr/share/doc/make-3.82/NEWS
./usr/share/doc/make-3.82/README
*SNIP*
./usr/share/info/make.info-1.gz
./usr/share/info/make.info-2.gz
./usr/share/info/make.info.gz
./usr/share/man/man1/gmake.1.gz
./usr/share/man/man1/make.1.gz
2278 blocks
Unpacking a tarball
Unpacking a tarball is straightforward and is achieved using the tar command. Typically tarballs will be compressed with either GZip (tar.gz) or BZip (tar.bz2) and can be decompressed into the current directory by using the matching tar command arguments.
For GZip compression the format of this command is:
[user@node004 [stanage] tarpackages]$ tar -xzf mytarball.tar.gz
For BZip compression the format of this command is:
[user@node004 [stanage] tarpackages]$ tar -xjf mytarball.tar.bz2
No example of this process is shown as the commands do not have terminal output unless there is an error.
Making your binaries available in the shell
At this stage you can typically move the unpackaged binaries as desired and any executables (in ./bin) or libraries (typically in ./lib and ./lib64) can be added to PATH or LD_LIBRARY_PATH using one of the two methodologies mentioned in the Making installed software available to execute section.
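Following on from the Make RPM unpacked above, a brief hedged example of making the unpacked executables available in the current shell session is shown below; it assumes you are still in the directory where rpm2cpio was run.
# Prepend the unpacked bin directory to PATH for this shell session.
$ export PATH=$PWD/usr/bin:$PATH
# The unpacked executable is now found by name.
$ gmake --version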
Installing software by compiling from source
Downloading the source code
The first step of completing an installation from source on the clusters is to download the source code. In general there are a few methods for downloading the source code for a project, which are detailed below.
Typically source code will be made available from the maintainer's FTP/HTTP servers or mirrors in the form of a compressed tarball, or hosted on their chosen version control system site such as GitHub, GitLab, Atlassian Bitbucket and GNU Savannah, among many others.
Downloading source Tarballs
Downloading a source tarball is typically straightforward: you can simply navigate to a package maintainer's website, go into the download area and then download a tarball and its signature file if available.
For example, the GNU make project download area can be found at https://ftp.gnu.org/gnu/make/ or on one of the numerous mirror websites.
You may be able to use the wget command to download this directly to the cluster or may have to transfer it manually using SCP or similar. Once downloaded you should verify the software download's integrity and validity.
Verifying software package download integrity
Typically any downloaded software packages will be supplied with a checksum value (usually MD5 or SHA256) and you should check that this checksum is correct after upload to the cluster to verify the integrity of the uploaded files.
An example of checking download integrity is shown below using the md5sum and sha256sum commands (illustrated with the Make RPM from earlier; the same commands apply to a source tarball):
[user@node004 [stanage] yumpackages]$ md5sum make-3.82-24.el7.x86_64.rpm
c678cfe499cd64bae54a09b43f600231 make-3.82-24.el7.x86_64.rpm
[user@node004 [stanage] yumpackages]$ sha256sum make-3.82-24.el7.x86_64.rpm
d4829aff887b450f0f3bd307f782e062d1067ca4f95fcad5511148679c14a668 make-3.82-24.el7.x86_64.rpm
At this stage, if being thorough, you should check that any vendor or package maintainer signatures on the downloaded packages are valid.
If the vendor or maintainer has supplied a tarball or similar with an associated signature file (typically packagename.tar.gz.asc or packagename.tar.gz.sig) then you can use gpg to check if it is valid, as demonstrated below with the GNU Make project's source tarball:
[user@node004 [stanage] make]$ gpg --verify make-4.3.tar.gz.sig make-4.3.tar.gz
gpg: Signature made Sun 19 Jan 2020 22:24:43 GMT using RSA key ID DB78137A
gpg: Good signature from "Paul D. Smith <paul@mad-scientist.net>"
gpg: aka "Paul D. Smith <psmith@gnu.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6D4E EB02 AD83 4703 510B 1176 80CB 727A 20C7 9BB2
Subkey fingerprint: B250 8A90 102F 8AE3 B12A 0090 DEAC CAAE DB78 137A
The GNU Make project documentation shows Paul D. Smith as the project maintainer, and navigating to his personal site, http://make.mad-scientist.net/switching-gpg-keys/, shows the matching primary key fingerprint as expected, so you can proceed with installing from the package.
Warning
Because we have not set Paul's public key as trusted and signed his public key with our own private key, GPG will warn you that while the package is signed with Paul's key, his key is not trusted, nor can GPG verify that it was indeed Paul's key in the first place.
As it is possible for anyone to generate a signing key with anyone's name or email, you must verify that the public key that signed the package is Paul's, e.g. by confirming with an alternate source such as his website, which publishes the expected fingerprint.
In order for GPG to trust the signature, several further commands are needed to add the key to your keychain, mark it as trusted and then sign it with your own private key, establishing a chain of trust and confirming to GPG that you trust the key and have verified that it was indeed generated by Paul.
As this has several steps and is not required for the verification above, it is out of the scope of this page.
Unpacking a tarball
Unpacking a tarball is straightforward and is achieved using the tar command. Typically tarballs will be compressed with either GZip (tar.gz) or BZip (tar.bz2) and can be decompressed into the current directory by using the matching tar command arguments.
For GZip compression the format of this command is:
[user@node004 [stanage] tarpackages]$ tar -xzf mytarball.tar.gz
For BZip compression the format of this command is:
[user@node004 [stanage] tarpackages]$ tar -xjf mytarball.tar.bz2
No example of this process is shown as the commands do not have terminal output unless there is an error.
With the files now decompressed and available on the local file system you are ready to compile your software.
Downloading source code with Git
Downloading source code with Git is straightforward with the Git program already installed on the clusters. Once you have located the source code repository of interest you need only clone it to your local filesystem.
An example of this process is shown with the GNU Make project. The GNU Make project source code is hosted at https://git.savannah.gnu.org/cgit/make.git . Opening this page in the web browser will detail some important information needed in order to download and select the version of Make we are interested in.
First clone the project using Git and the .git URL above as follows:
[user@login1 [stanage] make-git]$ git clone https://git.savannah.gnu.org/git/make.git
Cloning into 'make'...
remote: Counting objects: 16331, done.
remote: Compressing objects: 100% (3434/3434), done.
remote: Total 16331 (delta 12822), reused 16331 (delta 12822)
Receiving objects: 100% (16331/16331), 5.07 MiB | 2.79 MiB/s, done.
Resolving deltas: 100% (12822/12822), done.
This has cloned the latest version of the master branch into our local filesystem. Now we can instruct Git to check out a specific version of Make via tags after entering the subdirectory that has been cloned. The available tags and branches will be shown on the source code repository webpage.
[user@login1 [stanage] make-git]$ cd make
[user@login1 [stanage] make]$ git checkout tags/4.3
Note: checking out 'tags/4.3'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b new_branch_name
HEAD is now at f430a65... GNU Make release 4.3
The files on the local file system are now version 4.3, have been cloned over HTTPS and Git will have ensured the integrity of the downloaded files automatically. You are now able to compile your software.
Compiling your source code into binaries
Compiling from source is normally straightforward assuming that the prerequisites that a software package has are fulfilled correctly.
Care must be taken to read through the documentation provided in the software package files, which are usually called README or INSTALL, in the top level directory of the downloaded files. These files will dictate what specific instructions, compilers, build systems and versions are required for a successful compile.
With this in mind, the process is very similar for most packages and will require you to first module load appropriate versions of GCC and / or CMake, potentially run a specific script (e.g. ./autogen.sh or ./build), configure the build options and then compile the source code.
e.g. compiling a more modern version of the make program on the Stanage cluster:
Note
Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.
Below, the make program provided by the base operating system is used, together with GCC 12.2.0 loaded from the modules system, to compile a more modern version of make itself. This may seem quirky or recursive but is normal and will not lead to conflicts or issues.
[user@node001 [stanage] make]$ cd make
[user@node001 [stanage] make]$ mkdir ./build && cd ./build
[user@node001 [stanage] make]$ module load GCC/12.2.0
[user@node001 [stanage] make]$ ../configure --prefix=$HOME/software/installed/make
[user@node001 [stanage] make]$ make -j $NSLOTS
[user@node001 [stanage] make]$ make -j $NSLOTS check
[user@node001 [stanage] make]$ make -j $NSLOTS install
- A build directory is made and then used to keep the source files unpolluted.
- The ../configure script is called from the directory above, with the --prefix option set to where we want the installed files to be located.
- The make program provided by the base operating system is then called 3 times: the first instance calls the compiler to compile the code, the second runs the maintainers' check scripts to verify successful compilation and the final instance installs the files.
- The -j $NSLOTS argument supplied to make instructs make to use multiple cores, with the $NSLOTS variable containing the number of cores currently available in the requested interactive or batch session.
Warning
Care should be taken to read the log output generated by this process to verify successful compilation and to spot any warnings or failed checks, which could negatively affect your work or result in hard-to-diagnose unexpected behaviour.
Making your compiled binaries available in the shell
At this stage you can typically move the generated binaries as desired and any executables (in ./bin) or libraries (typically in ./lib and ./lib64) can be added to the PATH or LD_LIBRARY_PATH using one of the two methodologies mentioned in the following section.
Making installed software available to execute
Software on the HPC cluster can be made available using one of the two methods below:
using your .bashrc file or making a custom module file (preferred) to enable multiple versions of the same software without conflicts.
The .bashrc file and its purpose
Caution
Editing the .bashrc file can lead to a corrupted shell environment which can leave you unable to log in or run programs.
Please take care if editing this file and consider using the modules system to add directories to the PATH and LD_LIBRARY_PATH to avoid inadvertent mistakes.
If you find your shell environment is behaving oddly, programs are no longer available and you suspect you may have corrupted your shell environment by editing the .bashrc file, you can reset it by running the resetenv command and then logging out and back in.
The .bashrc file is a hidden script file located in a user's home directory which runs when the user logs in using the bash shell. The contents of .bashrc can be changed to define functions, command aliases, and customise the bash shell to the user's liking.
As this file is executed when the user logs in, it can be customised to add additional directories to the PATH and LD_LIBRARY_PATH in order to make software available to the shell.
Adding a directory such as a personal installation directory with executables and libraries can be achieved as shown in the next section.
Making software available via the .bashrc file
Software can be made available using your .bashrc file and adding/editing any relevant environment variables.
Caution
We do not recommend editing your .bashrc file as this could result in corrupting your shell environment. If possible make use of the modules system.
Assuming you have installed your software in a folder with the path $HOME/software/installs/mysoftware and the software has added a bin and lib folder, you would adjust your file to be:
export PATH=$HOME/software/installs/mysoftware/bin:$PATH
export LD_LIBRARY_PATH=$HOME/software/installs/mysoftware/lib:$LD_LIBRARY_PATH
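Note that edits to your .bashrc only take effect in new login shells; as a hedged convenience you can apply them to your current session by sourcing the file.
$ source $HOME/.bashrc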
If you are installing libraries or software that are dependencies, are using 64-bit software, need to set a variable for a license server/file, etc., you may also need to use other environment variables pointing at different paths, e.g.:
export LD_LIBRARY_PATH=$HOME/software/installs/mysoftware/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/software/installs/mysoftware/lib64:$LD_LIBRARY_PATH
export LIBRARY_PATH=$HOME/software/installs/mysoftware/lib:$LIBRARY_PATH
export LIBRARY_PATH=$HOME/software/installs/mysoftware/lib64:$LIBRARY_PATH
export PKG_CONFIG_PATH=$HOME/software/installs/mysoftware/lib/pkgconfig:$PKG_CONFIG_PATH
export PKG_CONFIG_PATH=$HOME/software/installs/mysoftware/lib64/pkgconfig:$PKG_CONFIG_PATH
export PKG_CONFIG_PATH=$HOME/software/installs/mysoftware/share/pkgconfig:$PKG_CONFIG_PATH
export ACLOCAL_PATH=$HOME/software/installs/mysoftware/share/aclocal:$ACLOCAL_PATH
export CMAKE_PREFIX_PATH=$HOME/software/installs/mysoftware/:$CMAKE_PREFIX_PATH
export CPLUS_INCLUDE_PATH=$HOME/software/installs/mysoftware/include:$CPLUS_INCLUDE_PATH
export CPATH=$HOME/software/installs/mysoftware/include:$CPATH
export MY_SOFTWARE_LICENSE_PATH=$HOME/software/licenses/mysoftware/license.lic
Environment ‘Modules’ and their purpose
‘Environment Modules’ are the mechanism by which much of the software is made available to the users of the Stanage and Bessemer clusters. You are able to load and unload modules which make specific configurations of software available in a structured way which can avoid conflicts between different versions of the same software.
They do this by adding software locations to and removing them from the PATH and LD_LIBRARY_PATH environment variables, as well as setting any additional required environment variables, configuration or license files, using the module load or module unload functionality.
Module files are written in Lua on Stanage and TCL on Bessemer. To see examples, check the module paths with echo $MODULEPATH to get an idea of what these should look like.
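One hedged way to inspect an existing module file to use as a template is the module show command; the module name used below (GCC/12.2.0) is just an example of one that may be installed.
# List the directories searched for module files.
$ echo $MODULEPATH
# Display the contents/effects of an existing module file.
$ module show GCC/12.2.0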
Further detail on the environment modules system in use on the clusters can be found on the modules page.
Making software available via a custom module file
If you wish to use the modules system with personal module files you can add a directory called modules to your home directory (mkdir $HOME/modules) and populate this with your own module files.
To make these available automatically you can then add the module use $HOME/modules command to your .bashrc file.
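A minimal hedged sketch of this setup is shown below; the module file name my_new_program/1.0 is an arbitrary example, and the earlier caution about editing your .bashrc file still applies.
# Create a personal modules directory and place your module files inside it,
# e.g. $HOME/modules/my_new_program/1.0.lua on Stanage.
$ mkdir -p $HOME/modules
# Tell the modules system about this directory for the current session...
$ module use $HOME/modules
# ...and/or add the same command to your .bashrc so it applies on every login.
# Your personal modules should then appear alongside the central ones:
$ module avail my_new_program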
You can generate a basic module file using the basic Lua directives to set local variables, export these to your shell with the setenv function and prepend paths to your existing environment variables with the prepend_path function.
Warning
Module files are not aware of bash shell variables unless you import them using the os.getenv function and set a Lua variable based on them.
e.g. the following example imports our shell environment HOME variable and sets the Lua HOME variable with it. The Lua HOME variable is used to set the Lua variable MY_PROGRAM_DIR (the software's installation directory). The Lua MY_PROGRAM_DIR variable is then used to add the program's bin directory to your shell environment PATH variable with the prepend_path function.
-- Import the shell environment HOME variable into a Lua variable.
local HOME = os.getenv("HOME")
-- Set a local Lua variable for the program directory.
local MY_PROGRAM_DIR = HOME .. "/software/installs/my_new_program"
-- Export the program directory to the shell environment.
setenv("MY_PROGRAM_DIR", MY_PROGRAM_DIR)
-- Prepend the program's bin directory to the shell environment PATH variable.
prepend_path("PATH", MY_PROGRAM_DIR .. "/bin")
Much like using a .bashrc file with the export command, we can add the required variables and directives to a custom module file. For example, a module file called CustomModule and saved in $HOME/modules/ may look something like:
------------------------------------------------------------------------------------------------
/users/my_username/software/installs/my_new_program.lua:
------------------------------------------------------------------------------------------------
-- Provide help text for the module.
help([[
Description
===========
Makes my newly installed program available.
More information
================
- Homepage: https://www.my-new-programme.com
]])
-- Describe the module.
whatis("Description: Makes my newly installed program available.")
whatis("Homepage: https://www.my-new-programme.com")
whatis("URL: https://www.my-new-programme.com")
-- Specify a conflicting module.
conflict("CustomModule")
-- Load any dependencies.
load("GCC/10.2")
load("CMake/3.18.4-GCCcore-10.2.0")
-- Set a program root directory Lua variable MY_PROGRAM_DIR to simplify prepend_path directives.
-- Reminder: setting an environment variable with setenv does not set the equivalent Lua variable!
-- Reminder: setting a Lua variable does not set the equivalent shell environment variable either!
-- Note no trailing slash is required for MY_PROGRAM_DIR as we are using a / on the prepend_path directives.
local MY_PROGRAM_DIR = "/users/my_username/software/installs/my_new_program"
setenv("MY_PROGRAM_DIR", MY_PROGRAM_DIR)
setenv("MY_SOFTWARE_LICENSE_PATH", "/users/my_username/software/licenses/mysoftware/license.lic")
-- Add directories to environment variables.
prepend_path("PATH", pathJoin(MY_PROGRAM_DIR, "bin"))
prepend_path("LIBRARY_PATH", pathJoin(MY_PROGRAM_DIR, "lib"))
prepend_path("LD_LIBRARY_PATH", pathJoin(MY_PROGRAM_DIR, "lib"))
prepend_path("PKG_CONFIG_PATH", pathJoin(MY_PROGRAM_DIR, "lib/pkgconfig"))
prepend_path("CMAKE_PREFIX_PATH", MY_PROGRAM_DIR)
You can generate a basic module file using the basic TCL directives to set variables, export these to your shell with setenv and prepend paths to your existing environment variables with prepend-path.
Warning
Module files are not aware of bash shell variables unless you import them from the env array and set a TCL variable based on them.
e.g. the following example imports our shell environment HOME variable and sets the TCL HOME variable with it. The TCL HOME variable is used to set the TCL variable MY_PROGRAM_DIR (the software's installation directory). The TCL MY_PROGRAM_DIR variable is then used to add the program's bin directory to your shell environment PATH variable with the prepend-path directive.
set HOME $::env(HOME)
set MY_PROGRAM_DIR $HOME/software/installs/my_new_program
prepend-path PATH $MY_PROGRAM_DIR/bin
Much like using a .bashrc file with the export command, we can add the required variables and directives to a custom module file. For example, a module file called CustomModule and saved in $HOME/modules/ may look something like:
#%Module1.0#####################################################################
##
## My newly installed program module file
##
## Module file logging - this is TUoS cluster specific!
source /usr/local/etc/module_logging.tcl
##
proc ModulesHelp { } {
puts stderr "Makes my newly installed program available."
}
module-whatis "Makes my newly installed program available."
## Load any dependencies
module load GCC/10.2
module load CMake/3.18.4-GCCcore-10.2.0
## Set a program root directory TCL variable MY_PROGRAM_DIR to simplify prepend-path directives.
## **Reminder** setting an environment variable with setenv does not set the equivalent TCL variable!
## **Reminder** setting a TCL variable does not set the equivalent shell environment variable either!
## Note no trailing slash is required for MY_PROGRAM_DIR as we are using a / on the prepend-path directives.
set MY_PROGRAM_DIR /home/my_username/software/installs/my_new_program
setenv MY_PROGRAM_DIR $MY_PROGRAM_DIR
setenv MY_SOFTWARE_LICENSE_PATH /home/my_username/software/licenses/mysoftware/license.lic
prepend-path PATH $MY_PROGRAM_DIR/bin
prepend-path LIBRARY_PATH $MY_PROGRAM_DIR/lib
prepend-path LD_LIBRARY_PATH $MY_PROGRAM_DIR/lib
prepend-path PKG_CONFIG_PATH $MY_PROGRAM_DIR/lib/pkgconfig
prepend-path CMAKE_PREFIX_PATH $MY_PROGRAM_DIR
Hint
If you get warnings about missing file paths please ensure the file path exists and/or you have not made a mistake when defining your TCL variables. (Remember the difference between the set and setenv directives and that one does not set the other.)
If the module use command (module use $HOME/modules) is applied in your .bashrc file you could now load this module by running:
$ module load CustomModule
And unload with:
$ module unload CustomModule
Module files make it easy to add many versions of the same software via duplication and simple editing, without the risk of permanently corrupting your shell environment. Further info on the modules system can be found on the modules page.
Why should I install from source?
- Further performance optimisations may be available for your chosen cluster / computer.
- Dependencies may not be available with the versions required for a binary installation.
- The version of the software you desire has no precompiled binaries available.
- Your machine architecture does not have any precompiled binaries available.
What alternative methods exist?
- Conda
- Pip