STAR

STAR (Spliced Transcripts Alignment to a Reference) is a software package for aligning RNA-seq reads to a reference genome using uncompressed suffix arrays. The latest STAR manual, which details the many available command-line arguments, can be found at: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf

A limited collection of STAR genomes is available from http://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/


Interactive usage

After connecting to Bessemer (see Establishing a SSH connection), start an interactive session with the srun --pty bash -i command.

The latest version of STAR (currently version 2.7.6a) is made available with the command:

$ module load STAR/2.7.6a-GCC-9.3.0

After this, STAR can be run from the terminal prompt. The available options can be listed using:

$ STAR --help
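
As a quick sanity check before starting any work, the sketch below confirms that the module has placed STAR on the PATH and reports its version. The function name star_status is hypothetical and introduced here for illustration only; it is not part of STAR or of the cluster tooling.

```shell
#!/bin/bash
# star_status is a hypothetical helper name, not part of STAR or the cluster tooling.
star_status() {
    if command -v STAR >/dev/null 2>&1; then
        # STAR is on the PATH: report its location and the version it prints.
        echo "STAR found at $(command -v STAR), version $(STAR --version)"
    else
        # The module has not been loaded in this shell session.
        echo "STAR not found: run 'module load STAR/2.7.6a-GCC-9.3.0' first"
    fi
}

star_status
```

If the second message appears inside an interactive session, the module load command above has probably not been run in that shell.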

Batch usage

The following is an example batch submission script, my_job.sh, that runs STAR to generate a genome index, with input files from https://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/Human/GRCh38_Ensembl99_sparseD3_sjdbOverhang99/. The script requests 4 cores on a single node, a runtime of 30 minutes and 24 GB of real memory in total (6 GB per core).

#!/bin/bash
#SBATCH --job-name=STAR_smp_test
# One node with 4 tasks; the task count is passed to STAR as its thread count below.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
# Total memory for the job in MB (6 GB per core).
#SBATCH --mem=24000
#SBATCH --output=output_STAR_smp_4
#SBATCH --time=00:30:00
#SBATCH --mail-user=a.person@sheffield.ac.uk
#SBATCH --mail-type=ALL

module load STAR/2.7.6a-GCC-9.3.0

# Generate the genome index in the current directory from the FASTA and GTF files.
STAR --runThreadN $SLURM_NTASKS --runMode genomeGenerate --genomeSAindexNbases 12 --genomeDir ./ \
--genomeFastaFiles Homo_sapiens.GRCh38.dna.primary_assembly.fa --sjdbOverhang 99 \
--sjdbGTFfile Homo_sapiens.GRCh38.99.gtf --limitGenomeGenerateRAM 15000000000 --genomeSAsparseD 3 \
--limitIObufferSize 50000000 --limitSjdbInsertNsj 383200

The job is submitted to the queue by typing:

$ sbatch my_job.sh
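
The resource figures in my_job.sh can be cross-checked with a little shell arithmetic. The sketch below copies the values from the script and treats 1 MB as 10^6 bytes, which is only a rough comparison; the variable names are illustrative and do not appear in the script itself.

```shell
#!/bin/bash
# Values copied from the #SBATCH directives and STAR options in my_job.sh.
mem_mb=24000                  # --mem=24000 (a bare number is read as MB)
cores=4                       # --ntasks-per-node=4 on a single node
star_ram_bytes=15000000000    # --limitGenomeGenerateRAM

# Memory available per core, and STAR's RAM limit expressed in MB
# (assumes 1 MB = 10^6 bytes for a rough comparison).
per_core_mb=$(( mem_mb / cores ))
star_ram_mb=$(( star_ram_bytes / 1000000 ))

# Memory left over for everything other than genome generation.
headroom_mb=$(( mem_mb - star_ram_mb ))

echo "Memory per core: ${per_core_mb} MB"
echo "STAR RAM limit:  ${star_ram_mb} MB"
echo "Headroom:        ${headroom_mb} MB"
```

If --limitGenomeGenerateRAM is raised for a larger genome, --mem likely needs to grow with it so that the headroom stays positive.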

Installation notes

Installation method

STAR was installed using EasyBuild 4.4.0. Build details can be found in /usr/local/packages/live/eb/STAR/2.7.6a-GCC-9.3.0/easybuild/


Testing

Testing was conducted by running the genome index generation job described in the batch usage section above.

The output logs should resemble: https://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/Human/GRCh38_Ensembl99_sparseD3_sjdbOverhang99/log
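
A rough way to check a finished run without reading the whole log is to search the Slurm output file for a completion message. The sketch below assumes that STAR writes a line containing "finished successfully" to standard output when it completes; that pattern is an assumption and can be overridden. The function name check_star_output is hypothetical, and the default file name is the --output value from my_job.sh.

```shell
#!/bin/bash
# check_star_output is a hypothetical helper, not part of STAR.
# $1: file to inspect (defaults to the --output file named in my_job.sh)
# $2: completion pattern (an assumption about STAR's final message)
check_star_output() {
    local logfile="${1:-output_STAR_smp_4}"
    local pattern="${2:-finished successfully}"

    if [ ! -f "$logfile" ]; then
        echo "missing: $logfile"
        return 2
    fi

    if grep -q -- "$pattern" "$logfile"; then
        echo "ok: $logfile"
    else
        echo "incomplete: $logfile"
        return 1
    fi
}
```

Once the job has ended, running check_star_output from the submission directory prints "ok" if the pattern is found, "incomplete" if the file exists but lacks it, and "missing" if the file is absent.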


Modulefiles

The module file is on the system at /usr/local/modulefiles/live/eb/all/STAR/2.7.6a-GCC-9.3.0.