SAMtools

SAM (Sequence Alignment/Map) is a generic format for storing large nucleotide sequence alignments. SAM aims to be a format that:

  • is flexible enough to store all the alignment information generated by various alignment programs

  • is simple enough to be easily generated by alignment programs or converted from existing alignment formats

  • is compact in file size

  • allows most operations on the alignment to work on a stream without loading the whole alignment into memory

  • allows the file to be indexed by genomic position to efficiently retrieve all reads aligning to a locus
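
To make the "simple" and "stream" properties above concrete, the following minimal sketch writes a tiny SAM file by hand and processes it one line at a time with awk, without needing SAMtools. The read names, sequences and positions are made-up illustration data, and the file and variable names are hypothetical.

```shell
# Build a tiny, hand-written SAM file (made-up data, for illustration only).
# Header lines start with '@'; each alignment line has 11 mandatory
# tab-separated fields:
# QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
workdir=$(mktemp -d)
tiny="$workdir/tiny.sam"
{
  printf '@HD\tVN:1.6\tSO:coordinate\n'
  printf '@SQ\tSN:1\tLN:1000\n'
  printf 'read1\t0\t1\t101\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n'
  printf 'read2\t16\t1\t205\t60\t8M\t*\t0\t0\tTTGCATGC\tIIIIIIII\n'
} > "$tiny"

# Stream the file line by line: skip header lines, then print the
# read name, reference name and leftmost mapping position.
summary=$(awk -F'\t' '!/^@/ { print $1, $3, $4 }' "$tiny")
echo "$summary"
```

Because each alignment is a single self-contained line, a filter like this never needs to hold more than one record in memory.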

SAMtools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
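
As a rough sketch of how these utilities are typically invoked, the commands below sort, index, optionally merge and produce per-position (pileup) output. The file names sample.bam and other.sorted.bam are hypothetical placeholders, and the whole block is skipped if samtools or the input file is not present.

```shell
in_bam="sample.bam"            # hypothetical input BAM file
extra_bam="other.sorted.bam"   # hypothetical second, already sorted BAM file

if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
  # Sort alignments by genomic coordinate.
  samtools sort -o sample.sorted.bam "$in_bam"
  # Index the sorted file so reads at a locus can be retrieved quickly.
  samtools index sample.sorted.bam
  # Merge with a second sorted file, if one is available (-f overwrites output).
  if [ -f "$extra_bam" ]; then
    samtools merge -f merged.bam sample.sorted.bam "$extra_bam"
  fi
  # Show the first few lines of the per-position (pileup) output.
  samtools mpileup sample.sorted.bam | head -n 5
  utilities_status="ran"
else
  echo "samtools or $in_bam not found; skipping" >&2
  utilities_status="skipped"
fi
```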

Usage

SAMtools can be activated using the module file:

module load SAMtools/1.9-foss-2018b

Note: The module file also loads the EasyBuild foss-2018b compiler toolchain (including GCC 7.3.0).
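
One way to confirm that the module has taken effect is to check that samtools is on the PATH and ask it for its version. This is only a sketch: it assumes the `module` command is defined in the current shell (as it normally is in a cluster session), and the variable name samtools_status is hypothetical.

```shell
# Load the module only where the `module` command exists (i.e. on the cluster).
if command -v module >/dev/null 2>&1; then
  module load SAMtools/1.9-foss-2018b
fi

# Check that the samtools executable is now available and report its version.
if command -v samtools >/dev/null 2>&1; then
  samtools --version | head -n 1
  samtools_status="available"
else
  echo "samtools not found on PATH; has the module been loaded?" >&2
  samtools_status="missing"
fi
```

With this module loaded, the first line of the version output would be expected to name release 1.9.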

Test

The installation can be tested by following the tutorial provided at http://quinlanlab.org/tutorials/samtools/samtools.html :

$ cd ~
$ mkdir samtools-demo
$ cd samtools-demo
$ curl https://s3.amazonaws.com/samtools-tutorial/sample.sam.gz > sample.sam.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  371M  100  371M    0     0  29.3M      0  0:00:12  0:00:12 --:--:-- 33.2M
$ gzip -d sample.sam.gz
$ samtools view -S -b sample.sam > sample.bam
$ samtools view sample.bam | head
HWI-ST354R:351:C0UPMACXX:5:1115:20112:49057   99      1       861268  60      100M    =       861543
375   TCCCTCACAGGGTCTGCCTCGGCTCTGCTCGCAGGGAAAAGTCTGAAGACGCTTATGTCCAAGGGGATCCTGCAGGTGCATCCTCCGATCTGCGACTGCC
CCCFFFFFHHHGFHJIIJJJJJIJJJJJJJJIIJJIIJJIGCHCHGGIGIIJIJGHGFFFFFFFDD@BDCCCDDDDDCDDECC@C9<@BBDDDDDDD59>
MC:Z:100M     MD:Z:100        RG:Z:1719PC0017_51      NM:i:0  MQ:i:60 AS:i:100        XS:i:0
$ # Further steps truncated.

Installation notes

SAMtools was compiled using EasyBuild. The generated module file is /usr/local/modulefiles/live/eb/all/SAMtools/1.9-foss-2018b. The installation was tested by following the tutorial above.