SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. SAM aims to be a format that is

  • flexible enough to store all the alignment information generated by various alignment programs

  • simple enough to be easily generated by alignment programs or converted from existing alignment formats

  • compact in file size

  • able to support most operations on the alignment as a stream, without loading the whole alignment into memory

  • indexable by genomic position, so that all reads aligning to a locus can be retrieved efficiently
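
To illustrate the streaming point, here is a minimal sketch that reads SAM text one line at a time with awk. The records are made-up toy data, and the helper names make_toy_sam and summarise_sam are hypothetical; only the tab-delimited column order (QNAME, FLAG, RNAME, POS, MAPQ, ...) and the '@' header prefix come from the SAM format itself.

```shell
#!/bin/sh
# Toy SAM input: one header line and two made-up alignment records.
# Fields are tab-delimited, as in any SAM file.
make_toy_sam() {
    printf '@SQ\tSN:1\tLN:1000\n'
    printf 'read1\t0\t1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'read2\t16\t1\t20\t30\t4M\t*\t0\t0\tTTGA\tIIII\n'
}

# Process the stream record by record: skip header lines (starting with '@')
# and print read name, reference name, position and mapping quality.
summarise_sam() {
    awk -F'\t' '!/^@/ { print $1, $3, $4, $5 }'
}

make_toy_sam | summarise_sam
```

Because awk handles one record at a time, the same pattern should work on the output of samtools view for a file of any size.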

SAMtools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing, and generating alignments in a per-position format.
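
As a rough sketch of how these utilities fit together, the script below sorts, indexes, queries and piles up a BAM file. The file names (sample.bam, sample.pileup), the region 1:861000-862000 and the function names are placeholders chosen for illustration. The script does nothing if samtools or the input file is missing.

```shell
#!/bin/sh
# Sketch only: file names and the region below are assumptions.

# Coordinate-sort a BAM file and index it; prints the sorted file name.
sort_and_index() {
    sorted="${1%.bam}.sorted.bam"
    samtools sort -o "$sorted" "$1"   # sort by genomic coordinate
    samtools index "$sorted"          # writes the index next to the BAM
    printf '%s\n' "$sorted"
}

# Count the reads overlapping a region (needs a sorted, indexed BAM).
count_region() {
    samtools view -c "$1" "$2"
}

# Write the alignments in per-position (pileup) format to a file.
pileup_to_file() {
    samtools mpileup "$1" > "$2"
}

if command -v samtools >/dev/null 2>&1 && [ -f sample.bam ]; then
    sorted_bam=$(sort_and_index sample.bam)
    count_region "$sorted_bam" 1:861000-862000
    pileup_to_file "$sorted_bam" sample.pileup
    # Merging takes two or more sorted inputs, for example:
    #   samtools merge merged.bam first.sorted.bam second.sorted.bam
else
    echo "samtools or sample.bam not found; nothing to do"
fi
```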


SAMtools can be activated using the module file:

module load SAMtools/1.9-foss-2018b

Note: The module file also loads the EasyBuild foss-2018b compiler toolchain (including GCC 7.3.0).
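
One way to confirm that the module is active is to check which samtools is on the PATH and what version it reports. This is a small sketch, not part of the original instructions; the function name check_samtools is hypothetical.

```shell
#!/bin/sh
# Report the samtools executable found on PATH and its version line,
# or a hint if the module has not been loaded in this shell.
check_samtools() {
    if command -v samtools >/dev/null 2>&1; then
        command -v samtools
        samtools --version | head -n 1
    else
        echo "samtools not found on PATH; try: module load SAMtools/1.9-foss-2018b"
    fi
}

check_samtools
```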


Using the tutorial provided at :

$ cd ~
$ mkdir samtools-demo
$ cd samtools-demo
$ curl > sample.sam.gz
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
100  371M  100  371M    0     0  29.3M      0  0:00:12  0:00:12 --:--:-- 33.2M
$ gzip -d sample.sam.gz
$ samtools view -S -b sample.sam > sample.bam
$ samtools view sample.bam | head
HWI-ST354R:351:C0UPMACXX:5:1115:20112:49057   99      1       861268  60      100M    =       861543
MC:Z:100M     MD:Z:100        RG:Z:1719PC0017_51      NM:i:0  MQ:i:60 AS:i:100        XS:i:0
$ # Further steps truncated.
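
The second column of the record above is the FLAG field (99 here), a bitwise combination of properties defined by the SAM specification. The sketch below decodes such a value with shell arithmetic. The function name decode_flag and the lower-case labels are illustrative choices, not samtools output; samtools also offers its own flags subcommand for the same purpose.

```shell
#!/bin/sh
# Decode a SAM FLAG into a comma-separated list of the bits that are set.
# Names are listed in bit order: 0x1, 0x2, 0x4, ... 0x800.
decode_flag() {
    flag=$1
    bit=1
    out=""
    for name in paired proper_pair unmapped mate_unmapped reverse mate_reverse \
                first_in_pair second_in_pair secondary qc_fail duplicate supplementary; do
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    printf '%s\n' "$out"
}

decode_flag 99
# prints: paired,proper_pair,mate_reverse,first_in_pair
```

So the read shown is the first of a properly aligned pair, and its mate is aligned to the reverse strand.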

Installation notes

SAMtools was compiled using EasyBuild. The generated module file is /usr/local/modulefiles/live/eb/all/SAMtools/1.9-foss-2018b, and the installation was tested by following the tutorial above.