Bowtie for single-end reads

Description

This tool aligns single-end reads to publicly available genomes or transcriptomes. If you would like us to add new reference genomes to Chipster, please contact us. If you would like to align reads against your own datasets, please use the tool "Bowtie for single end reads and own genomes".

Parameters

Genome or transcriptome (human genome hg19, mouse genome mm9, rat genome rn4, mouse miRBase17) [mouse genome mm9]
Number of mismatches allowed (0, 1, 2, 3) [2]
Consider mismatches only in the seed region (yes, no) [no]
Length of the seed region (5-50) [28]
Allowed total of mismatch qualities (10-100) [70]
Quality value format used (Sanger, Illumina GA v1.3 or later) [Sanger]
How many best category hits is a read allowed to have (1, 2, no limit) [no limit]
How many valid alignments are reported per read (1, 2, 3) [1]
Put multireads to a separate file (yes, no) [no]
Put unaligned reads to a separate file (yes, no) [no]

Details

Bowtie aligns reads to a reference sequence such as a genome or transcriptome. There are two modes: mismatches are considered either throughout the read (the so-called v-mode when running Bowtie on the command line), or only in the user-defined seed region at the high-quality left end of the read (n-mode in command-line Bowtie). In the latter case, the base quality values at all mismatch positions are also taken into account.
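The choice between the two modes can be sketched as a mapping from the Chipster parameters to Bowtie's command-line options. This is only an illustration: the function name and its arguments are hypothetical, and Chipster's own wrapper may build the command differently. The options -v, -n, -l and -e are the ones documented for command-line Bowtie.

```python
def bowtie_mode_args(mismatches, seed_only, seed_length=28, max_quality_sum=70):
    """Return the Bowtie options selecting v-mode or n-mode (hypothetical helper)."""
    if mismatches not in (0, 1, 2, 3):
        raise ValueError("Bowtie allows 0-3 mismatches")
    if seed_only:
        # n-mode: mismatches are counted in the seed region only.
        # -l sets the seed length, -e caps the summed quality values
        # at all mismatch positions of the read.
        return ["-n", str(mismatches),
                "-l", str(seed_length),
                "-e", str(max_quality_sum)]
    # v-mode: mismatches are counted along the whole read and
    # quality values are ignored.
    return ["-v", str(mismatches)]


if __name__ == "__main__":
    print(bowtie_mode_args(2, seed_only=False))
    print(bowtie_mode_args(2, seed_only=True))
```

The defaults used here (28 and 70) mirror the default values in the parameter list above.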

The Chipster implementation of Bowtie always uses the --best and --strata options. The latter refers to the alignment's quality category (stratum), which is defined by the number of mismatches along the read (or in the seed region). Running Bowtie with these two options has the following consequences: Bowtie is guaranteed to report the best alignment(s), rather than whichever alignment it happens to find first. If you allow the read to have mismatches, only the alignments with the fewest mismatches are reported. Note that even if you allow the read to have only 1 best category hit, it might still have other alignments which contain more mismatches (and are hence in lower categories).
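The effect of reporting by stratum can be illustrated with a small model. The sketch below is not Bowtie's algorithm, only a simplified picture of the reporting rule for one read, assuming each valid alignment is described solely by its mismatch count. The function and parameter names are hypothetical.

```python
def report_alignments(mismatch_counts, max_best_hits=None, max_reported=1):
    """Simplified model of --best --strata reporting for a single read.

    mismatch_counts: one mismatch count per valid alignment of the read.
    max_best_hits:   allowed number of best category hits (None = no limit).
    max_reported:    how many valid alignments are reported per read.
    Returns the mismatch counts of the reported alignments.
    """
    if not mismatch_counts:
        return []  # the read did not align at all
    best = min(mismatch_counts)
    # Only alignments in the best stratum (fewest mismatches) are reportable.
    best_stratum = [m for m in mismatch_counts if m == best]
    if max_best_hits is not None and len(best_stratum) > max_best_hits:
        return []  # too many best category hits: the read is a multiread
    return best_stratum[:max_reported]


if __name__ == "__main__":
    # One perfect hit plus two hits with one mismatch each: the read passes
    # a limit of 1 best category hit, although lower-category hits exist.
    print(report_alignments([0, 1, 1], max_best_hits=1))
    # Two equally good hits: suppressed when only 1 best hit is allowed.
    print(report_alignments([1, 1, 2], max_best_hits=1))
```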

If a read has more reportable alignments than allowed, the user can request that these multireads be put in a separate FASTQ file for further inspection. Similarly, the user can request that reads which don't align be put in a separate FASTQ file.

After running Bowtie, Chipster converts the alignment file to BAM format, and sorts and indexes it using the SAMtools package. This way the results are ready to be visualized in the genome browser.
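The conversion, sorting and indexing steps can be sketched with the SAMtools command-line tools. The file names are hypothetical, the syntax assumes SAMtools 1.x, and the exact commands Chipster runs may differ from this outline.

```python
import subprocess


def bam_conversion_commands(sam_path, bam_path, unsorted_path="unsorted.bam"):
    """Return the SAMtools commands for SAM -> sorted, indexed BAM."""
    return [
        # Convert the SAM alignment to BAM.
        ["samtools", "view", "-b", "-o", unsorted_path, sam_path],
        # Sort the BAM file by chromosomal coordinates.
        ["samtools", "sort", "-o", bam_path, unsorted_path],
        # Index the sorted BAM; this writes bam_path + ".bai".
        ["samtools", "index", bam_path],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in bam_conversion_commands("alignment.sam", "alignment.bam"):
        print(" ".join(cmd))
```

Calling run_all on the returned list would execute the steps, provided that samtools is installed and the input file exists.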

Output

This tool returns the alignment in BAM format and an index file for it (.bai). It also produces a log file, which shows what percentage of the reads aligned with the selected parameter settings. Optionally, FASTQ files are also produced for the unaligned reads and for reads that align to multiple locations.
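If you want to collect the percentages from many log files, they can be extracted with a short script. The sketch below assumes that the log contains Bowtie's usual summary lines of the form "# reads ...: N (P%)"; the sample text and its numbers are made up for illustration, and the log produced by Chipster may contain additional lines.

```python
import re

# Made-up example of Bowtie summary lines (assumed format).
SAMPLE_LOG = """\
# reads processed: 1000
# reads with at least one reported alignment: 800 (80.00%)
# reads that failed to align: 150 (15.00%)
# reads with alignments suppressed due to -m: 50 (5.00%)
Reported 800 alignments to 1 output stream(s)
"""

# Matches summary lines that end with a count and a percentage.
SUMMARY_LINE = re.compile(
    r"^# reads (.+?): (\d+) \((\d+(?:\.\d+)?)%\)$", re.MULTILINE
)


def alignment_percentages(log_text):
    """Map each summary label to its percentage of the processed reads."""
    return {m.group(1): float(m.group(3))
            for m in SUMMARY_LINE.finditer(log_text)}


if __name__ == "__main__":
    for label, percent in alignment_percentages(SAMPLE_LOG).items():
        print(f"{percent:6.2f}%  reads {label}")
```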

Reference

Langmead et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome (2009) Genome Biology 10:R25