Bowtie for single-end reads and own genome

Description

This tool aligns single-end reads against your own genome or transcriptome, which you must supply in FASTA format. To align reads against the publicly available genomes supplied by Chipster, use the tool "Bowtie" instead.

Parameters

Details

Bowtie aligns reads to a reference sequence such as a genome or transcriptome. There are two modes: mismatches are considered either throughout the read (the so-called v-mode when running Bowtie on the command line), or only in the user-defined seed region at the high-quality left end of the read (n-mode in command-line Bowtie). In the latter case, the base quality values at all mismatch positions are also taken into account.
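The difference between the two modes can be sketched in a few lines of Python. This is a simplified illustration, not Bowtie's actual code: the function names are hypothetical, and the seed length (28) and quality-sum limit (70) are illustrative values. The n-mode check assumes that "taking quality values into account" means limiting the sum of the quality values at all mismatch positions. Command-line Bowtie may also round quality values before summing them, which this sketch ignores.

```python
def passes_v_mode(mismatch_positions, max_mismatches):
    """v-mode: count mismatches over the whole read; qualities are ignored."""
    return len(mismatch_positions) <= max_mismatches


def passes_n_mode(mismatch_positions, qualities, max_seed_mismatches,
                  seed_length, max_quality_sum):
    """n-mode: limit mismatches in the seed and the summed quality overall."""
    # Only mismatches inside the seed (left end of the read) are counted.
    seed_mismatches = sum(1 for pos in mismatch_positions if pos < seed_length)
    # Qualities are summed over ALL mismatch positions, not just the seed.
    quality_sum = sum(qualities[pos] for pos in mismatch_positions)
    return (seed_mismatches <= max_seed_mismatches
            and quality_sum <= max_quality_sum)


# A 36-base read with mismatches at 0-based positions 5, 30 and 33.
mismatches = [5, 30, 33]
quals = [30] * 28 + [10] * 8  # high quality in the seed, low at the right end

print(passes_v_mode(mismatches, max_mismatches=2))   # False: 3 mismatches in total
print(passes_n_mode(mismatches, quals, max_seed_mismatches=2,
                    seed_length=28, max_quality_sum=70))  # True: 1 seed mismatch, sum 50
```

The same read is rejected in v-mode but accepted in n-mode, because two of its three mismatches fall outside the seed and sit on low-quality bases.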

The Chipster implementation of Bowtie always uses the --best and --strata options. The latter refers to the alignment's quality category (stratum), which is defined by the number of mismatches along the read (or in the seed region). Running Bowtie with these two options has the following consequences: Bowtie is guaranteed to report the best alignment(s), as opposed to reporting whichever alignment it happens to find first. If you allow the read to have mismatches, only the alignments with the fewest mismatches are reported. Note that if you allow the read to have only 1 best-category hit, it might still have other alignments which contain more mismatches (and are hence in lower categories).
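The stratum logic described above can be illustrated with a minimal Python sketch. The function name, the tuple layout and the example coordinates are hypothetical, and the code only mimics the selection rule, not how Bowtie searches for alignments.

```python
def best_stratum(alignments):
    """Keep only the alignments with the fewest mismatches.

    Each alignment is a (location, mismatch_count) tuple.
    """
    if not alignments:
        return []
    fewest = min(mm for _, mm in alignments)
    return [(loc, mm) for loc, mm in alignments if mm == fewest]


# One read with three candidate alignments in three different strata.
hits = [("chr1:1000", 1), ("chr2:5000", 0), ("chr3:200", 2)]

reported = best_stratum(hits)
print(reported)       # [('chr2:5000', 0)]
print(len(reported))  # 1
```

Here the read has exactly one best-category hit, so it passes a limit of 1, even though it also aligns elsewhere with one and two mismatches.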

If a read has more reportable alignments than allowed, you can request that these multireads are put in a separate FASTQ file for further inspection. Similarly, you can request that reads which do not align are put in a separate FASTQ file.

After running Bowtie, Chipster converts the alignment file to BAM format, and sorts and indexes it using the SAMtools package.
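For readers who want to reproduce a comparable workflow on the command line, the following Python sketch assembles the commands without running them. It is an assumption about what an equivalent pipeline could look like, not Chipster's actual invocation: the function name, file names and the choice of v-mode with 2 mismatches are hypothetical, and the SAMtools syntax assumes version 1.3 or later.

```python
import shlex


def build_pipeline(reference_fasta, reads_fastq, prefix, mismatches=2):
    """Return the command lines for index building, alignment and BAM processing."""
    index = prefix + "_index"
    sam = prefix + ".sam"
    bam = prefix + ".bam"
    sorted_bam = prefix + ".sorted.bam"
    return [
        # Build a Bowtie index from the user's own FASTA reference.
        ["bowtie-build", reference_fasta, index],
        # Align in v-mode, reporting only best-stratum hits, with SAM output (-S).
        ["bowtie", "-v", str(mismatches), "--best", "--strata", "-S",
         index, reads_fastq, sam],
        # Convert SAM to BAM, then sort and index the BAM file.
        ["samtools", "view", "-b", "-o", bam, sam],
        ["samtools", "sort", "-o", sorted_bam, bam],
        ["samtools", "index", sorted_bam],
    ]


for command in build_pipeline("genome.fa", "reads.fq", "sample"):
    print(shlex.join(command))
```

Each command is kept as an argument list so that it could be passed to subprocess.run unchanged. The last step writes the .bai index next to the sorted BAM file.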

Output

This tool returns the alignment in BAM format and an index file for it (.bai). It also produces a log file, which shows what percentage of the reads align with the selected parameter settings. Optionally, FASTQ files are also produced for the unaligned reads and for the reads that align to multiple locations.

Reference

Langmead et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25.