BWA MEM for single or paired end reads
This tool aligns single-end or paired-end reads to a selected reference genome using the BWA MEM algorithm.
The reads have to be supplied in FASTQ format. If just one reads file is given, single-end analysis is performed.
If two reads files are given, paired-end analysis is performed.
- Organism: The genome that you would like to align your reads against.
- Minimum seed length: Matches shorter than this will be missed
when looking for maximal exact matches (MEMs) in the first alignment phase (BWA MEM option -k).
- Maximum gap length: Gaps longer than this will not be found. Note that the scoring matrix and hit
length also affect the maximum gap length, in addition to this band width parameter (BWA MEM option -w).
- Match score: Score for a matching base (BWA MEM option -A).
- Mismatch penalty: Penalty for a mismatching base (BWA MEM option -B).
- Gap opening penalty: Penalty for opening a gap (BWA MEM option -O).
- Gap extension penalty: Penalty for extending a gap (BWA MEM option -E).
- Penalty for end clipping: Penalty for 5'- and 3'-end clipping. When performing the Smith-Waterman (SW)
extension of the seed alignments, BWA MEM keeps track of the best score reaching the end of the read.
If this score is larger than the best SW score minus the clipping penalty, clipping will not be applied (BWA MEM option -L).
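The mapping from the parameters above to BWA MEM command-line options can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the class and function names are hypothetical, and the default values are assumed to match BWA MEM's documented defaults, so check them against your BWA version.

```python
from dataclasses import dataclass


@dataclass
class BwaMemParams:
    """Hypothetical container for the alignment parameters listed above.

    Defaults are assumed to mirror BWA MEM's documented defaults.
    """
    min_seed_length: int = 19        # -k
    band_width: int = 100            # -w (maximum gap length)
    match_score: int = 1             # -A
    mismatch_penalty: int = 4        # -B
    gap_open_penalty: int = 6        # -O
    gap_extension_penalty: int = 1   # -E
    clipping_penalty: int = 5        # -L


def build_bwa_mem_command(index_prefix, reads1, reads2=None, params=None):
    """Return a `bwa mem` argument list for one reads file or one file pair."""
    params = params or BwaMemParams()
    cmd = [
        "bwa", "mem",
        "-k", str(params.min_seed_length),
        "-w", str(params.band_width),
        "-A", str(params.match_score),
        "-B", str(params.mismatch_penalty),
        "-O", str(params.gap_open_penalty),
        "-E", str(params.gap_extension_penalty),
        "-L", str(params.clipping_penalty),
        index_prefix,
        reads1,
    ]
    # A second reads file switches BWA MEM to paired-end alignment.
    if reads2 is not None:
        cmd.append(reads2)
    return cmd
```

The function only builds the argument list; it could be passed to subprocess.run once a BWA index for the chosen genome exists.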
Read group parameters
- Read group identifier: If you want to add the read group line to the BAM file,
you have to define the read group identifier (ID:value).
- Sample name for read group: The name of the sample sequenced in this read group (SM:value).
- Platform for read group: With this setting you can define the platform or technology used to produce
the reads. Options: ILLUMINA, SOLID, LS454, HELICOS, PACBIO (PL:value).
- Library identifier for read group: DNA preparation library identifier. The Mark Duplicates
tool uses this field to determine which read groups might contain molecular duplicates, in case the
same DNA library was sequenced on multiple lanes (LB:value).
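As an illustration of how these fields combine, the sketch below assembles a read group header line of the kind BWA MEM accepts through its -R option. The function name and the validation rules are hypothetical. The platform list is the one given above, and the literal two-character sequence \t is used as the separator on the assumption that BWA converts it to a tab in the output.

```python
# Platform values offered by the tool, as listed above.
ALLOWED_PLATFORMS = {"ILLUMINA", "SOLID", "LS454", "HELICOS", "PACBIO"}


def build_read_group(rg_id, sample=None, platform=None, library=None):
    """Return an @RG line using literal '\\t' separators (hypothetical helper)."""
    if not rg_id:
        # Without an identifier no read group line can be written.
        raise ValueError("read group identifier (ID) is required")
    fields = ["ID:" + rg_id]
    if sample:
        fields.append("SM:" + sample)
    if platform:
        if platform not in ALLOWED_PLATFORMS:
            raise ValueError("unsupported platform: " + platform)
        fields.append("PL:" + platform)
    if library:
        fields.append("LB:" + library)
    # "\\t" is a backslash followed by 't', not a real tab character.
    return "@RG\\t" + "\\t".join(fields)
```

For example, build_read_group("rg1", "sampleA", "ILLUMINA", "lib1") yields a string that could be passed as the value of -R.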
More information: BWA manual
It is possible to give the tool more than one FASTQ file/file pair. The tool will run the alignment for each
file/file pair separately, and finally merge the resulting BAM files.
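The align-then-merge workflow described above could be planned roughly as in the sketch below. It is not the tool's actual implementation: the function name and the intermediate file names are made up for illustration, and it assumes a BWA version that supports the -o output option as well as samtools for sorting, merging and indexing.

```python
def plan_alignment_commands(index_prefix, read_sets, merged_bam="merged.bam"):
    """Return the commands to align each file or file pair and merge the results.

    read_sets is a list of tuples: (reads,) for single-end data or
    (reads1, reads2) for paired-end data. File names are illustrative.
    """
    commands = []
    sorted_bams = []
    for i, reads in enumerate(read_sets, start=1):
        sam = "aln_%d.sam" % i
        bam = "aln_%d.sorted.bam" % i
        # Align one file or file pair on its own.
        commands.append(["bwa", "mem", "-o", sam, index_prefix, *reads])
        # Sort each alignment so that the merged output stays sorted.
        commands.append(["samtools", "sort", "-o", bam, sam])
        sorted_bams.append(bam)
    # Merge the per-pair BAM files and index the result.
    commands.append(["samtools", "merge", merged_bam, *sorted_bams])
    commands.append(["samtools", "index", merged_bam])
    return commands
```

Each returned entry is an argument list, so the plan can be inspected or logged before anything is executed.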
If you provide two FASTQ files, the tool will by default perform a paired-end alignment with them. It will try to assign
the R1 and R2 reads correctly based on the file names.
If you have more than two FASTQ files (or wish to perform single-end alignment for two files), you will
need to provide the FASTQ file names as a list in a text file: one list for single-end alignment, or two lists for
paired-end alignment (one for the read1 files and another for the read2 files,
e.g. R1files.txt and R2files.txt). These lists can be generated with the tool
Utilities / Make a list of file names.
The read pairs must be ordered identically in both lists.
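A quick way to check the ordering before running the tool is sketched below. It assumes a naming convention that the tool itself may not require: mate files differ only by an R1 or R2 token delimited by non-alphanumeric characters (for example sampleA_R1.fastq and sampleA_R2.fastq). The function names are hypothetical.

```python
import re


def read_name_list(text):
    """Split the contents of a list file into non-empty file names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def _mask_mate(name, mate):
    # Replace a standalone R1 or R2 token so that mate names become comparable.
    return re.sub(r"(?<![A-Za-z0-9])R%d(?![A-Za-z0-9])" % mate, "R?", name)


def pair_fastq_lists(r1_names, r2_names):
    """Pair the two lists position by position, raising ValueError on a mismatch."""
    if len(r1_names) != len(r2_names):
        raise ValueError("the two lists have different lengths")
    pairs = []
    for r1, r2 in zip(r1_names, r2_names):
        # Mates must be different files whose names agree once the token is masked.
        if r1 == r2 or _mask_mate(r1, 1) != _mask_mate(r2, 2):
            raise ValueError("files do not look like a pair: %s, %s" % (r1, r2))
        pairs.append((r1, r2))
    return pairs
```

Passing the contents of R1files.txt and R2files.txt through read_name_list and then pair_fastq_lists raises an error as soon as a position holds files from different samples.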
To run the tool, select the list file or files (R1files.txt and R2files.txt) and ALL the FASTQ files, and assign the list files correctly.
When the list files are assigned, they are automatically inactivated in the "reads" file list.
As a result, the tool returns a sorted and indexed BAM file, which is ready for viewing in the Chipster genome browser.