Preprocessing / Trim reads with Trimmomatic
Description
This tool performs a variety of trimming tasks for Illumina paired end and single end data.
Parameters
- Adapter set (none, TruSeq2-SE.fa, TruSeq3-SE.fa, TruSeq2-PE.fa, TruSeq3-PE.fa, TruSeq3-PE-2.fa, NexteraPE-PE.fa) [none]
- Adapter clipping parameters (String) []
- Quality scale used in the fastq file (phred33, phred64) [phred33]
- Minimum quality to keep a leading base (Integer) []
- Minimum quality to keep a trailing base (Integer) []
- Number of bases to keep from the start (Integer) []
- Number of bases to remove from the start (Integer) []
- Sliding window trimming parameters (String) []
- Adaptive quality trimming parameters (String) []
- Minimum average quality of reads to keep (Integer) []
- Minimum length of reads to keep (Integer) []
- Write a log file (yes, no) [yes]
Details
Only the trimming steps with user-specified parameters are performed; steps whose parameters are left empty are skipped. The steps are performed in the following order:
- Adapter clipping (ILLUMINACLIP)
- Trim leading bases by quality (LEADING)
- Trim trailing bases by quality (TRAILING)
- Number of bases to keep from the start (CROP)
- Number of bases to remove from the start (HEADCROP)
- Sliding window trimming (SLIDINGWINDOW)
- Adaptive quality trimming (MAXINFO)
- Minimum average quality of reads to keep (AVGQUAL)
- Minimum length of reads to keep (MINLEN)
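The skipping and ordering rules above can be sketched in Python as follows. This is an illustration only, not the tool's actual implementation: the parameter keys (`adapter_clipping`, `leading`, and so on) and the function `build_steps` are hypothetical names, and the sketch assumes the adapter file and the clipping parameters have already been joined into a single value. The step keywords and the `KEYWORD:value` form follow the Trimmomatic command-line syntax.

```python
# Default step order described above. Each entry maps a hypothetical
# parameter key to the Trimmomatic step keyword.
STEP_ORDER = [
    ("adapter_clipping", "ILLUMINACLIP"),
    ("leading", "LEADING"),
    ("trailing", "TRAILING"),
    ("crop", "CROP"),
    ("headcrop", "HEADCROP"),
    ("sliding_window", "SLIDINGWINDOW"),
    ("maxinfo", "MAXINFO"),
    ("avgqual", "AVGQUAL"),
    ("minlen", "MINLEN"),
]


def build_steps(params):
    """Return step arguments in the default order, skipping empty parameters."""
    steps = []
    for key, keyword in STEP_ORDER:
        value = params.get(key)
        if value is None or str(value).strip() == "":
            continue  # empty parameter: the step is skipped
        steps.append(f"{keyword}:{value}")
    return steps


# The order of the keys in the input does not matter.
example = {
    "minlen": 36,
    "sliding_window": "4:15",
    "adapter_clipping": "TruSeq3-PE.fa:2:30:10",
    "leading": "",  # left empty, so LEADING is not run
}
print(build_steps(example))
# ['ILLUMINACLIP:TruSeq3-PE.fa:2:30:10', 'SLIDINGWINDOW:4:15', 'MINLEN:36']
```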
Adapter clipping is best performed first, as other clipping functions may remove parts of the adapter sequence and thus make adapters more difficult to find. You can use the adapter sets available in Chipster (the Trimmomatic basic set of adapters) or your own adapter file in FASTA (.fa) format.
Minimum length filtering should be done last, because any clipping performed after it may produce reads shorter than the specified minimum length. The other steps are less sensitive to the order in which they are performed, but if you wish to run the steps in an order that differs from the default, you have to run each step separately.
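A toy illustration of why the length filter belongs last, assuming reads are plain strings. The functions `headcrop` and `minlen` below are simplified stand-ins written for this example, not Trimmomatic code.

```python
def headcrop(reads, n):
    """Remove n bases from the start of each read (like HEADCROP)."""
    return [r[n:] for r in reads]


def minlen(reads, length):
    """Drop reads shorter than `length` (like MINLEN)."""
    return [r for r in reads if len(r) >= length]


reads = ["ACGTACGTAC", "ACGTAC", "ACG"]  # lengths 10, 6 and 3

# Length filter last: every surviving read is at least 5 bases long.
filter_last = minlen(headcrop(reads, 3), 5)

# Length filter first: the later crop shortens a read below the minimum.
filter_first = headcrop(minlen(reads, 5), 3)

print(filter_last)   # ['TACGTAC']
print(filter_first)  # ['TACGTAC', 'TAC']
```

In the second case the read `TAC` passed the length filter before cropping and ends up only 3 bases long in the output.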
For details on each trimming step and their parameters see the Trimmomatic manual.
The tool is based on the Trimmomatic package.
Output
The trimmed reads are in gzipped fastq format. Output files depend on whether the trimming was done on single end or paired end reads.
For single end reads:
- trimmed.fq.gz: Trimmed reads
For paired end reads:
- trimmed_reads1_paired.fq.gz: Trimmed reads from the first input file with a mate pair in the second input file.
- trimmed_reads1_unpaired.fq.gz: Trimmed reads from the first input file without a mate pair in the second input file.
- trimmed_reads2_paired.fq.gz: Trimmed reads from the second input file with a mate pair in the first input file.
- trimmed_reads2_unpaired.fq.gz: Trimmed reads from the second input file without a mate pair in the first input file.
Optionally:
- trimlog.txt: Trimming log file
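For orientation, the sketch below shows how the output files listed above could map onto a Trimmomatic command line. It is a hedged illustration, not the command Chipster actually runs: the function `trimmomatic_command` and the jar name `trimmomatic.jar` are assumptions, while the `SE`/`PE` modes, the `-phred33`/`-phred64` and `-trimlog` options, and the order of paired end outputs (paired then unpaired, for each input) follow the Trimmomatic command-line usage.

```python
def trimmomatic_command(inputs, steps, phred="phred33", write_log=True,
                        jar="trimmomatic.jar"):
    """Assemble a Trimmomatic command line as a list of arguments.

    `inputs` holds one fastq path for single end data or two for paired end.
    The jar name is a placeholder; real installations usually carry a version.
    """
    if len(inputs) == 1:
        mode = "SE"
        outputs = ["trimmed.fq.gz"]
    elif len(inputs) == 2:
        mode = "PE"
        outputs = [
            "trimmed_reads1_paired.fq.gz",
            "trimmed_reads1_unpaired.fq.gz",
            "trimmed_reads2_paired.fq.gz",
            "trimmed_reads2_unpaired.fq.gz",
        ]
    else:
        raise ValueError("expected one (SE) or two (PE) input files")

    cmd = ["java", "-jar", jar, mode, f"-{phred}"]
    if write_log:
        cmd += ["-trimlog", "trimlog.txt"]
    return cmd + list(inputs) + outputs + list(steps)


pe_cmd = trimmomatic_command(["reads1.fq.gz", "reads2.fq.gz"], ["MINLEN:36"])
print(" ".join(pe_cmd))
```

The input file names `reads1.fq.gz` and `reads2.fq.gz` are placeholders.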
References
This tool uses the Trimmomatic package. Please cite the article:
Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.
Please see the Trimmomatic homepage for more details.