Filtering / Filter reads for adapters, length and Ns
Description
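Given a FASTQ file, this tool filters reads based on whether they contain the given adapter sequence, and removes the adapter from the reads where it is found. Reads that are too short after clipping, or that contain Ns, can also be discarded.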
Parameters
- Adapter sequence that is used for filtering and that is subsequently removed (...) [CCTTAAGG]
- Minimum adapter alignment length (0-) [0]
- Discard sequences shorter than (1-) [15]
- Quality value format used (Sanger, Illumina GA v1.3-1.5) [Sanger]
- Discard sequences with Ns (yes, no) [yes]
- Output options (Keep only clipped reads, Keep only non-clipped reads, Keep both clipped and non-clipped reads) [Keep only clipped reads]
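The two quality value formats differ in how quality scores are encoded as characters: Sanger FASTQ files use an ASCII offset of 33, while Illumina GA v1.3-1.5 files use an offset of 64. The sketch below illustrates that difference only. The names QUALITY_OFFSETS and decode_qualities are hypothetical and are not part of this tool, and how the tool itself uses the setting internally is not described here.

```python
# Illustrative only: ASCII offsets of the two quality encodings listed above.
# QUALITY_OFFSETS and decode_qualities are hypothetical names, not part of the tool.
QUALITY_OFFSETS = {
    "Sanger": 33,                # Phred+33
    "Illumina GA v1.3-1.5": 64,  # Phred+64
}


def decode_qualities(qual, fmt="Sanger"):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    offset = QUALITY_OFFSETS[fmt]
    return [ord(ch) - offset for ch in qual]


# The same scores written in each encoding.
print(decode_qualities("II5!", "Sanger"))
print(decode_qualities("hhT@", "Illumina GA v1.3-1.5"))
```

Choosing the wrong format shifts every score by 31, so it is worth checking which platform and pipeline version produced the file.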
Details
Adapters are clipped from the 3' end. If the read is too short after clipping, it is discarded. Reads which contain unknown nucleotides (Ns) are
also discarded, unless that option is set to no.
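The logic described above can be sketched roughly as follows. This is a simplified illustration, not the tool's actual implementation: it uses exact string matching where the underlying clipper aligns the adapter, it assumes the N check is applied to the clipped read, and the function and argument names (clip_adapter, filter_read, min_overlap, min_length, discard_n, keep) are hypothetical stand-ins for the parameters listed earlier.

```python
# Simplified sketch of the filtering logic; exact matching only (an assumption).
# All function and argument names here are hypothetical.

def clip_adapter(seq, adapter, min_overlap=0):
    """Return (clipped_seq, was_clipped) for one read."""
    lowest = max(min_overlap, 1)  # at least one base must match
    # Case 1: the whole adapter occurs in the read; cut from its first base.
    if len(adapter) >= lowest:
        pos = seq.find(adapter)
        if pos != -1:
            return seq[:pos], True
    # Case 2: only the start of the adapter is present at the 3' end.
    for k in range(min(len(adapter) - 1, len(seq)), lowest - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:len(seq) - k], True
    return seq, False


def filter_read(seq, qual, adapter="CCTTAAGG", min_overlap=0,
                min_length=15, discard_n=True, keep="clipped"):
    """Return the (seq, qual) pair to write out, or None if discarded."""
    clipped, was_clipped = clip_adapter(seq, adapter, min_overlap)
    if keep == "clipped" and not was_clipped:
        return None
    if keep == "non-clipped" and was_clipped:
        return None
    if len(clipped) < min_length:
        return None
    # Assumption: Ns are checked after clipping.
    if discard_n and "N" in clipped:
        return None
    # Trim the quality string to the same length as the clipped sequence.
    return clipped, qual[:len(clipped)]


read = "ACGTACGTACGTACGTACGT" + "CCTTAAGG" + "TTTT"
print(filter_read(read, "I" * len(read)))
```

With a minimum adapter alignment length of 0, this sketch clips even a single matching base at the 3' end, which is one reason a larger value may be preferable in practice.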
Output
With the default settings, a FASTQ file containing the reads which contained the adapter sequence, which are long enough after clipping, and which don't contain Ns. Other output options change which reads are kept.
In addition, a log file is produced, reporting how many reads the input file contained and what percentage of them were discarded.
Reference
This tool is based on the FASTA/Q Clipper tool of the FASTX-Toolkit package.