Quality control / Check FASTQ file for errors

Description

Checks a FASTQ file for common errors.

Parameters

None

Details

The tool runs a few simple tests to validate a FASTQ file:

  1. Check that the number of lines in file is divisible by four
  2. For each read check that the number of sequence letters and the number of quality values match
  3. Check for duplicated sequence identifier lines

Tests 1 and 2 are mainly meant to spot files that have been truncated, during transfer or otherwise. Test 3 can spot cases where FASTQ files have been concatenated and one or more files have been added more than once by mistake.
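
As an illustration only, the three tests could be sketched in Python as shown here. This is not the tool's actual implementation: the function name check_fastq_lines is hypothetical, and the sketch assumes that every record occupies exactly four lines (no sequences wrapped over several lines) and that the whole file fits in memory.

```python
from collections import Counter


def check_fastq_lines(lines):
    """Return a list of problem descriptions for the given FASTQ lines.

    Hypothetical helper: assumes four lines per record
    (identifier, sequence, separator, quality).
    """
    problems = []
    # Strip line endings so that lengths reflect the actual content.
    lines = [line.rstrip("\r\n") for line in lines]

    # Test 1: the number of lines must be divisible by four.
    if len(lines) % 4 != 0:
        problems.append(f"line count {len(lines)} is not divisible by four")

    identifiers = Counter()
    # Walk over complete four-line records only.
    complete = len(lines) - len(lines) % 4
    for start in range(0, complete, 4):
        header, sequence, _separator, quality = lines[start:start + 4]
        identifiers[header] += 1
        # Test 2: sequence and quality strings must have the same length.
        if len(sequence) != len(quality):
            problems.append(
                f"{header}: {len(sequence)} sequence letters "
                f"but {len(quality)} quality values"
            )

    # Test 3: every identifier line should occur only once.
    for header, count in identifiers.items():
        if count > 1:
            problems.append(f"{header}: identifier line appears {count} times")

    return problems
```

An empty list means that all three tests passed. A file could be checked with, for example, check_fastq_lines(open("reads.fastq")), where reads.fastq is a placeholder file name.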

Please note that the tool is not guaranteed to find all errors. It does not, for example, check for non-permitted characters in sequences and quality values. It is intended to be a simple and relatively fast tool to spot some common problems that can occur in file transfer or manipulation.

For more in-depth analysis, use one of the more comprehensive quality checking tools such as "Read quality with FastQC".

Output

A log file with details of any problems found is generated for each FASTQ file. If a file passes all tests, its log file is named *_PASS.log. If any of the tests fails, the log file is named *_FAIL.log.
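
The naming rule can be sketched as a small Python function. The function name log_file_name is hypothetical, and the sketch assumes that the * part of the log name is the name of the input file, which this page does not specify.

```python
from pathlib import Path


def log_file_name(fastq_path, problems):
    """Build a log file name from the input name and the test outcome.

    Assumption: the prefix is the input file name; the real tool
    may construct the prefix differently.
    """
    # Any reported problem marks the file as failed.
    status = "FAIL" if problems else "PASS"
    return f"{Path(fastq_path).name}_{status}.log"
```

Here problems is any collection of problem descriptions; an empty collection selects the PASS name.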

References

For details on the FASTQ file format, see e.g. the Wikipedia FASTQ entry.