Sequence similarity search with DIAMOND

Description

DIAMOND is a sequence similarity search tool for protein sequences. It performs a BLAST-like sequence database search that is nearly as sensitive as BLAST but significantly faster. In Chipster, this tool can be used to search for matching sequences in the SwissProt, TrEMBL and NR databases, or in a protein sequence file provided by the user.

Parameters

  • Maximum number of hits: Maximum number of hits to report for one query sequence.
  • e-value: E-value limit for selecting significant hits.
  • Output format: BLAST pairwise, BLAST XML, BLAST tabular, DIAMOND alignment archive or SAM.
  • Report unaligned sequences: Create a sequence file containing the query sequences that did not have any matches.
  • Search mode: The fast mode was mainly designed for short reads. For longer sequences, the sensitive or more-sensitive mode is recommended.
  • Matrix: Scoring matrix to be used.
  • Collect a log file
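
For orientation, the parameters above correspond roughly to options of the DIAMOND command-line program. The Python sketch below assembles a diamond blastp command from such settings. The function name build_diamond_command, the file names and the default values are hypothetical, and the way Chipster actually invokes DIAMOND is an assumption. The sensitivity flags in particular depend on the installed DIAMOND version.

```python
# Output format names from the parameter list, mapped to DIAMOND --outfmt codes.
OUTFMT_CODES = {
    "BLAST pairwise": "0",
    "BLAST XML": "5",
    "BLAST tabular": "6",
    "DIAMOND alignment archive": "100",
    "SAM": "101",
}

# Search modes mapped to sensitivity flags.
# Assumption: "fast" adds no flag, which matches older DIAMOND releases where
# the fast mode is the default; newer releases also accept an explicit --fast.
MODE_FLAGS = {
    "fast": None,
    "sensitive": "--sensitive",
    "more-sensitive": "--more-sensitive",
}


def build_diamond_command(query, database, output, max_hits=25, evalue=0.001,
                          outfmt="BLAST tabular", mode="sensitive",
                          matrix="BLOSUM62", unaligned=None):
    """Return an argument list for a protein search (hypothetical helper)."""
    cmd = [
        "diamond", "blastp",
        "--query", query,                    # query protein sequences (FASTA)
        "--db", database,                    # DIAMOND-formatted database
        "--out", output,                     # result file
        "--max-target-seqs", str(max_hits),  # maximum number of hits
        "--evalue", str(evalue),             # e-value limit
        "--outfmt", OUTFMT_CODES[outfmt],    # output format code
        "--matrix", matrix,                  # scoring matrix
    ]
    flag = MODE_FLAGS[mode]
    if flag is not None:
        cmd.append(flag)
    if unaligned is not None:
        # Write query sequences without any match to a separate file.
        cmd += ["--un", unaligned]
    return cmd


if __name__ == "__main__":
    # Illustrative file names only.
    print(" ".join(build_diamond_command(
        "queries.fasta", "swissprot", "hits.tsv", unaligned="unaligned.fasta")))
```

On a system where DIAMOND is installed, the returned list could be passed to subprocess.run to start the search.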

Details

  • The runtime of the program does not scale linearly with the size of the query file: DIAMOND is much more efficient for large query files than for small ones.
  • Low complexity masking is applied to the query and reference sequences by default. Masked residues appear in the output as X.
  • A more detailed description of the DIAMOND search tool can be found on the DIAMOND GitHub page.
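
When the BLAST tabular output format is selected, each hit is reported as one tab-separated line. The Python sketch below reads such output and keeps only the hits at or below an e-value threshold. It assumes the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name read_hits and the two example records are made up for illustration.

```python
import io

# Column names of the standard 12-column BLAST tabular format (assumed layout).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def read_hits(handle, max_evalue=0.001):
    """Return hits (as dicts) whose e-value is at most max_evalue."""
    hits = []
    for line in handle:
        line = line.rstrip("\n")
        # Skip empty lines and comment lines.
        if not line or line.startswith("#"):
            continue
        hit = dict(zip(COLUMNS, line.split("\t")))
        # Convert the numeric fields used for filtering and reporting.
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits


if __name__ == "__main__":
    # Two invented records: one significant hit and one weak hit.
    example = io.StringIO(
        "query1\tsubject_A\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t400.0\n"
        "query1\tsubject_B\t30.0\t50\t35\t0\t1\t50\t10\t59\t5.0\t25.1\n"
    )
    for hit in read_hits(example):
        print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

In practice the StringIO object would be replaced by an open result file, for example open("hits.tsv").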