VirusDetect with own genome

Description

This tool runs the VirusDetect pipeline, which identifies viruses from small RNA (sRNA) sequencing data. Given a FASTQ file, it performs both de novo assembly and reference-guided assembly, the latter by aligning the sRNA reads to a reference database of known viruses. The assembled contigs are then compared to the reference virus sequences, first with BLASTN and then with BLASTX. Virus assignments are selected based on the three cutoff parameters described below.

A more detailed description of the VirusDetect pipeline can be found on the VirusDetect home page.

If possible, sequences originating from the host genome should be removed from the reads. This is done by mapping the reads against the host genome and keeping only the reads that did not match. The assembled contigs are likewise aligned to the host genome, and those that match are removed.

If your host genome is not available in Chipster but you have the host genome sequence as a FASTA file, you can use this tool to calculate the required BWA indexes automatically and perform the host genome filtering for your data. Please note that calculating BWA indexes for a large genome may take several hours. When the VirusDetect analysis is finished, the BWA indexes of the host genome are returned as a single tar archive file. In subsequent VirusDetect and BWA jobs, you can use this archive as the input file instead of the FASTA-formatted genome, so that you don't have to repeat the time-consuming indexing step.
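The idea behind host filtering can be illustrated with a minimal Python sketch. It is not the code that VirusDetect or Chipster runs. It assumes that the reads have already been aligned to the host genome and that the alignments are available as SAM text, and it keeps only the reads whose SAM flag has the "unmapped" bit (0x4) set. The function name and the example records are hypothetical.

```python
from typing import Iterable, Iterator

UNMAPPED_FLAG = 0x4  # SAM flag bit meaning "segment unmapped"


def unmapped_reads_as_fastq(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield a four-line FASTQ record for every unmapped read in SAM text."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & UNMAPPED_FLAG:
            # The read did not match the host genome, so it is kept.
            yield f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical example: read1 maps to the host, read2 does not.
quality = "I" * 21
example_sam = [
    "@SQ\tSN:host_chr1\tLN:1000\n",
    f"read1\t0\thost_chr1\t10\t37\t21M\t*\t0\t0\tACGTACGTACGTACGTACGTA\t{quality}\n",
    f"read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGGTTAACCGGTTAAC\t{quality}\n",
]
non_host_fastq = "".join(unmapped_reads_as_fastq(example_sam))
print(non_host_fastq, end="")
```

In this example only read2 is written out, because read1 aligned to the host sequence. The same principle applies to the contig filtering step, where contigs that align to the host genome are discarded.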

Input Files

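This tool requires two input files: a FASTQ file containing the sRNA reads, and the host genome, either as a FASTA file or as a tar archive of BWA indexes returned by an earlier run of this tool.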

Parameters

Output

VirusDetect produces a large number of result files. The output-related options select which of them are returned. By default, VirusDetect returns the following files:

If the parameter Return matching reference sequences is turned on, the following files are also returned:

If the parameter Return BAM formatted alignments is turned on, the following files are also returned:

Note: If you select the blastn_matching_references.fa and blastn_matches.bam files, you can visualize the BLAST results in the Chipster Genome Browser. There, blastn_matching_references.fa serves as the genome, and each reference virus sequence is listed in the Chromosome pull-down menu.
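To preview which entries the Chromosome menu is likely to list, you can read the sequence names from the FASTA headers. The Python sketch below assumes that the name of each sequence is the first whitespace-delimited word of its header line. How the Genome Browser derives the names is not documented here, so treat this as an approximation. The function name and the example headers are hypothetical.

```python
from typing import Iterable, List


def fasta_sequence_names(fasta_lines: Iterable[str]) -> List[str]:
    """Return the first word of every FASTA header line, in file order."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: the sequence name ends at the first whitespace.
            names.append(header.split()[0] if header else "")
    return names


# Hypothetical content standing in for blastn_matching_references.fa.
example_fasta = [
    ">virus_ref_1 example description\n",
    "ACGTACGTACGT\n",
    ">virus_ref_2\n",
    "TTGACCGGTTAA\n",
]
reference_names = fasta_sequence_names(example_fasta)
print(reference_names)
```

For a real file, pass the open file object instead of the example list, for instance `fasta_sequence_names(open("blastn_matching_references.fa"))`.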