Heatmap for RNA-seq

Description

Draws a heatmap and dendrogram of the genes of interest. As input, give the list of genes of interest and the original count file containing RAW COUNTS for all genes in all samples. The tool first normalizes or transforms the count values, depending on the selected transformation method.

Parameters

Transformation method (variance stabilizing transformation, regularized log transformation, no transformation - only DESeq2 normalization) [variance stabilizing transformation]
Annotation column (phenodata column used for labelling the samples in the plot)
Represent genes with (gene IDs, symbols) [gene IDs]
Image width
Image height

Details

This tool takes as input a table of raw counts and a list of genes of interest (.tsv format). The count table has to be associated with a phenodata file describing the experimental groups. These files are best created with the tool "Utilities / Define NGS experiment", which combines the count files of different samples into one table and creates a phenodata file for it. The list of genes of interest can be generated, for example, with a differential expression tool such as DESeq2 or edgeR, followed by a filtering tool (Utilities / Filter table by column value). You might want to limit the list to tens of genes.
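As an illustration of limiting the list to tens of genes, here is a minimal sketch in Python with pandas. It is not what the tool itself runs. The table contents, the function name top_genes and the cutoff values are hypothetical; the padj column name follows the DESeq2 result format.

```python
import pandas as pd

# Hypothetical differential expression results, gene IDs as row names.
de_results = pd.DataFrame(
    {
        "log2FoldChange": [2.5, -0.2, -3.1, 1.8, 0.1],
        "padj": [0.0004, 0.8, 0.00001, 0.03, None],
    },
    index=["geneA", "geneB", "geneC", "geneD", "geneE"],
)


def top_genes(results, max_genes=30, padj_cutoff=0.05):
    """Keep significant genes and return at most max_genes of them,
    ordered by adjusted p-value (smallest first)."""
    # Rows with a missing padj compare as False and are dropped here.
    significant = results[results["padj"] < padj_cutoff]
    return significant.sort_values("padj").head(max_genes)


# Assumed limit of 2 genes, only to keep the example small.
selected = top_genes(de_results, max_genes=2)
print(selected)
```

In practice the limit would be closer to the "tens of genes" suggested above; the row names of the resulting table serve as the gene list.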

The tool first transforms the raw read counts using the DESeq2 Bioconductor package. The input count file has to contain all the genes, not just the differentially expressed ones. Note that you can use the resulting transformed values only for visualization and clustering, not for differential expression analysis, which needs raw counts.
For more information about the transformation step, please see the manual for the Transform read counts tool.
The transformed values are then merged with the list of genes of interest, and the heatmap is drawn from the matching rows.
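The merging step described above can be sketched as follows. This is a hedged illustration in Python with pandas, not the tool's own implementation. All values and gene names are made up, and the transformed table stands in for the output of the DESeq2 transformation.

```python
import pandas as pd

# Hypothetical transformed values (genes x samples) for ALL genes.
transformed = pd.DataFrame(
    {"sample1": [5.1, 7.3, 2.0, 9.4], "sample2": [5.0, 8.1, 2.2, 9.0]},
    index=["geneA", "geneB", "geneC", "geneD"],
)

# Hypothetical list of genes of interest, gene IDs as row names.
genes_of_interest = pd.DataFrame({"padj": [0.001, 0.02]}, index=["geneD", "geneB"])

# Keep only the rows whose gene ID appears in the list of genes of interest.
# The row order of the transformed table is preserved.
heatmap_values = transformed[transformed.index.isin(genes_of_interest.index)]
print(heatmap_values)
```

Only the selected rows are passed on for plotting, which is why the count file must still contain all genes: the transformation is estimated on the full table before the subset is taken.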

You can tune the size of the heatmap with the image size parameters. The heatmap scales accordingly.
You can also choose which phenodata column is used to label the samples. It is usually best to enter short sample names in the description column and select that column.
Genes are shown as the row names of the heatmap. By default, the gene IDs are taken from the row names of the input file (list of genes of interest). If you wish to use gene symbols/names instead, make sure they are available as a column in the input file, and set the "Represent genes with" parameter accordingly. Note that if your gene IDs are Ensembl IDs, you can add the symbol column to the input table with the tool Utilities / Annotate Ensembl identifiers.
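The choice between gene IDs and symbols as row labels can be sketched like this, again in Python with pandas and only as an illustration. The function name row_labels and the column name "symbol" are hypothetical. Falling back to the gene ID when a symbol is missing is an assumption of this sketch, not documented behaviour of the tool.

```python
import pandas as pd


def row_labels(genes, represent_with="gene IDs", symbol_column="symbol"):
    """Return heatmap row labels for a gene table with IDs as row names."""
    if represent_with == "gene IDs":
        return [str(gene_id) for gene_id in genes.index]
    if symbol_column not in genes.columns:
        raise ValueError(f"No '{symbol_column}' column in the gene list")
    labels = []
    for gene_id, symbol in zip(genes.index, genes[symbol_column]):
        # Assumption: use the ID when the symbol is missing or empty.
        if pd.notna(symbol) and str(symbol).strip():
            labels.append(str(symbol))
        else:
            labels.append(str(gene_id))
    return labels


# Hypothetical gene list with one missing symbol.
genes = pd.DataFrame({"symbol": ["SYM1", None, "SYM3"]}, index=["id1", "id2", "id3"])

id_labels = row_labels(genes)
symbol_labels = row_labels(genes, represent_with="symbols")
print(id_labels, symbol_labels)
```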

Output

heatmap.pdf
transformed-counts.tsv

References

This tool uses the DESeq2 package. Please read the following article for more detailed information:

M Love, W Huber and S Anders: Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biol. 2014 15:550