Analysis tools and visualizations
Next generation sequencing (NGS):
Chipster genome browser
- View NGS reads in their genomic context using Ensembl annotations
- Zoom in to nucleotide level
- View automatically calculated coverage (total or strand-specific) as a line graph or density graph
- Highlight SNPs
- Use BED, VCF and GTF files to jump from one location to another
Quality control
- FastX
- FastQC
- PRINSEQ
- RSeQC
- MultiQC
Utilities
- SAMtools
- BEDTools
- FastX
- PRINSEQ
- TagCleaner
- Trimmomatic
- Picard
Alignment
- STAR
- HISAT2
- Minimap
- Bowtie2
- Bowtie
- BWA
- TopHat
RNA-seq
- Count reads per gene with HTSeq
- Count reads per transcript with eXpress
- Differential expression with edgeR, DESeq2, DEXSeq and Cuffdiff
- Assemble transcripts with Cufflinks
Single cell RNA-seq
miRNA-seq
- Differential expression with edgeR and DESeq2
- Pathway analysis for miRNA target genes
- Correlate with gene expression
Community analysis of amplicon sequencing data (16S rRNA)
Virus detection using small RNA-seq data
Variants
- SAMtools
- bcftools
- VCFtools
- Annotation with Bioconductor
ChIP-seq and FAIRE-seq
- Detect peaks using MACS
- Detect peaks using F-seq
- Filter peaks based on p-value, number of reads, etc.
- Find common sequence motifs and scan them against JASPAR
- Find common sequence motifs with Dimont
- Search sequences with a motif
- Retrieve genes nearest to the peaks
- Filter genes based on peak location and distance
- GO enrichment for the nearby genes
MeDIP-seq
- Methylated regions with MEDIPS
CNA-seq
- Count reads in bins, segment, and call copy number aberrations
- Plot copy number profiles
- Identify common regions
- Test for DNA copy number-induced differential expression
- Plot combined profiles of copy number and expression
- Plot copy number-induced gene expression
- Add cytogenetic bands
- Count overlapping CNVs by comparing to Database of Genomic Variants
Genomic region manipulations
- BEDTools
- In-house tools to find, fuse or remove overlapping regions
- In-house tool to combine region files
Microarrays and proteomics:
Normalization
- Affymetrix 3' expression
- mas5
- plier
- rma
- gcrma
- Li-Wong (dChip)
- vsn with mas5 and plier
- Affymetrix exon arrays
- Affymetrix SNP arrays
- cDNA / Agilent
- Background Correction
- none
- subtract
- Edwards
- normexp
- Within-chip
- Within-gene
- none
- scale
- quantile
- aquantile
- vsn
- Illumina
- none
- median
- quantile
- vsn
- rsn
- loess
- Removal of batch effects
Missing values
- Imputation
- Mean
- Median
- K-nearest neighbor
- Removal
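As an illustration of the mean and median imputation options listed above, here is a minimal sketch in Python. It is not Chipster's implementation: the function name `impute_row` and the use of `None` to mark a missing value are assumptions made for this example, and K-nearest neighbor imputation is not shown.

```python
from statistics import mean, median


def impute_row(values, method="mean"):
    """Fill missing entries (None) in one gene's expression row.

    Hypothetical helper: the fill value is the mean or median of the
    observed values in the same row.
    """
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to estimate from; return the row unchanged.
        return list(values)
    fill = mean(observed) if method == "mean" else median(observed)
    return [fill if v is None else v for v in values]


row = [2.0, None, 4.0, 9.0]
mean_imputed = impute_row(row, "mean")
median_imputed = impute_row(row, "median")
```

A row with no observed values cannot be imputed this way, which is one situation where the removal option applies instead.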
Quality control
- Affymetrix
- Average background
- Scaling factor
- 3'/5' ratio of QC probesets
- RNA degradation plot
- RLE plot
- NUSE plot
- cDNA
- histogram
- density plot
- MA plot
- Illumina
Filtering
- Flags (P/M/A)
- Expression value
- Standard deviation
- Coefficient of variation
- Interquartile range
- Free filtering by values in a column
- Descriptive statistics
- Fold change
- P-value
Search
- Similarly expressed genes by correlation
- Genes by common name
- Genes by identifier (Affymetrix ID, Agilent ProbeName, etc.)
- Chromosome location
Plot
- Volcano plot (GUI)
- Venn diagram (GUI)
- Histogram (GUI)
- Line graph, i.e., expression profile (GUI)
- 2D Scatter plot (GUI)
- 3D Scatter plot (GUI)
- Chip image (GUI)
- K-means clustering (GUI)
- Hierarchical clustering (GUI)
- Self-organizing maps (GUI)
- Chromosomal location of genes
- Chromosome specific idiogram
- RNA degradation plot
- Density plot
- MA plot
- RLE plot
- NUSE plot
- Boxplot
- Correlogram
- Gene set enrichment analysis
- Heatmap
- Annotated dendrogram
Annotate
- Annotate a gene list using EntrezGene, RefSeq, GO, KEGG, PubMed, and UniGene
- Annotate miRNA (target genes from miRBase, miRtarget2, PicTar, TarBase and TargetScan)
Statistical testing
- Single slide
- Noise (sd) envelope
- Newton method
- One group
- T-test
- Wilcoxon rank sum test
- Two groups
- Empirical Bayes
- T-test
- F-test
- Mann-Whitney U test
- SAM
- LPE
- ROTS
- Multiple groups
- Empirical Bayes
- ANOVA
- Kruskal-Wallis test
- SAM
- Linear modeling
- A maximum of three main effects and their interactions
- Technical replication (one level)
- Pairing (one level)
- Time series
- Periodically expressed genes
- Independent component analysis
- Association analysis for SNP data
- Checking the Hardy-Weinberg equilibrium
- Testing the genotype association
- Testing the dominant model for inheritance
- Testing the recessive model for inheritance
- Multiple testing correction
- None
- Bonferroni
- Holm
- Hochberg
- Benjamini and Hochberg
- Benjamini and Yekutieli
- Experimental design
- Estimation of sample size
- Estimation of power
- Estimation of fold change
- Correlation with phenodata
- Correlation of miRNA with target gene expression
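To show what three of the multiple testing corrections listed above do to a set of p-values, here is a minimal sketch in Python. The function names are chosen for this example and do not come from Chipster. The code follows the standard textbook definitions of the Bonferroni, Holm, and Benjamini and Hochberg adjustments; Hochberg and Benjamini and Yekutieli are omitted for brevity.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Step-down adjustment: the i-th smallest p-value is scaled by (m - i)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals):
    """Step-up FDR adjustment: the p-value of ascending rank r is scaled by m / r."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # rank in ascending order, from m down to 1
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(pvals)
holm_adj = holm(pvals)
bh = benjamini_hochberg(pvals)
```

Bonferroni and Holm control the family-wise error rate, while Benjamini and Hochberg controls the false discovery rate, so its adjusted values are never larger than Holm's on the same input.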
Clustering
- K-means
- Hierarchical
- Distances
- Euclidean
- Manhattan
- Pearson
- Spearman
- Methods
- Single linkage
- Average linkage
- Complete linkage
- Ward
- Resampling testing of the result
- Self-organizing map (SOM)
- Quality threshold (QT) clustering
Ordination
- Principal component analysis (PCA)
- Non-metric multidimensional scaling (NMDS)
- Detrended correspondence analysis (DCA)
Classification
- K-nearest neighbor
- Cross-validation
- Prediction of test group membership
- Discriminant analysis
- Neural nets
- Support vector machines
- Naive Bayes
Pathway analysis
- Gene set (globaltest)
- Own gene list
- Groups inferred from KEGG
- Groups inferred from GO
- Enrichment analysis of GO terms in a list of genes
- Hypergeometric test for over- or underrepresentation
- GO categories
- KEGG pathways
- ConsensusPathDB
- PFAM categories
- Cytobands
- GO categories for miRNA targets
- KEGG pathways for miRNA targets
- SAFE
- Mapping to Reactome pathways
- Mapping to protein interactions from IntAct
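The hypergeometric test for over- or underrepresentation listed above can be sketched as follows. This is a minimal Python illustration using the standard hypergeometric tail probabilities, not Chipster's own code. The function and parameter names are assumptions for this example: `population` is the number of genes in the background, `annotated` the number of those belonging to the category, `drawn` the size of the gene list, and `hits` the number of list genes in the category.

```python
from math import comb


def hypergeom_pmf(population, annotated, drawn, x):
    """Probability of exactly x annotated genes in a list of `drawn` genes."""
    return (comb(annotated, x) * comb(population - annotated, drawn - x)
            / comb(population, drawn))


def overrepresentation_p(population, annotated, drawn, hits):
    """Upper tail: probability of seeing `hits` or more annotated genes."""
    upper = min(annotated, drawn)
    return sum(hypergeom_pmf(population, annotated, drawn, x)
               for x in range(hits, upper + 1))


def underrepresentation_p(population, annotated, drawn, hits):
    """Lower tail: probability of seeing `hits` or fewer annotated genes."""
    return sum(hypergeom_pmf(population, annotated, drawn, x)
               for x in range(0, hits + 1))


# Toy numbers: 10 background genes, 4 in the category, list of 3 genes, 2 hits.
p_over = overrepresentation_p(10, 4, 3, 2)
p_under = underrepresentation_p(10, 4, 3, 2)
```

The two tails both include the observed count, so their sum exceeds 1 by the probability of that exact count.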
Promoter analysis
- Retrieve promoters from UCSC genome database
- Weeder (finds sequence motifs common to a set of promoters)
- Cosmo (finds sequence motifs common to a set of promoters)
- ClusterBuster (finds clusters of putative TF binding sites using JASPAR matrices)
aCGH analysis
- Import from CanGEM database
- Call copy number aberrations
- Plot copy number profiles
- Identify common regions
- Test for DNA copy number-induced differential expression
- Plot combined profiles of copy number and expression
- Plot copy number-induced gene expression
- Fetch probe positions from CanGEM
- Add cytogenetic bands
- Count overlapping CNVs by comparing to Database of Genomic Variants
- Sample size calculations with an adapted BH method
Import / Export
- Export and import in SOFT format for GEO database
- Export in tab2mage format for ArrayExpress database
- Import from ArrayExpress database
- Import from GEO database