Chipster training courses
We run several bioinformatics courses on different topics every year in Finland and abroad. The courses at CSC
are open to everybody, but you can also contact us to discuss options for hosting a course on your site.
We make our course materials publicly available so that anyone can download them for their own use.
The materials include slides and exercises, and many courses also have lecture videos.
The exercise data are available as example sessions on the Chipster server, and
we also provide ready-made analysis sessions which you can use as a reference when doing exercises on your own.
We also provide training accounts to our Chipster server in Finland.
Note that users from Finnish universities and research institutes can use their HAKA and VIRTU logins to access
Chipster, so they do not need a training account.
However, we kindly ask you to inform us about upcoming Chipster courses
(number of participants, type of analysis jobs) so that we can add computing resources to the server if needed.
We currently run the following Chipster courses (follow the links for more info). CSC also runs other bioinformatics
courses; you can find the upcoming ones in our course calendar.
- RNA-seq data analysis
- Single cell RNA-seq data analysis
- Virus detection using small RNA-seq
- Microbial community analysis of amplicon sequencing data (16S)
- Detection and annotation of genomic variants
- ChIP-seq data analysis
- Microarray data analysis
RNA-seq data analysis
This course introduces RNA-seq data analysis methods, tools and file formats. It covers all the steps from quality
control and alignment to quantification and differential expression analysis, and it also discusses experimental
design.
The course takes two days (or one long day if you omit exercise sheets 3 and 4). You will learn how to
- check the quality of reads with FastQC and MultiQC
- remove bad quality data with Trimmomatic
- infer strandedness with RSeQC
- align reads to the reference genome with HISAT2 and STAR
- perform alignment level quality control using RSeQC and SAMtools
- quantify expression by counting reads per gene using HTSeq
- check the experiment level quality with PCA plots and heatmaps
- analyze differential expression with DESeq2 and edgeR
- take multiple factors (including batch effects) into account in differential expression analysis
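To give a feel for what happens between counting reads and comparing samples, here is a minimal sketch in plain Python. It is not how DESeq2 or edgeR work (they use their own normalization and statistical models); it only illustrates scaling raw counts to counts per million (CPM) and computing a log2 fold change. The function names, the pseudocount and the toy counts are assumptions made for illustration.

```python
import math


def counts_per_million(counts):
    """Scale the raw per-gene read counts of one sample to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(cpm_a, cpm_b, pseudocount=1.0):
    """Return log2(B / A) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
        if gene in cpm_b
    }


# Hypothetical toy counts for two samples (not course data).
control = {"geneA": 100, "geneB": 300, "geneC": 600}
treated = {"geneA": 400, "geneB": 300, "geneC": 300}

cpm_control = counts_per_million(control)
cpm_treated = counts_per_million(treated)
lfc = log2_fold_change(cpm_control, cpm_treated)

for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:.2f}")
```

A real analysis needs replicates and a statistical test, which is what the differential expression tools used in the course provide.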
Course material (2020):
Single cell RNA-seq data analysis
There are two courses for single cell RNA-seq data analysis. The more recent one focuses on the analysis of 10X data
starting from the digital gene expression matrix (DGE),
while the older one also covers the preprocessing of DropSeq data from raw reads to a DGE. Both courses show how to
find sub-populations of cells using clustering with the Seurat tools, but the older course uses Seurat v2 instead of
v3.
You will also learn how to compare two samples and detect conserved cluster markers and differentially expressed
genes in them. The course takes one day.
Course 1 (October 2020)
You will learn how to
- create a Seurat v3 object
- perform QC and filter out low quality cells
- normalize expression values
- detect highly variable genes using VST
- scale data and regress out unwanted variability
- perform principal component analysis (PCA) and select PCs to be used for clustering
- cluster cells and find marker genes for a cluster
- visualize clusters with UMAP and tSNE
- run canonical correlation analysis (CCA) to identify common sources of variation between two datasets
- integrate samples using the mutual nearest neighbor approach (anchors)
- find conserved cluster marker genes for two samples
- find differentially expressed genes in a cluster between two samples
- visualize genes with cell type specific responses in two samples
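The first steps of the list above (filtering out low quality cells and normalizing expression values) can be sketched in plain Python. This is an illustration only, not Seurat code: the function names, thresholds and toy barcodes are hypothetical. The normalization shown, the natural log of one plus the count scaled to 10,000 per cell, is a commonly used log-normalization scheme.

```python
import math


def filter_cells(cells, min_genes=200, max_genes=2500, max_mito_percent=5.0):
    """Keep cells with a plausible number of detected genes and a low mitochondrial share.

    `cells` maps a cell barcode to a dict of gene name -> UMI count.
    Genes whose names start with "MT-" are treated as mitochondrial (human naming).
    """
    kept = {}
    for barcode, counts in cells.items():
        total = sum(counts.values())
        if total == 0:
            continue  # empty barcode, nothing to analyze
        n_genes = sum(1 for c in counts.values() if c > 0)
        mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
        mito_percent = 100.0 * mito / total
        if min_genes <= n_genes <= max_genes and mito_percent <= max_mito_percent:
            kept[barcode] = counts
    return kept


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's counts to `scale_factor` in total and apply log(1 + x)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}


# Hypothetical toy data; thresholds are lowered to fit the tiny example.
cells = {
    "AAAC": {"GENE1": 5, "GENE2": 3, "MT-CO1": 0},
    "AAAG": {"GENE1": 1, "GENE2": 0, "MT-CO1": 9},  # mostly mitochondrial reads
    "AAAT": {"GENE1": 2, "GENE2": 2, "MT-CO1": 1},
}
kept = filter_cells(cells, min_genes=2, max_genes=10, max_mito_percent=25.0)
normalized = {barcode: log_normalize(counts) for barcode, counts in kept.items()}
print(sorted(kept))
```

Suitable thresholds depend on the dataset, so choose them by inspecting the QC plots rather than copying the values here.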
Course material:
Course 2 (March 2019)
You will learn how to
- check the quality of reads with FastQC
- tag reads with cell and molecular barcodes
- trim and filter reads
- align reads to the reference genome with HISAT2 and STAR
- tag reads with gene names
- visualize aligned reads in genomic context using the Chipster genome browser
- estimate the number of usable cells by checking the inflection point
- detect bead synthesis errors
- create and filter a DGE
- create a Seurat v2 object
- regress out unwanted variability
- detect variable genes and perform principal component analysis
- cluster cells and find marker genes for a cluster
- run canonical correlation analysis (CCA) to identify common sources of variation between two datasets
- align two samples for integrated analysis
- find conserved cluster markers for two samples
- find differentially expressed genes in a cluster between two samples
- visualize genes with cell type specific responses in two samples
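As an illustration of the "inflection point" idea in the list above, the sketch below sorts barcodes by read count, accumulates the fraction of reads and picks the point where the cumulative curve is furthest above the diagonal. This is one simple heuristic chosen for illustration; it is not necessarily the method used by the course tools. The function name and the toy numbers are hypothetical.

```python
def estimate_cell_count(reads_per_barcode):
    """Estimate the number of real cells from per-barcode read counts.

    Barcodes are sorted from most to fewest reads. The knee is taken as the
    rank where the cumulative read fraction exceeds the barcode fraction most.
    """
    counts = sorted(reads_per_barcode, reverse=True)
    total = sum(counts)
    n = len(counts)
    if n == 0 or total == 0:
        return 0

    best_rank = 0
    best_gap = float("-inf")
    cumulative = 0
    for rank, count in enumerate(counts, start=1):
        cumulative += count
        gap = cumulative / total - rank / n
        if gap > best_gap:
            best_gap = gap
            best_rank = rank
    return best_rank


# Hypothetical toy data: 5 real cells with many reads, 95 near-empty barcodes.
toy_counts = [1000] * 5 + [10] * 95
print(estimate_cell_count(toy_counts))
```

In practice you would also look at the cumulative plot itself, because real data rarely show a knee as sharp as in this toy example.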
Course material:
Virus detection using small RNA-seq
This course introduces the VirusDetect pipeline covering all the analysis steps and file formats. VirusDetect allows
you to detect known viruses and identify new ones by sequencing small RNAs (siRNA) in host samples. siRNA sequences
are assembled to contigs and compared to known virus sequences.
The course takes about 5 hours. You will learn how to
- run VirusDetect and interpret the result files
- subtract reads originating from the host genome
- set parameters for filtering the BLAST matches
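To illustrate what filtering BLAST matches means, here is a minimal sketch that filters hits by percent identity and E-value. It assumes the standard 12-column BLAST tabular output (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); VirusDetect's own result files and parameter names may differ. The function name, thresholds and example hits are hypothetical.

```python
def filter_blast_hits(lines, min_identity=90.0, max_evalue=1e-5):
    """Return (query, subject, identity, evalue) for hits passing both thresholds."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record
        identity = float(fields[2])   # column 3: percent identity
        evalue = float(fields[10])    # column 11: E-value
        if identity >= min_identity and evalue <= max_evalue:
            hits.append((fields[0], fields[1], identity, evalue))
    return hits


# Hypothetical example records (contig vs. reference virus sequence).
blast_lines = [
    "contig1\tvirusA\t98.5\t120\t2\t0\t1\t120\t300\t419\t2e-50\t210",
    "contig2\tvirusB\t75.0\t80\t20\t0\t1\t80\t10\t89\t1e-8\t60",
    "contig3\tvirusA\t95.0\t40\t2\t0\t1\t40\t5\t44\t0.5\t30",
]
for hit in filter_blast_hits(blast_lines):
    print(hit)
```

Stricter thresholds reduce false positives but can miss divergent, possibly novel, viruses.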
Course material (2018):
Microbial community analysis of amplicon sequencing data (16S)
This course introduces microbial community analysis of (16S rRNA) amplicon sequencing data. It covers preprocessing,
alignment to reference, taxonomic classification, and statistical analysis.
The course takes one day. You will learn how to
- check the quality of reads with FastQC and MultiQC
- remove bad quality data with Trimmomatic and Mothur
- remove redundancy, perform alignment and preclustering with Mothur
- do taxonomic classification with Mothur
- perform statistical analysis (based on R packages vegan, rich, BiodiversityR, pegas, labdsv and DESeq2)
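One statistic you are likely to meet in community analysis is the Shannon diversity index, H = -sum(p_i * ln p_i), where p_i is the relative abundance of taxon i. The sketch below computes it in plain Python for illustration; the course itself relies on the R packages listed above. The function name and toy abundances are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity (natural log) from a list of per-taxon counts."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive number")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)


# Hypothetical toy communities with four taxa each.
even_community = [10, 10, 10, 10]
dominated_community = [37, 1, 1, 1]

print(f"even: {shannon_index(even_community):.3f}")
print(f"dominated: {shannon_index(dominated_community):.3f}")
```

An evenly distributed community gives the maximum value, ln(number of taxa), while a community dominated by one taxon scores close to zero.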
Course material (2020):
- lecture videos
- slides
- exercises. The data is available on the Chipster server in the example session listed in the exercise sheet.
Detection and annotation of genomic variants
This course covers variant analysis from raw sequence reads to variant annotation, introducing the theory, analysis
tools and file formats involved.
The course takes one day. You will learn how to
- check the quality of reads with FastQC and PRINSEQ
- remove bad quality data with Trimmomatic
- align reads to the reference genome with BWA
- perform alignment level quality control with SAMtools
- mark duplicates with Picard
- call and filter variants with SAMtools, BCFtools and VCFtools. Note that the GATK variant calling platform will be integrated in Chipster in 2019.
- annotate variants with Variant Effect Predictor (VEP)
- visualize variants in genomic context with the Chipster genome browser
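As a small illustration of variant filtering, the sketch below keeps only VCF records whose QUAL value (the sixth tab-separated column in the VCF format) reaches a threshold. In the course this kind of filtering is done with BCFtools and VCFtools; the plain Python version, its function name, the threshold and the example records are assumptions made for illustration.

```python
def filter_vcf_by_quality(lines, min_qual=30.0):
    """Keep header lines and records whose QUAL is at least `min_qual`."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # meta-information and column header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # a VCF record has at least 8 fixed columns
        qual = fields[5]
        # "." means the quality is missing; such records are dropped here.
        if qual != "." and float(qual) >= min_qual:
            kept.append(line)
    return kept


# Hypothetical example records.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50.0\t.\tDP=30",
    "chr1\t2000\t.\tC\tT\t12.5\t.\tDP=8",
    "chr2\t1500\t.\tG\tA\t.\t.\tDP=15",
]
for line in filter_vcf_by_quality(vcf_lines):
    print(line)
```

Real filters usually combine several criteria, such as read depth and strand bias, in addition to the quality value.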
Course material (2016):
ChIP-seq data analysis
This course covers ChIP-seq analysis from quality control and alignment to peak calling, motif detection, and
pathway analysis. It introduces the theory, analysis tools and file formats involved.
The course takes one day. You will learn how to
- check the quality of reads with FastQC
- align reads to the reference genome with Bowtie
- call peaks with MACS2
- filter peaks
- visualize reads and peaks in genomic context with the Chipster genome browser
- retrieve nearby genes
- perform pathway analysis
- detect motifs
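The "retrieve nearby genes" step can be pictured with a small sketch that finds, for one peak, the gene whose transcription start site (TSS) is closest to the peak midpoint on the same chromosome. This is a simplified illustration, not the Chipster tool: the function name, the data layout and the gene names are hypothetical.

```python
def nearest_gene(peak, genes):
    """Return (gene_name, distance) for the gene with the TSS closest to the peak.

    `peak` is (chromosome, start, end); `genes` is a list of
    (gene_name, chromosome, tss_position). Returns None if the chromosome
    has no genes in the list.
    """
    chrom, start, end = peak
    midpoint = (start + end) // 2
    candidates = [
        (abs(tss - midpoint), name)
        for name, gene_chrom, tss in genes
        if gene_chrom == chrom
    ]
    if not candidates:
        return None
    distance, name = min(candidates)
    return name, distance


# Hypothetical toy annotation and peak.
genes = [
    ("geneX", "chr1", 1000),
    ("geneY", "chr1", 5000),
    ("geneZ", "chr2", 1200),
]
print(nearest_gene(("chr1", 4600, 4800), genes))
```

Real tools typically also take the strand into account and can report all genes within a chosen distance window.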
Course material (2016):
- slides
- exercises. The data is available on the Chipster server in the example session listed in the exercise sheet.
Microarray data analysis
This course covers microarray data analysis from quality control and normalization to differential expression
analysis, clustering and pathway analysis. It introduces the theory, analysis tools and file formats involved.
The course takes one and a half days. You will learn how to
- perform quality control
- normalize data
- find differentially expressed genes and take batch effects into account
- perform clustering
- visualize results in different ways
- perform pathway analysis
Course material (2018):