Chipster manual

Version update 24.1.2020: What is new in Chipster 3.16.3

Summary: This is a patch due to a bug in the RNA-seq aligner HISAT2. The strandedness parameter of the HISAT2 tools was reversed, which caused the XS tags in the BAM file to be the wrong way round (+ instead of -, and vice versa). The XS tags are used by the older tools Cufflinks and Cuffdiff when they assemble transcripts and count reads per gene/transcript. Note that counting reads per gene with HTSeq is not affected by XS tags, because it uses its own method for strandedness information. For those who would like to rerun their Cufflinks/Cuffdiff analysis, we have made a conversion tool, Reverse XS tags in BAM, which you can run on the old HISAT2 BAM files (this is faster than running the aligner again).
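
As a rough illustration of what reversing the XS tag means at the level of a single alignment record, the sketch below flips XS:A:+ to XS:A:- (and vice versa) in one SAM text line. It is not the implementation of the Chipster tool; the function name reverse_xs_tag and the example record are hypothetical, and a real BAM file would first have to be converted to SAM text or handled with a BAM library.

```python
def reverse_xs_tag(sam_line: str) -> str:
    """Return the SAM line with its XS:A strand tag flipped, if it has one."""
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header lines carry no alignment tags
    fields = line.split("\t")
    # Optional tags start at the 12th column (index 11) of a SAM record.
    for i in range(11, len(fields)):
        if fields[i] == "XS:A:+":
            fields[i] = "XS:A:-"
        elif fields[i] == "XS:A:-":
            fields[i] = "XS:A:+"
    return "\t".join(fields)


# Hypothetical record, for illustration only.
record = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\tXS:A:+"
flipped = reverse_xs_tag(record)
```

Applying the function twice gives back the original record, which is why the same conversion can be used regardless of the direction of the error.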

NOTE: We strongly encourage you to try the new Chipster Web app, which will replace the Java based Chipster later this year. The Web app does not require Java (many universities do not provide Java anymore because Oracle's Java license policy changed). Users from Finnish universities can access the Chipster Web app with their HAKA (university) account. Note that the analysis session format is different, so your sessions are not automatically transferred. However, we made a conversion system which allows you to upload old sessions from your laptop to the new Chipster.

New analysis tools
- RNA-seq / Reverse XS tags in BAM reverses the strandedness tags in a BAM file.
Improvements to analysis tools
- Alignment / HISAT2 for single end reads: Strandedness parameter updated to use the options R and F.
- Alignment / HISAT2 for single end reads and own genome: Strandedness parameter updated to use the options R and F.
- Alignment / HISAT2 for paired end reads: Strandedness parameter updated to use the options RF and FR.
- Alignment / HISAT2 for paired end reads and own genome: Strandedness parameter updated to use the options RF and FR.
- Quality control / RNA-seq strandedness inference and inner distance estimation using RseQC: Output file now lists strandedness parameter values R, F, RF and FR for HISAT2.

Version update 8.10.2019: What is new in Chipster 3.16

Summary: This version brings major improvements to single cell RNA-seq data analysis, because the single cell analysis tools have been updated to Seurat v3 and R3.6.1.

We would also like to encourage you to try our new web interface to Chipster, which does not require Java (many universities do not provide Java anymore because Oracle's Java license policy changed). Users from Finnish universities can access the web based Chipster with their HAKA (university) account. Note that the analysis session format is different, so your sessions are not automatically transferred. However, we made a conversion system which allows you to upload old sessions from your laptop to the new Chipster.

New analysis tools
- Single cell RNA-seq / Seurat v3 BETA -Extract cells in a cluster retrieves cells for a given cluster.
- RNA-seq / Heatmap for RNA-seq results performs VST transformation and produces a heatmap.
Improvements to analysis tools
- Single cell RNA-seq / Seurat v3 -Filter cells, normalize, regress and detect variable genes detects highly variable genes using VST (variance stabilizing transformation).
- Single cell RNA-seq / Seurat v3 -PCA allows you to specify how many PCs to plot.
- Single cell RNA-seq / Seurat v3 -Clustering and detection of cluster marker genes now produces a UMAP plot in addition to tSNE, and returns only positive markers.
- Single cell RNA-seq / Seurat v3 -Visualize genes produces a UMAP plot.
- Single cell RNA-seq / Seurat v3 -Visualise features in UMAP plot allows you to color a UMAP plot based on gene expression or QC data such as the mitochondrial transcript percentage.
- Single cell RNA-seq / Seurat -Combine two samples combines samples using a new approach: It performs CCA and L2 normalization to bring the samples in shared spaces, and then looks for mutual nearest neighbors. These anchors are scored based on neighborhood in the PC space, and correction vectors are calculated based on anchors and scores.
- Single cell RNA-seq / Seurat v3 -Integrated analysis of two samples has an option to use UMAP plot to visualize clusters.
- Single cell RNA-seq / Seurat v3 - Find conserved cluster markers and DE genes in two samples returns only positive markers.
- Single cell RNA-seq / Seurat v3 -Visualize genes with cell type specific responses in two samples produces UMAP and violin plots.
- Utilities / Retrieve datasets from ENA database can now handle paired end data.
- Misc / Sequence utilities / Display basic information about sequences can now handle up to 300 000 sequences.
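
The idea of L2 normalization followed by a search for mutual nearest neighbors, mentioned above for the tool Combine two samples, can be sketched in a few lines. This is a simplified illustration and not Seurat's implementation: it assumes each cell is already represented by a short vector in a shared space, it considers only the single nearest neighbor of each cell, and the function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def nearest(query, targets):
    """Index of the target closest to query (Euclidean distance)."""
    return min(range(len(targets)), key=lambda j: math.dist(query, targets[j]))


def mutual_nearest_neighbors(sample_a, sample_b):
    """Return (i, j) pairs where cell i of A and cell j of B pick each other."""
    a = [l2_normalize(v) for v in sample_a]
    b = [l2_normalize(v) for v in sample_b]
    a_to_b = [nearest(v, b) for v in a]
    b_to_a = [nearest(v, a) for v in b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy coordinates, for illustration only.
cells_a = [[1.0, 0.0], [0.0, 2.0]]
cells_b = [[0.1, 3.0], [5.0, 0.2], [2.0, 1.0]]
anchors = mutual_nearest_neighbors(cells_a, cells_b)
```

In this toy example the third cell of the second sample has a nearest neighbor in the first sample, but that neighbor prefers another cell, so the pair is not mutual and is not reported.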

Version update 9.4.2019: What is new in Chipster 3.15

Summary: This version contains new tools for calling somatic variants (SNPs and INDELs) using the GATK4 Mutect2 pipeline. It also has tools for retrieving data from Illumina BaseSpace and ENA. Many single cell RNA-seq data analysis tools have been improved. All the reference genomes have been updated to Ensembl v95, and genome indexes for new organisms have been added in the STAR and HISAT2 aligners.

New analysis tools
- Variants / GATK4 -Call somatic SNVs and INDELs with Mutect2 calls somatic short variants via local assembly of haplotypes.
- Variants / GATK4 -Create Somatic Panel of Normals creates a panel of normals (PoN) containing germline and artifactual variant sites. Mutect2 then uses the PoN to filter variants at the site-level.
- Variants / GATK4 -Tabulate pileup metrics for inferring contamination summarizes counts of reads that support reference, alternate and other alleles for given sites.
- Variants / GATK4 -Estimate contamination calculates the fraction of reads coming from cross-sample contamination, given results from GetPileupSummaries. The resulting contamination table is used with FilterMutectCalls.
- Variants / GATK4 -Filter Mutect2 calls filters somatic variants in a Mutect2 VCF callset.
- Variants / GATK4 -Generate sequencing artifact metrics for filtering by orientation bias quantifies substitution errors caused by mismatched base pairings during various stages of sample / library prep.
- Variants / GATK4 -Filter Mutect2 calls by orientation bias filters Mutect2 somatic variants for sequence context-dependent artifacts, e.g. OxoG or FFPE deamination.
- Single cell RNA-seq / Seurat -Visualise features in tSNE plot colors cells on a tSNE dimensional reduction plot according to a feature, i.e. gene expression, PC scores, number of genes detected, etc.
- Data retrieval / Retrieve data from Illumina BaseSpace This tool requires that you have an access token for the bs client program. Please see the manual for how to obtain it.
- Data retrieval / Retrieve datasets from ENA retrieves data from the ENA database based on the entry ID or name.
- Utilities / Share a file generates a public url for a file using the Object Storage service of Chipster so that you can share individual result files.
- Sequence similarity search / DIAMOND protein sequence similarity search, faster than BLAST.
Improvements to analysis tools
- Single cell RNA-seq / Merge aligned and unaligned BAM tags reads with gene names so a separate tagging tool is not needed any more.
- Single cell RNA-seq / Seurat -Setup and QC indicates the number of cells.
- Single cell RNA-seq / Seurat -Clustering reports the number of cells in each cluster, produces a heatmap, and has a parameter for regulating the point size in tSNE plots.
- Single cell RNA-seq / Seurat -Combine two samples and perform CCA has parameters for the number of CCs to compute and visualize.
- Single cell RNA-seq / Seurat -Integrated analysis of two samples reports the number of cells in each cluster and has a parameter for regulating the point size in tSNE plots.
- 16S rRNA sequencing / Classify sequences to taxonomic units allows you to tick boxes in order to remove lineages.
- 16S rRNA sequencing / Produce count table and phenodata now uses a fixed seed for rarefaction in order to give consistent results. Species with zero counts are not reported.
- HISAT2 aligners have a reference genome index for rat.
- STAR aligners have reference genome indexes for rat and mouse.
- Bowtie, Bowtie2 and BWA aligners have miRBase reference indexes for human, mouse and rat.
Updates to reference data
- All reference genomes have been updated to Ensembl v95.
- VirusDetect database has been updated to v227.
Improvements to user interface
- Import tool allows you to copy the same function (Use Import tool or Import directly) to all files.

Version update 13.9.2018: What is new in Chipster 3.14

Summary: This version contains major improvements in the community analysis tools for amplicon sequencing data (16S rRNA), and it also makes quality control easier for multisample datasets. The example session NGS_16S_rRNA_community_analysis_MiSeqData demonstrates how these tools can be used.

New analysis tools
- Quality control / Read quality with MultiQC for many FASTQ files Given a Tar package of FASTQ files, this tool runs FastQC for all of them and combines the results in one report using the MultiQC tool. Note that you can package your FASTQ files in a Tar package using the tool Utilities / Make a Tar package.
- Quality control / Combine reports using MultiQC. Allows you to combine several quality reports into one using MultiQC. This tool currently takes mqc files produced by FastQC as input; support for other tools will be added.
- 16S rRNA sequencing / Convert contig names to Mothur format Combining paired end MiSeq reads to contigs using CLC Genomics Workbench produces contig names, which can be converted to Mothur format with this tool.
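
If you prefer to build the Tar package of FASTQ files on your own computer before importing it, a minimal sketch with Python's standard tarfile module could look like the following. The function name make_tar_package is hypothetical, and storing the files uncompressed under flat names is an assumption about what is convenient, not a documented requirement of the Chipster tools.

```python
import tarfile
from pathlib import Path


def make_tar_package(fastq_paths, tar_path):
    """Write the given files into a tar archive and return the member names."""
    with tarfile.open(tar_path, "w") as tar:
        for path in fastq_paths:
            # arcname drops the directory part so the archive holds flat file names
            tar.add(path, arcname=Path(path).name)
    # Re-open the archive to report what was actually stored.
    with tarfile.open(tar_path, "r") as tar:
        return tar.getnames()
```

Calling make_tar_package(["sample1.fastq", "sample2.fastq"], "reads.tar") would write reads.tar into the current directory, provided that the two input files exist.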
Improvements to analysis tools
- 16S rRNA sequencing / Produce count table and phenodata has been improved in many ways:
  - This tool used to count only unique sequences; now it uses Mothur's count_table file to count all sequences. If you have used this tool after running the tool Extract unique sequences, we strongly recommend that you do your analysis again.
  - Additional output file counttable_transposed.tsv has been added. This file is suitable for the RNA-seq tools DESeq2 and edgeR in Chipster, allowing you to use their recommended normalization methods for microbiome data. Note that you can also use it for the tool Quality control / PCA and heatmap of samples with DESeq2.
  - New parameter Rarefy counts. Produces a count table where the total sequence count of every sample is equal to that of the smallest sample.
  - New parameter Produce binary table instead of counts. Produces a count table where detected species are marked with 1 and undetected with 0, instead of the actual sequence counts. This kind of count table is typically used for co-occurrence studies.
- 16S rRNA sequencing / Align sequences to reference The number of processors to be used has been increased. Silva reference database has been updated to v132 and the option to use bacterial subset has been removed.
- 16S rRNA sequencing / Classify sequences to taxonomic units The number of processors to be used has been increased. Silva reference database has been updated to v132 and the option to use bacterial subset has been removed.
- Quality control / Read quality with FastQC Contains new parameter Create input for MultiQC.
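
The effect of the parameters Rarefy counts and Produce binary table instead of counts described above can be illustrated with a small sketch. It is not the code used by the Chipster tool: the function names, the nested dictionary layout and the seed value are assumptions made for this example, and rarefaction is done here by simple random subsampling without replacement.

```python
import random


def rarefy(counts, seed=1):
    """counts: {sample: {species: count}}. Subsample every sample to the smallest total."""
    depth = min(sum(species.values()) for species in counts.values())
    rng = random.Random(seed)  # fixed seed so repeated runs give the same table
    rarefied = {}
    for sample, species_counts in counts.items():
        # One pool entry per sequence, labelled by its species.
        pool = [sp for sp, n in species_counts.items() for _ in range(n)]
        drawn = rng.sample(pool, depth)  # sampling without replacement
        rarefied[sample] = {sp: drawn.count(sp) for sp in species_counts}
    return rarefied


def to_binary(counts):
    """Mark detected species with 1 and undetected species with 0."""
    return {
        sample: {sp: 1 if n > 0 else 0 for sp, n in species_counts.items()}
        for sample, species_counts in counts.items()
    }


# Toy count table, for illustration only.
table = {"A": {"sp1": 5, "sp2": 5}, "B": {"sp1": 3, "sp2": 0}}
rarefied = rarefy(table)
binary = to_binary(table)
```

After rarefaction every sample in the toy table has three sequences, which is the total of the smallest sample B.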
Updates to reference data and analysis packages
- Silva database has been updated to v132.

Version update 13.6.2018: What is new in Chipster 3.13

Summary: This version contains reference data updates and new tools for single cell RNA-seq data analysis and alignment of long reads (e.g. PacBio or Oxford Nanopore). The single cell tools allow for example the identification of common cell types across conditions and comparison of the different conditions. The example session NGS_single_cell_RNAseq_Seurat_integrated_analysis demonstrates how these tools can be used and it follows the original Seurat tutorial.

New analysis tools
- Alignment / Minimap2 for mapping reads to genomes Aligns long (e.g. PacBio or Oxford Nanopore) reads to a genome. You can also give your own reference genome in fasta format.
- Single cell RNA-seq / Seurat - Combine two samples and perform CCA Combines Seurat objects of two samples for integrated analysis and performs CCA.
- Single cell RNA-seq / Seurat - Integrated analysis of two samples Aligns the CCA subspaces and performs integrated analysis on the data.
- Single cell RNA-seq / Seurat - Find conserved cluster markers and DE genes in two samples Lists the cell type markers that are conserved across the two conditions, and the differentially expressed genes between the two conditions for a user defined cluster.
- Single cell RNA-seq / Seurat - Visualize genes with cell type specific responses in two samples Plots user defined markers/genes across the conditions.
Improvements to analysis tools
- Single cell RNA-seq / Seurat -Filtering, regression and detection of variable genes You can now regress out cell cycle differences.
- Single cell RNA-seq / Seurat -Setup and QC Improved handling of tar package of 10X Genomics output files.
- BWA aligners with own genome now output the genome index as a tar file and allow you to use it as input in subsequent alignment runs to save time.
Updates to reference genomes and analysis packages
- Reference genomes have been updated to Ensembl v92.
- miRBase has been updated to v22.
- All Seurat-based tools for single cell RNA-seq have been updated to Seurat v2.3 and R v3.4.3.
- TagCleaner has been updated to v0.16.

Version update 25.10.2017: What is new in Chipster 3.12

Summary: The tool categories "Single cell RNA-seq" and "16S rRNA sequencing" have undergone a major overhaul with new tools and improvements based on the feedback kindly provided by the Chipster user community in Finland. You can now analyze DropSeq and Chromium 10X single cell RNA-seq data from raw reads to clustering and marker gene detection based on the DropSeq and Seurat packages. Chipster v3.12 also contains the HISAT2 and STAR aligners for RNA-seq data, which will eventually replace TopHat2.

New analysis tools
- Alignment / HISAT2 for paired end reads Aligns RNA-seq reads to human and mouse genomes, but you can also give your own reference genome in fasta format. The same tool is also available for single end reads.
- Alignment / STAR for paired end reads and human genome Aligns RNA-seq reads to human genome hg38 using the STAR 2-pass method. The same tool is also available for single end reads. Currently only the human genome is offered as a reference due to the size of STAR indexes.
- Single cell RNA-seq / Preprocessing DropSeq FASTQ files Tags the reads with the cellular and molecular barcodes, removes reads where the cell or molecular barcode has low quality bases, trims adapters and polyA tails.
- Single cell RNA-seq / Merge aligned and unaligned BAM Adds the cell and molecular barcode and other tags that were lost during the alignment to the aligned BAM file.
- Single cell RNA-seq / Tag reads with gene names Adds a BAM tag GE to reads when a read overlaps an exon of a gene.
- Single cell RNA-seq / Estimate number of usable cells Extracts the number of reads per cell barcode and draws a cumulative distribution plot.
- Single cell RNA-seq / Create digital gene expression matrix Identifies and corrects bead synthesis errors and extracts digital gene expression values from a BAM file.
- Single cell RNA-seq / Seurat -Setup and QC Constructs a Seurat object from either .tar package of 10X Genomics output files or DGE table from DropSeq.
- Single cell RNA-seq / Seurat -Filtering, regression and detection of variable genes Filters cells, regresses out uninteresting sources of variation, and detects highly variable genes which are needed for PCA.
- Single cell RNA-seq / Seurat -PCA Principal component analysis using highly variable genes.
- Single cell RNA-seq / Seurat -Clustering Clusters cells, does non-linear dimensional reduction tSNE for visualization, and finds marker genes for the clusters.
- Single cell RNA-seq / Seurat -Visualize biomarkers Visualizes selected marker genes with violin and feature plots.
- 16S rRNA sequencing / Split FASTQ file to FASTA and QUAL files Given a FASTQ file, produces a FASTA file and QUAL file using the Mothur tool fastq.info.
- 16S rRNA sequencing / Produce count table and phenodata Generates a count table and a phenodata file, which you can use to assign samples to different experimental groups.
- Quality control / Check FASTQ file for errors Performs validity checks for a FASTQ file to spot some common problems.
- Utilities / Merge FASTQ Merges FASTQ files, optionally in alphabetical order.
- Utilities / List contents of a tar file Lists the contents of a tar package.
- Utilities / Compress a file with gzip Compresses a file with gzip.
- Utilities / Row count Counts how many lines there are in a txt or tsv file.
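
The calculation behind the tool Estimate number of usable cells listed above, counting reads per cell barcode and accumulating them, can be sketched as follows. This is an illustration under simplifying assumptions rather than the Drop-seq implementation: the input is taken to be a plain list with one cell barcode per read, and the function name is hypothetical.

```python
from collections import Counter
from itertools import accumulate


def cumulative_read_fraction(barcodes):
    """barcodes: iterable with one cell barcode per read.

    Returns the cumulative fraction of reads, with cell barcodes ordered
    from the one with most reads to the one with fewest.
    """
    reads_per_cell = sorted(Counter(barcodes).values(), reverse=True)
    total = sum(reads_per_cell)
    return [running / total for running in accumulate(reads_per_cell)]


# Toy data, for illustration only: three barcodes with 6, 3 and 1 reads.
reads = ["AAA"] * 6 + ["CCC"] * 3 + ["GGG"]
curve = cumulative_read_fraction(reads)
```

Plotting the returned values against the rank of the cell barcode gives the kind of cumulative distribution plot that the tool draws.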
Improvements to analysis tools
- Most of the tools in the 16S rRNA sequencing category have been modified in order to work with paired end MiSeq data according to the Mothur MiSeq SOP.
- BWA aligners can now take several FASTQ files as input. For paired end reads you need to provide file name lists for read1 files and read2 files.
- Utilities / Extract .tar or .tar.gz file You can now extract only selected files from a tar package.
- Small RNA-seq / VirusDetect Output files are named based on sample names.
Updates to reference genomes and analysis packages
- Reference data has been updated to Ensembl v90.
- VirusDetect has been updated to v1.7 and its database has been updated to v220.
Improvements to user interface
- Spreadsheet visualization: In addition to the number of rows also the number of columns is indicated.
- 3D scatter plot visualization: Group 2 is colored in red instead of green.

Version update 19.12.2016: What is new in Chipster 3.11

Summary: This version contains a new tool category Single cell RNA-seq, and it also contains some updates and improvements to existing tools.

New tool category single cell RNA-seq (BETA) and new example session NGS_single_cell_BETA
- 10 new tools for analyzing single cell RNA-seq data and a corresponding example session have been added. These tools are based on the Picard and Drop-seq packages, and they allow you to perform the required preprocessing steps, quantitate expression, and produce diagnostic plots. Please note that these tools are still under development, so we might make changes to them or even combine several tools at a later stage. Should you have any wishes or comments regarding tools for single cell RNA-seq data, please feel free to contact us.
Improvements and updates to analysis tools
- The following tools have been updated to new versions: NCBI BLAST (v2.2.29), Entrez Direct (v5.8), and SRAtoolkit (v2.4.2)
- Quality control / RNA-seq strandedness inference and inner distance estimation using RseQC: Recommendation for TopHat and HTSeq parameters has been added to the report.
- RNA-seq / HTSeq tools: The file htseq-count-info.txt now includes a sum of counted alignments.

Version update 22.11.2016: What is new in Chipster 3.10

Summary: All the reference genomes and many analysis packages have been updated. This version also contains new and improved functionality for RNA-seq data. In particular, stranded RNA-seq data is now better supported and there is also a manual page for stranded RNA-seq data.

Updates to reference genomes and analysis packages
- Reference data has been updated to Ensembl v86.
- R-based analysis tools have been migrated to R3.2.3 and some copy number aberration tools have been migrated to R3.3.2.
- The following command line tools have been updated: Bowtie2 2.2.9, Cufflinks 2.2.1, MACS2 2.1.1.20160309, Mothur 1.36.1, Picard 2.6.0, RSeQC 2.6.4, Tophat 2.1.1
New NGS analysis tools
- Small RNA-seq / VirusDetect pipeline detects viruses using siRNA data.
- Small RNA-seq / VirusDetect with own host genome, as above but with a user-supplied host genome.
- Quality control / RNA-seq strandedness inference and inner distance estimation using RseQC detects the type of strand-specific sequencing used and the inner distance between paired reads.
- RNA-seq / Differential expression using Cuffdiff with replicates allows you to assign several BAM files to each experimental group using filename lists.
- Utilities / Make a list of file names makes file name lists used by Cuffdiff and TopHat tools.
- Metagenomics / Screen sequences with Mothur filters reads for selected criteria.
- Utilities / Make a tar package makes a package of selected files allowing easier export from Chipster.
- Utilities / Concatenate files allows you to combine for example FASTQ files one after the other.
Improvements to NGS analysis tools
- RNA-seq / Merge transcript assemblies with Cuffmerge: Reference GTF and fasta files are now available. The problem with duplicate identifiers has been solved.
- RNA-seq / Differential expression using Cuffdiff: Library type and labels parameter have been added. The direction of fold change calculation has been reversed.
- RNA-seq / Assemble transcripts using Cufflinks: GTF sorting has been added.
- RNA-seq / Count aligned reads per genes with HTSeq: Information about the different strandedness options has been added.
- RNA-seq / Differential expression using DESeq2: Plots and rounding of result columns has been improved for cases when there are more than two experimental groups.
- Alignment / TopHat tools: You can give several FASTQ files per sample as input. For paired end reads you need to make a list of filenames first.
- Quality control / RNA-seq quality metrics with RseQC: BED files are now available in Chipster.
- Variants / Filter variants: Option to keep all INFO fields has been added. INDELs are included in the output by default.
Improvements to microarray analysis tools
- Statistics / Linear modeling: The output has been simplified and the manual has been clarified.
- Statistics / Two group test: The direction of fold change produced by the RankProd method has been reverted.
- Normalization / Illumina - methylumi pipeline: The ability to analyze Illumina 450k arrays has been removed because the new Bioconductor version doesn't support it.
For admins and developers
- Chipster now uses Ubuntu v16.04.
- Database performance for cloud sessions has been improved.
- All R-based tools are now in folder R. Runtimes.xml has been simplified.

Version update 23.5.2016: What is new in Chipster 3.9.0

Summary: This version mainly contains internal changes to the way tool scripts are distributed on the server side.

Changes
- Chipster client application now requires Java 1.8 or later
- Tool scripts are moved from computing services to a centralised toolbox service
Improvements to analysis tools
- Merge tables now accepts multiple inputs

Version update 4.5.2016: What is new in Chipster 3.8.1

Summary: This version contains a bug fix for the tool Differential expression using DESeq2.

Bug fixes
- RNA-seq / miRNA-seq / Differential expression using DESeq2: The comparison direction has been fixed for cases where there are more than 2 groups.

Version update 22.3.2016: What is new in Chipster 3.8

Summary: This version contains improvements to several analysis tools.

General improvements
- Each result file contains information about the Chipster version that produced it.
New NGS analysis tools
- RNA-seq / Transform read counts transforms read counts for visualization purposes using the DESeq2 package.
Improvements to NGS analysis tools
- TopHat tools: If the user sets the parameter "Number of mismatches allowed in final alignment" to a value higher than 2, Chipster automatically sets the internal parameter --read-edit-dist to the same value.
- Mothur tools: The summary tables show properly when visualized as spreadsheets.
- Variants / Call SNPs and short INDELs now has a parameter "Ploidy".
- Utilities / Table converter now allows you to define ranges of columns.
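
The rule described above for the TopHat tools can be written out as a short sketch. It is not Chipster's tool script: the function name is hypothetical, and it is an assumption of this example that the Chipster parameter "Number of mismatches allowed in final alignment" corresponds to TopHat's --read-mismatches option.

```python
def tophat_mismatch_args(final_mismatches: int) -> list:
    """Build the mismatch-related TopHat arguments for a user-chosen value."""
    # Assumption: the Chipster parameter is passed on as --read-mismatches.
    args = ["--read-mismatches", str(final_mismatches)]
    if final_mismatches > 2:
        # Raise the edit distance limit along with the mismatch limit.
        args += ["--read-edit-dist", str(final_mismatches)]
    return args


default_args = tophat_mismatch_args(2)
relaxed_args = tophat_mismatch_args(3)
```

With a value of 2 only the mismatch option is passed on, while a value of 3 adds --read-edit-dist 3 as well.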
Improvements to microarray analysis tools
- Visualization / Correlogram now allows you to choose png output instead of pdf.
- Utilities / Combine probes to genes: the option to produce identifiers has been removed, as it produced confusing results.

Version update 14.1.2016: What is new in Chipster 3.7

Summary: This version enables users to save analysis sessions on the server side. Please note that this cloud session functionality is still at a beta version stage, meaning that it might have unexpected problems. Therefore we strongly advise you to also save an extra copy of your session locally.

Technical improvements
- You can now store analysis sessions on the server. Use the File menu in the upper panel to save, open and manage cloud sessions.
Improvements to NGS analysis tools
- Reference genomes and GTFs have been updated to Ensembl version 83. Rice genome has been added and Yersinia tuberculosis genome has been changed to Yersinia pseudotuberculosis IP32953 GCA_000834295.
- Quality control / PCA and heatmap of samples with DESeq2 now allows visualizing two factors simultaneously in the PCA plot using shapes in addition to colors.
- Preprocessing / Trim reads with Trimmomatic now accepts an input file containing user's own adapter sequences.
- Variants / Filter variants now preserves the AC and DP4 fields in the INFO column of a VCF file when filtering.

Version update 2.10.2015: What is new in Chipster 3.6

Summary: Many NGS tools have been updated to new versions, and reference genomes have been updated to Ensembl version 81. New NGS tools have been added. The job manager component of Chipster server has been re-written in Java in order to get rid of random messaging errors, which caused problems for example when starting the Chipster client software.

New analysis tools
- Quality control / Collect multiple metrics from BAM calculates several quality metrics for BAM using the Picard toolkit.
- Utilities / Mark duplicates in BAM Flags duplicates using the Picard toolkit.
- Utilities / Count alignment statistics for BAM counts statistics for alignments using the SAMtools flagstat command.
- Variants / Intersect VCF files produces 4 VCF files containing variant sites common and unique to each input file.
- Variants / Ensembl variant effect predictor determines the effect of variants on genes, transcripts, and protein sequence, as well as regulatory regions. It uses the Ensembl VEP service running at the EBI.
- Variants / Calculate statistics on VCF file calculates allele frequency and count, LD measure, p-value for HWE, and SNP density.
Improvements to NGS analysis tools
- Several tools have been updated: BEDtools 2.25.0, BWA 0.7.12, FastQC 0.11.3, Picard 1.138, Samtools/bcftools 1.2, Trimmomatic 0.33 and VCFtools 0.1.13. edgeR based tools have been updated to R3.2.0.
- Reference genomes and GTFs have been updated to Ensembl version 81. Potato genome has been added.
- Alignment / BWA MEM has new parameters and the previous tools for SE and PE reads have been combined into one.
- Utilities / Retrieve data for a given organism in Ensembl can now retrieve GTFs in addition to sequences.
- Utilities / Modify text. Output options for BED and GTF files have been added to enable chromosome naming conversions.
- Variants / Call SNPs and short INDELs now uses the multiallele caller and allows you to control the output fields in VCF. The original sample names are shown in the VCF file.
- Variants / Annotate variants now supports hg38 and provides PolyPhen predictions for variants with rsIDs.
- RNA-seq / Differential expression using edgeR now uses the estimateDisp function to estimate dispersions.
New sequence analysis tools in Misc
- Sequence utilities / Remove redundant sequences clusters the given sequences and creates a non-redundant sequence set using the CDhit tool.
- Sequence alignment / Display sequence alignment or BLAST report as HTML page produces colored HTML alignments using the Mview tool.
Improvements to the user interface
- Interactive 3D scatterplots and PCA plots can be saved as png images.
- Interactive Volcano plot is available for edgeR and DESeq2 results.
Technical improvements
- The job manager component of Chipster server has been re-written in Java in order to get rid of random messaging errors, which caused problems for example when starting the Chipster client software.

Version update 10.7.2015: What is new in Chipster 3.5

Summary: Result files from several NGS analysis tools are automatically named according to the original sample name, saving you the hassle of renaming files. Data can be imported directly to Chipster server from a url.

Improvements to the user interface
- Genome browser allows viewing strand-specific coverage for stranded RNA-seq data based on the XS tag in BAM files.
New NGS analysis tools
- Utilities / Plot normalized counts for a gene. Plots normalized counts for a gene in a group-wise manner using the DESeq2 Bioconductor package.
- Utilities / Download file from a url to server. Allows you to move data and annotation information from a url directly to Chipster server for analysis.
- Utilities / Extract .gz file. Extracts gzipped files. Note that fastq and fasta files don't need to be unzipped for analysis.
- Utilities / Extract .tar.gz file. Extracts tarred and gzipped files. Note that fastq and fasta files don't need to be unzipped for analysis.
Improvements to NGS analysis tools
- Result files produced by PRINSEQ, Tophat, Bowtie2, BWA, Samtools and HTSeq have the original sample name.
- DESeq2-based tools have been migrated to R3.2.0.
- Variants / Annotate variants has been migrated to R3.2.0.
- Quality control / PCA and heatmap of samples with DESeq2 allows adding sample names to the PCA plot.
Improvements to microarray analysis tools
- Normalization / Illumina with normexp background correction now also accepts TargetIDs.

Version update 20.4.2015: What is new in Chipster 3.4

New analysis tools
- Utilities / Modify text. This tool can be used to modify txt and tsv files. For example, you can replace text or extract rows containing a given text.
- Normalization / Illumina with normexp background correction. This tool allows you to use the control probe information and perform background correction using the limma Bioconductor package.
Improvements to NGS analysis tools
- All genomes and GTF files have been updated to Ensembl 79. Medicago truncatula and Populus trichocarpa genomes have been added.
- RNA-seq tools edgeR, DESeq, DESeq2 and RseQC now produce a pdf file, which combines all the plots that were previously in individual files. This improves the clarity of the workflow view.
- RNA-seq / Differential expression using edgeR for multivariate experiments has a new interaction option called nested, which allows you to do comparisons both between and within subjects.
- RNA-seq / Count aligned reads per genes with HTSeq using own GTF now also accepts GTF files where the tag for gene identifier is ID or GeneID.
- Utilities / Table converter now also accepts tables with rownames (meaning that the first column does not have a title).
Improvements to microarray analysis tools
- Normalization / Illumina now allows you to add the original Illumina annotations to the data, in addition to the ones provided by Bioconductor.

Version update 12.3.2015: What is new in Chipster 3.3

Summary: This version brings a major improvement as you can close Chipster while an analysis job is running, and return to the same session later when the results are ready.

Improvements to the user interface
- You don't need to keep the Chipster client open when running long jobs.
New NGS analysis tools
- Utilities / Annotate Ensembl identifiers.
Improvements to NGS analysis tools
- Differential expression using edgeR does not give an error when there are no differentially expressed genes.
- Input fasta and GTF files can be zipped.
- edgeR tools: Result columns are better described in the manual.
- Manual page for NGS data import has been added.
Improvements to microarray analysis tools
- Linear modeling tools: Phenodata column names which share a word are allowed thanks to Oliver Heil, DKFZ.

Version update 9.2.2015: What is new in Chipster 3.2

New NGS analysis tools
Improvements to NGS analysis tools
- All genomes and GTF files have been updated to Ensembl 78. Cat genome has been added.
- miRBase21 mature miRNA sequences for human, rat and mouse have been added to Bowtie and BWA aligners.
- DESeq2 has been updated to R3.1.2. Textual summary has been added, the MA plot has been updated, and the dispersion fitting parameter has been removed.
- Tophat2 tools: A parameter for library type has been added. The parameter for concordant pairs has been removed due to a bug in Tophat itself. Insertion and deletion BED files have been removed from the output.
- The MeDIP-seq tool now accepts also mouse and rat data in addition to human.
- Read quality with FastQC: The output figures have been combined to one pdf file.
- Trim reads with Trimmomatic: The problem with missing output file for paired end runs has been fixed.
Improvements to microarray analysis tools
- Hierarchical clustering: The dummy image when no bootstrapping is performed has been removed.
- Volcano plot from existing results: The bug with missing negative fold changes has been fixed thanks to Oliver Heil, DKFZ.
- Affymetrix normalization: Support for Focus arrays has been added.
New miscellaneous tools
- Data retrieval / Retrieve sequences from NCBI.
- Utilities / Sort table by column value.
Improvements to the user interface
- The problem in using a Mac mouse with the Genome browser has been fixed.

Version update 6.11.2014: What is new in Chipster 3.1

Summary: This version contains important improvements to the user interface, file transfers, and reference data.

Improvements to the user interface
- Workflow layout is saved so that when you open a session, the files appear in the same positions where you placed them before.
New NGS analysis tools
- Utilities / A5 assembly pipeline for microbial genomes
- ChIP-seq and DNase-seq / Find peaks using MACS2: MACS2 has been made a separate tool and updated to use MACS v2.1.0. Parameters for extension size and broad peak calling have been added.
- CNA-seq / Count the number of aberrations per sample
- CNA-seq / Plot correlations of called copy number data
- CNA-seq / Fuse regions by column value
- CNA-seq / Fuse regions manually
Changes to NGS analysis tools
- All genomes and GTF files have been updated to Ensembl 77. This includes the new human genome assembly, GRCh38.
- BWA-based alignment tools have been updated to version 0.7.10.
- ChIP-seq and DNase-seq / Find peaks using F-seq: Mappability parameter has been added.
- The MeDIP-seq tool has been made available again after updating the MEDIPS package to R3.
Technical improvements
- SSL support has been extended to the payload of file transfers, so that not only the file metadata (name, ownership, etc.) but also the content of the file is transmitted encrypted. When enabled, payload encryption will have some impact on performance. For more information please see the technical manual.

Version update 13.8.2014: What is new in Chipster 3.0

Summary: This is a major update which contains improvements to the user interface functionality and looks. All the NGS analysis tools except variant annotation have been updated to use R3.0.2. Several new NGS analysis tools have been added, and many existing tools have been updated and improved. The virtual machine has been updated to use Ubuntu 12.04.

Improvements to the user interface
- Information on the selected data set is now shown in the visualization panel, which also includes shortcuts to the visualizations and the possibility to rename the file and add notes to it.
- "Batch mode" of running analysis tools: You can launch several identical analysis jobs at the same time, if the analysis tool requires only one input file: Select the files, set the parameters, and click on the "Run for each" button.
- "Batch mode" of running workflows: You can launch a workflow on several input files at the same time: Select the files and "Workflow / Run for each".
- Possibility to visualize user-supplied genomes in the genome browser has been added.
New NGS analysis tools
- RNA-seq / Differential expression using DESeq2: Differential expression analysis using the DESeq2 Bioconductor package.
- RNA-seq / Count reads per transcripts using eXpress: Quantitates expression at transcript level.
- RNA-seq / Count aligned reads per exons for DEXSeq using own GTF: Counts reads per non-overlapping exonic regions using a user-supplied GTF file.
- Preprocessing / Trim reads with Trimmomatic: Performs a variety of read trimming tasks.
- Alignment / TopHat2 for paired end reads and own genome: Aligns paired end reads to user-supplied reference genome.
- Alignment / TopHat2 for single end reads and own genome: Aligns single end reads to user-supplied reference genome.
- Utilities / Retrieve unique alignments from BAM: Retrieves unique alignments from BAM files.
Changes to NGS analysis tools
- RNA-seq / Differential expression using DESeq: GLM support has been added, so you can now analyze experiments containing a second experimental factor like pairing.
- RNA-seq / Count aligned reads per exons for DEXSeq: Reference organisms have been updated and new ones added. Parameters for strandedness and counting mode are included, as well as support for different chromosome naming schemes.
- Alignment / TopHat2 for paired end reads: New parameters to exclude discordant and mixed alignments have been added.
- Quality control / RNA-seq quality metrics with RseQC: Possibility to plot inner distance distribution has been added.
- ChIP-seq and FAIRE-seq / GO enrichment for list of genes: The tool can now analyze mouse, rat, fruitfly and arabidopsis data in addition to human, and it accepts Ensembl identifiers in addition to EntrezGene ones.
- ChIP-seq and FAIRE-seq / Find peaks using MACS: The effective genome sizes have been updated and a possibility to provide a user-defined effective genome size has been added. MACS has been updated to v1.4.2 and v2.0 can be chosen with a parameter.
- All GTF files have been updated to Ensembl 75.
- HTSeq has been updated to version 0.6.1.
- The MeDIP-seq tool category has been temporarily removed for updating.
Updates to microarray analysis tools
- Statistics / PCA: Component loadings are reported.

Version update 21.5.2014: What is new in Chipster 2.12.1

Small bug fix update.

Changes
- Fix Generate phenodata tool

Version update 13.5.2014: What is new in Chipster 2.12

This is a small update which includes a couple of bug fixes and a new genome added to some aligners.

Changes
- Schizosaccharomyces pombe genome (ASM294v2.22) has been added to TopHat2, Bowtie2, Bowtie, Cufflinks, Cuffcompare and Cuffdiff tools.
- Chipster's pdf viewer now works also on computers which have OpenJDK Java.
- A bug that prevented the use of gzipped files on single machine Chipster server installations has been fixed.
- Chipster server automatically rejects analysis tools which have identical names for input and output files.

Version update 7.4.2014: What is new in Chipster 2.11

This is a major update which combines v2.10 and v2.11. It provides 62 new analysis tools, the majority of which are in the brand new sequence analysis module. New tools have been added also in the NGS module, including the Dimont motif finding tools kindly contributed by Jan Grau (Martin Luther University Halle-Wittenberg). Microarray analysis tools have been migrated to use R3.0.2 and the Brainarray custom CDFs version 18, and some NGS command line tools have been updated to new versions. Many existing tools and visualizations have been improved. The Chipster client can now be launched with more memory if needed for visualization.

New sequence analysis module "Misc"
- Data retrieval contains tools for fetching sequences from databases.
- BLAST contains NCBI BLASTs and organism specific BLASTs using Ensembl.
- Sequence alignment contains pairwise and multiple sequence alignment tools such as MAFFT.
- Nucleotide sequence analysis contains tools for primer design, restriction mapping, finding open reading frames, etc.
- Protein sequence analysis contains pattern finding tools.
- Sequence utilities contains various utility tools from the EMBOSS package.
- Phylogeny contains PHYLIP tools.
- Other tools contains for example a tool for retrieving specified columns from a table.
Improvements to visualizations
- You can create subsets of VCF, GTF and BED files in the spreadsheet view. As for TSV files, select the rows, right-click and select "Create dataset".
- Spreadsheet view now shows the first 10,000 rows (instead of 2000).
- Genome browser copes better with high read piles. The maximum read track height has been set to 1000 reads (note that if you set "Coverage scale = automatic", you can check the actual height in the coverage track). You can also launch the Chipster client with more memory (3 GB or 6 GB) to improve viewing.
- Phenodata editor doesn't allow spaces in column names and changes them to underscores.
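
The column name rule in the last item above can be sketched in a few lines of Python. This is only an illustration of the described behaviour (spaces become underscores); the function name sanitize_column_name and the sample names are hypothetical and do not come from the Chipster client code.

```python
def sanitize_column_name(name: str) -> str:
    """Replace every space in a column name with an underscore."""
    return name.replace(" ", "_")


columns = ["sample", "chip type", "time point"]
cleaned = [sanitize_column_name(c) for c in columns]
print(cleaned)
```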
New NGS analysis tools
- ChIP-seq / Dimont sequence extractor: Extracts genomic regions in the annotated FastA format required by Dimont.
- ChIP-seq / Dimont sequence extractor using own genome: Extracts genomic regions from user's own fasta files.
- ChIP-seq / Find motifs with Dimont: De novo motif discovery with Dimont.
- ChIP-seq / DimontPredictor: Search sequences for matches with a motif discovered by Dimont.
- Quality control / RNA-seq BAM quality metrics with RseQC: Provides splice junction annotation, statistics for genomic features, coverage uniformity plot, and saturation plots for genes and splice junctions.
- RNA-seq / Compare assembly to reference using Cuffcompare: Compares assembly GTF to reference GTF.
- Utilities / Table converter: Extracts and renames table columns.
Updates to NGS analysis tools
- The tool category Filtering has been renamed to Preprocessing and it contains tools for trimming and filtering reads. Tool package names have been added for clarity, as the next release will contain also Trimmomatic tools. Tools for filtering data tables have been moved to the Utilities category.
- Alignment / Align reads with TopHat2: Human transcriptome index has been added in order to make the first alignment step faster.
- RNA-seq / Assemble transcripts using Cufflinks: GTF sorting and the possibility to perform RABT have been added.
- CNA-seq / Call aberrations from segmented copy number data: If cellularities (proportion of tumor cells in the samples) are known, it is now possible to correct for it by choosing the phenodata column containing the values.
- CNA-seq / Cluster called copy number data: Possible to skip clustering and plot a heatmap in the original sample order.
- CNA-seq / Detect genes from called copy number data: Now uses gene list from Bioconductor instead of CanGEM.
- CNA-seq tools now use a new preprocessing method, QDNAseq, for BAM files.
- PRINSEQ tools for read filtering and trimming use PRINSEQ's own paired end support and, in order to make the output files smaller, they do not fill the quality header lines.
- TopHat2 has been updated to version 2.0.10.
- PRINSEQ has been updated to version 0.20.4.
- DEXSeq read counting uses updated Python scripts.
Updates to microarray data analysis tools
- Normalization / Affymetrix gene arrays: Normalization to genes now uses custom CDFs. Support for the following arrays has been added: PrimeView, Drosophila 1.1, Caenorhabditis elegans, Canis familiaris, Oryza sativa (rice - ricegene1.1st), Human Transcriptome Array (HTA 2.0).
- Normalization / Affymetrix exon arrays: Support for Human Transcriptome Array (HTA 2.0) has been added. Normalization to genes now uses custom CDFs.
- Normalization / Illumina SNP arrays: Support has been added for many chiptypes and GEO format.
- Quality control / Affymetrix - using RLE and NUSE: Support for Affymetrix gene and exon arrays has been added.
- Normalization / Illumina methylumi: Three new parameters have been added to cope better with atypical columns, like those found in GEO.
- Statistics / Two group test: The RankProd method has been added.
- Statistics / Several group test: Phenodata groups are forced to be considered as factors.
- Statistics / Time series: P-value correction has been added to the time series function.
- Visualization / Chromosomal position: You can choose to visualize fold change values instead of expression values.
- Pathways / GO enrichment for miRNA targets: miRNA targets for mouse miRNAs are retrieved from the TargetScan database.
- Copy number aberrations / Call aberrations from segmented copy number data: If cellularities (proportion of tumor cells in the samples) are known, it is now possible to correct for it by choosing the phenodata column containing the values.
- Copy number aberrations / Cluster called copy number data: Possible to skip clustering and plot a heatmap in the original sample order.
- Copy number aberrations / Detect genes from called copy number data: Now uses gene list from Bioconductor instead of CanGEM.
Removed analysis tools
- Several PRINSEQ tools have been combined to tools "Filter reads for several criteria with PRINSEQ" and "Trim reads for several criteria with PRINSEQ" for more efficient analysis.

Version update 4.12.2013: What is new in Chipster 2.9

Chipster 2.9 contains many analysis tool updates and improvements to the genome browser. It also contains some new analysis tools and a tutorial for CNA-seq data analysis kindly contributed by Ilari Scheinin (VU University Medical Center Amsterdam). In order to make it easier for the developer community to integrate new analysis tools in Chipster, this new version provides a graphical tool editor environment and support for Python.

Improvements to Chipster analysis tool development
- Web based Tool editor for making it easier to write Chipster tool headers. In this first version, use copy/paste to get headers in and out of the Tool editor. Future functionality will include viewing and editing existing tools and easy deployment to Chipster. Available also in the Chipster virtual machine.
- Support for analysis tool scripts in Python has been added. Just like for R scripts, you can view the source code of the Python tools, and updated scripts can be instantaneously run in the Chipster client.
- R scripts have a new variable chipster.threads.max, which defines the number of threads for parallel tools. This centralized configuration according to available CPU resources provides shorter run times for analysis tools.
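
The idea of a centrally configured thread limit can be illustrated with a short Python sketch. The release note describes an R variable, chipster.threads.max. The Python function pick_thread_count and its parameter threads_max below are hypothetical stand-ins used only to show how a tool script might cap its thread count at a configured maximum.

```python
def pick_thread_count(requested: int, threads_max: int) -> int:
    """Return a thread count that is at least 1 and at most threads_max.

    threads_max is a hypothetical stand-in for a server-wide limit such as
    the R variable chipster.threads.max mentioned above.
    """
    if threads_max < 1:
        raise ValueError("threads_max must be at least 1")
    return max(1, min(requested, threads_max))


# A tool that would like 16 threads on a server configured for 4.
print(pick_thread_count(16, 4))
```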
Improvements to genome browser
- You can select features in visualized BED, VCF and tsv files by clicking on them. Information about the selected feature is shown in the side panel.
- BED score and color columns can be visualized.
- High resolution images can be exported.
- The "Show all reads" functionality is now track-specific.
- Resolved the bug that prevented genome browser visualization with a tsv file.
Improvements to other visualizations
- The chiptype column can now be edited in the Phenodata editor.
New NGS analysis tools
- Utilities / FASTA from BED: Extracts sequences from a FASTA file for each of the intervals defined in a BED file.
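
What extracting sequences for BED intervals involves can be sketched in plain Python. This is not the implementation behind the Chipster tool. It is a minimal illustration that assumes the standard BED convention of 0-based start and exclusive end coordinates, and the function names read_fasta and extract_intervals are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, keyed by the first header word."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def extract_intervals(fasta_text, bed_text):
    """Return (label, sequence) pairs for each BED interval.

    BED coordinates are 0-based with an exclusive end, which matches
    Python slicing directly.
    """
    genome = read_fasta(fasta_text)
    result = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        sequence = genome[chrom][int(start):int(end)]
        result.append((f"{chrom}:{start}-{end}", sequence))
    return result


fasta = ">chr1 example\nACGTACGT\nTTGGCCAA\n"
bed = "chr1\t2\t6\nchr1\t8\t12\n"
for label, sequence in extract_intervals(fasta, bed):
    print(f">{label}\n{sequence}")
```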
Updates to NGS analysis tools
- BWA, Tophat2, Tophat, Bowtie2, Bowtie, Cufflinks, Cuffmerge and Cuffdiff tools: The number of threads has been increased so the tools run faster.
- RNA-seq and miRNA-seq / Differential expression using edgeR for multivariate experiments: Raw counts are added to the output, allowing visualization of expression profiles etc.
- RNA-seq and miRNA-seq / Differential expression using edgeR: Tagwise dispersion is calculated using trended dispersion in order to cope better with outliers.
- Quality control / Read quality with FastQC: BAM files accepted as input.
- CNA-seq / Segment copy number data: Significance threshold parameter for accepting change points has been added.
- BWA, Tophat2, Tophat, Bowtie2 and Bowtie tools: Drosophila melanogaster genome added.
- Tophat2 has been updated to version 2.0.9.
- BEDtools has been updated to version 2.17.0.
- SAMtools has been updated to version 0.1.19.
- VCFtools has been updated to version 0.1.11.
- edgeR-based tools have been updated to version 3.4.0.
New microarray analysis tools
- Visualization / Annotated heatmap: Generates visually advanced heatmaps with sample information such as experimental group or other annotation and sample means. We strongly encourage users to migrate to this tool from the old Heatmap tool, which will become obsolete in the next Chipster version.
- Utilities / Merge expression and phenodata: Integrates expression and phenodata tables to a single output file that can be opened in Excel.
Improvements to microarray data analysis tools
- Pathways / Hypergeometric test for ConsensusPathDB: Support for mouse and yeast data has been added. The genes contributing to the significant pathways are now listed in the output. We thank Atanas Kamburov from the Max Planck Institute for Molecular Genetics for helping with the programmatic access to this tool.
- Statistics / Time series: MaSigPro analyses can now be conducted using any kind of phenodata. Support for replicates has been added to ICA and periodicity methods.
- Normalization / Affymetrix: Support for human U219 array has been added.
- aCGH and CNA-seq / Count overlapping CNVs: Updated to use the latest version of Database of Genomic Variants.
- aCGH / Segment copy number data: Significance threshold parameter for accepting change points has been added.
Removed analysis tools
- Pathways / IntAct and Reactome tools have been removed as EBI has discontinued the programmatic access that these tools used.

Version update 30.8.2013: What is new in Chipster 2.8

Chipster 2.8 contains many improvements to visualizations and some new analysis tools. It requires Java 1.7, and contains a fix to the file import problem that some Windows users experienced.

General
- This version requires Java 1.7. If you experience problems when launching Chipster, please check your Java version.
- A bug causing occasional issues when importing files in Windows has been fixed.
Improvements to genome browser
- Visualization of copy number data: Losses and gains are shown as red and blue boxes, respectively. Frequencies and log ratios are shown as line graph.
- Clearer display of gene names.
- Faster coverage calculation and display.
- Options for showing scale and type (total, strand-specific) of rough coverage at zoomed out level (> 50 kb).
- Spinning circle indicates that more data is being loaded to the track.
- Problems in faulty BAM files are tolerated (e.g. alignment extends past the end of a chromosome).
- Mitochondrial chromosome names M and MT are recognized as identical.
Improvements to other visualizations
- Interactive 3D scatter plot has more intuitive color scale and options to change the shape of the data points and the background color (new default is white).
- Interactive 3D scatterplot for PCA has the above mentioned changes and a new color scheme.
- BAM viewer is more tolerant to problems (e.g. alignment extends past the end of a chromosome).
New NGS analysis tools
- miRNA-seq / Correlate miRNA-seq and RNA-seq data. Detects miRNA target genes whose expression correlates with miRNA expression, either negatively or positively.
- Metagenomics / Statistical analysis for marker gene studies. Compares the diversity or abundance between groups using several ANOVA-type analyses. Also makes an RDA ordination plot and rank abundance and rarefaction curves.
Updates to NGS analysis tools
- RNA-seq / DEXSeq. Dispersion plot, MA-plot and a visualization of genes which contain differentially expressed exons have been added.
- Metagenomics / Classify sequences to taxonomic units using Mothur. Now generates a count table and a phenodata file, which you can use for further statistical analyses.
- Filtering / Filter table by column term. Now also supports tables which have rownames (= the first column doesn't have a title).
- Bowtie2 has been updated to version 2.1.0.
New microarray analysis tools
- Utilities / Delete and subtract columns. Deletes the specified column or columns from the data and subtracts their values from their associated samples.

Version update 28.6.2013: What is new in Chipster 2.7

Chipster 2.7 contains important updates to microarray tools and a new NGS tool category Metagenomics, which consists of Mothur tools for investigating bacterial composition from 16S rRNA data.

New NGS analysis tools
- Metagenomics / Summarize sequences with Mothur. Provides summary information on unaligned or aligned sequences.
- Metagenomics / Trim and filter sequences with Mothur. Trim and filter reads and remove duplicate reads.
- Metagenomics / Align sequences with Mothur. Given a fasta file of 16S rRNA sequences, aligns them to the Silva reference set.
- Metagenomics / Filter sequence alignment with Mothur. Filters out columns from a fasta formatted sequence alignment.
- Metagenomics / Extract unique aligned sequences with Mothur. Removes identical sequences from the alignment.
- Metagenomics / Precluster sequences with Mothur. Preclusters sequences in order to remove sequences that are likely to contain sequencing errors.
- Metagenomics / Remove chimeric sequences with Mothur. Removes chimeric sequences.
- Metagenomics / Classify sequences to taxonomic units with Mothur. Given a fasta file of aligned 16S rRNA sequences, assigns them to taxonomic units (~ species).
- Metagenomics / Extract sequences from SFF file. Helper tool to extract FASTA and QUAL files from a SFF file.
- Metagenomics / Merge FASTA or QUAL files. Helper tool to merge FASTA files or QUAL files together.
Updates to NGS analysis tools
- The dog genome has been updated to CanFam3 in the Bowtie, Bowtie2 and BWA scripts.
- Cufflinks tools have been updated to v2.1.1.
New microarray analysis tools
- Normalization / ComBat - batch normalisation. Enables simultaneous adjustment of multiple batch effects.
- Statistics / Linear modelling using user-defined design matrix. Advanced tool for applying your own design-matrices in limma analyses.
Updates to microarray analysis tools
- Normalization / Illumina. Detection p-value thresholds have been updated in the parameter "Produce flags". If you have used Illumina flag filtering for data from BeadStudio v2 or higher, we recommend rerunning the filter.
- Clustering / Classification. Outputs now also chip-class specific confusion matrices in addition to the class-prediction type of confusion matrices.
- Pearson correlation calculation with different Bioconductor functions in tools Visualization / Dendrogram, Visualization / Heatmap, and Statistics / Time series has been kindly unified by Oliver Heil, DKFZ.

Version update 27.5.2013: What is new in Chipster 2.6

Chipster 2.6 contains important updates to microarray and NGS analysis tools and some new NGS tools.

New NGS analysis tools
- Utilities / Extract samples from dataset. Makes a subset of count table for user defined samples.
- Utilities / Preprocess count table. Allows you to analyze external count tables in Chipster by edgeR and DESeq. Converts the table to Chipster format and produces a phenodata file for it.
Updates to NGS analysis tools
- RNA-seq and miRNA-seq / Differential expression using edgeR. Added filtering option and dispersion plot, made BED output optional, removed fixed prior.n. and the extra MA plots. Updated to R2.15.1-Bioconductor 2.11.
- RNA-seq and miRNA-seq / Differential expression using DESeq. Made BED output optional. Updated to R2.15.1-Bioconductor 2.11.
- RNA-seq / Differential expression using edgeR for multivariate experiments. Added filtering option and dispersion plot, fixed common dispersion method. Updated to R2.15.1-Bioconductor 2.11.
- RNA-seq / Count aligned reads per genes with HTSeq. Added possibility to include genomic location data to count table, which allows edgeR and DESeq to produce BED files. Added possibility to use BAM files with GTF files with different chromosome naming (e.g. chr1 vs 1).
- RNA-seq / Count aligned reads per genes with HTSeq using own GTF. Added possibility to include genomic location data to count table, which allows edgeR and DESeq to produce BED files.
- Utilities / Define NGS experiment. Added a possibility to leave out genomic location columns and to choose the column containing counts.
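
The chromosome naming mismatch mentioned above (e.g. chr1 vs 1) can be handled by comparing names after removing the "chr" prefix. The Python sketch below only illustrates that idea. How the HTSeq tool in Chipster reconciles the names is not described here, and the function names strip_chr_prefix and same_chromosome are hypothetical.

```python
def strip_chr_prefix(name: str) -> str:
    """Return the chromosome name without a leading 'chr', if present."""
    return name[3:] if name.startswith("chr") else name


def same_chromosome(bam_name: str, gtf_name: str) -> bool:
    """Compare two chromosome names while ignoring the 'chr' prefix."""
    return strip_chr_prefix(bam_name) == strip_chr_prefix(gtf_name)


print(same_chromosome("chr1", "1"))
print(same_chromosome("chr2", "1"))
```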
Genome browser improvements
- Drosophila melanogaster genome (BDGP5.70) has been added.
Updates to microarray analysis tools
- Normalization / Affymetrix gene arrays. Support added for human v2.0 arrays, mouse and rat v2.1 arrays, and zebrafish and arabidopsis v1.1 arrays.
- Normalization / Affymetrix SNP arrays. Support added for Affymetrix SNP 5.0 and 6.0 arrays. CRLMM has been updated to the latest version. cdfName and signal-to-noise ratio filtering features have been added.
- Normalization / Illumina SNP arrays. CRLMM method, cdfName and signal-to-noise ratio filtering features have been added.
- Normalization / Random effects. Noise feature has been added.

Version update 11.4.2013: What is new in Chipster 2.5

Chipster 2.5 contains a lot of new analysis and visualization functionality for NGS data and improvements to some microarray analysis tools. The new NGS data analysis tools include the TagCleaner package, F-seq peak detection software, and a large number of CNA-seq analysis tools kindly contributed by Ilari Scheinin (VU University Medical Center Amsterdam).

New NGS analysis tools
- ChIP-seq and FAIRE-seq / Find broad peaks using F-seq. Searches for statistically significantly enriched broad peaks, such as regions of open chromatin.
- Utilities / Predict primers/adaptors (TagCleaner). Predicts adaptor sequences in reads.
- Utilities / Statistics for primers/adaptors (TagCleaner). Calculates the number of adaptor sequences found allowing for different numbers of mismatches.
- Utilities / Trim primers/adaptors (TagCleaner). Trims adaptors allowing for mismatches.
- Utilities / Sort GTF. Sorts GTF files based on chromosomal location.
- Utilities / Sort VCF. Sorts VCF files based on chromosomal location.
- CNA-seq / Correct for GC content. Takes the counts per bin for a CNA-seq data set and corrects them for GC content.
- CNA-seq / Normalize copy number data. Normalizes copy number data before segmentation.
- CNA-seq / Segment copy number data. Replaces the tool "Segment and call copy number aberrations".
- CNA-seq / Call aberrations from segmented copy number data. Replaces the tool "Segment and call copy number aberrations".
- CNA-seq / Identify common regions from called copy number data. Reduces dimensionality of called copy number data by identifying common breakpoints.
- CNA-seq / Plot copy number aberration frequencies. Makes a frequency plot of copy number aberrations in each group.
- CNA-seq / Cluster called copy number data. Clusters samples based on copy number aberrations.
- CNA-seq / Group tests for called copy number data. Statistical tests between two or more groups for called copy number data.
- CNA-seq / Survival test for called copy number data. Logrank test for survival with called copy number data.
- CNA-seq / Plot survival curves for called copy number data. Plot Kaplan-Meier survival curves for called copy number data.
- CNA-seq / Detect genes from called copy number data. Converts data from bin-based to gene-based using chromosomal locations.
- CNA-seq / GO enrichment for called gene copy numbers. Hypergeometric test for enrichment of GO terms in frequently aberrated genes.
- CNA-seq / Match copy number and expression features. Matches copy number and expression data using chromosomal locations.
- CNA-seq / Plot profiles of matched copy number and expression. Plots profiles of two previously matched data sets of copy number and expression.
- CNA-seq / Test for copy number induced expression changes. Nonparametric testing for changes in expression induced by a change in DNA copy number.
- CNA-seq / Plot copy number induced gene expression. Plots the expression level of individual genes for a copy number vs. expression comparison.
- CNA-seq / Add cytogenetic bands. Adds cytogenetic band information using chromosome names and start/end base pair positions.
- CNA-seq / Count overlapping CNVs. Counts overlapping CNVs from the database of genomic variants.
Updates to NGS analysis tools
- Rat genome rn5 (Rnor_5.0) has been added to all aligners.
- miRBase19 sequences for human, mouse and rat have been added to Bowtie and Bowtie2.
Genome browser improvements
- Support for GTF files: You can view GTF features in the browser as a separate track. You can also use a GTF file for rapidly navigating through a list of variants in the browser: Open GTF file as a spreadsheet and detach it. Clicking on chromosome position in the spreadsheet navigates the browser to that position in the genome.
- Center line has been added.
- When viewing BAM files, it is not necessary to select the index file (.bai) if it has the same name.
- Rat genome rn5 (Rnor_5.0) has been added.
- Scrolling to the beginning of chromosome has been made easier.
- VCF files can be viewed also in the absence of BAM files.
- Reference sequence is shown also in the absence of BAM files.
- Coverage track shows SNP locations also when the viewing of reads is disabled.
- A bug in showing human MT reverse strand annotations has been fixed.
New microarray analysis tools
- Copy number aberrations / Fetch probe positions from GEO. Fetches microarray probe positions from the GEO database.
- Copy number aberrations / Normalize copy number data. Normalizes copy number data before segmentation.
- Copy number aberrations / Plot copy number aberration frequencies. Makes a frequency plot of copy number aberrations in each group.
Updates and bug fixes to microarray analysis tools
- Utilities / Merge tables. Changed to cope better with descriptions including line-feeds, quotes and other unusual characters.
- Normalization / Process prenormalised Affy. Parameters "keep.flags" and "keep.annotations" have been added to control keeping data-specific flags and annotations.
- Normalization / Illumina. Changes to parameter names.

Version update 4.2.2013: What is new in Chipster 2.4

Chipster 2.4 contains changes to the user interface and analysis functionality. New genomes have been added to several analysis tools and to the genome browser.

Changes to the user interface
- Contact support option has been added to the Help menu. This provides an easy way for the users on CSC's Chipster server to send their analysis sessions for problem solving.
Genome browser improvements
- Annotations and user data have been divided into separate panels and a scrollbar has been added.
- Chicken, cow and human mitochondrial genome have been added.
- Ability to navigate with variant annotation files (all-variants.tsv and coding-variants.tsv) has been added. If you detach the file and click on the value in the START-column, genome browser moves to that location.
New NGS analysis tools
These tools are based on Cufflinks 2.0.2. The previous Cufflinks tool has been renamed "Differential expression using Cufflinks v1.0.3".
Updates and bug fixes to NGS analysis tools
- Variants / Call SNPs and short INDELs. Reference genomes for pig, chicken and cow have been added.
- Variants / Filter and analyze variants. Minimum quality filter has been added.
- Alignment / BWA and Bowtie2 tools: Reference genomes for Arabidopsis lyrata and pig have been added.
- Alignment / TopHat tools: Parameter "Use annotation GTF" has been added.
- Alignment / BWA for paired end reads. A bug in unzipping FASTQ files has been fixed.
- Alignment / BWA-SW for single end reads. This analysis tool has been removed due to a bug in BWA itself.
Updates to microarray analysis tools
- Normalization / Agilent miRNA: This analysis tool has been reintroduced.

Version update 13.11.2012: What is new in Chipster 2.3

Chipster 2.3 contains changes to the user interface and analysis functionality. NGS alignment tool Bowtie2 has been added, and changes have been made to existing alignment tools. Sheep genome oar3.1 has been added to alignment tools and genome browser. Support for some Agilent and Illumina microarrays has been added.

Changes to the user interface
- Naming of datasets in the workflow panel. The "boxes" indicating files are now named with the extension of the file name. As before, the box is colored according to the tool category that produced that file, and the full filename is displayed in the datasets panel. We hope that this change makes selecting files easier in NGS result sets, which typically contain several file types. Please let us know what you think.
New NGS analysis tools
New microarray analysis tools
- Normalization / Methylumi. Supports both Illumina 27k and 450k arrays.
Updates and bug fixes to NGS analysis tools
- Sheep genome oar3.1 has been added to all alignment tools: Bowtie1, Bowtie2, TopHat, TopHat2, BWA.
- Alignment / BWA for paired end reads and own genome. A bug in paired end read handling has been fixed. Please rerun your alignments.
- Alignment / Bowtie(1) for paired end reads, Bowtie(1) with paired end reads and own genome. Bug resulting in empty result files has been fixed.
- Utilities / Filter reads for adaptors, length and Ns (FastX Clipper). New parameters have been added.
Updates to microarray analysis tools
- Normalization / Agilent 1-color, Agilent 2-color: Support has been added for Agilent Fruit fly and Human G4851A arrays.

Version update 27.9.2012: What is new in Chipster 2.2.0

Chipster 2.2.0 contains a new tool category Variants with tools for analyzing VCF files. Visualization of these files is supported already (see the 12.9.2012 release notes). There are also new tools for RNA-seq analysis, and many tools have been updated to use Ensembl v68 annotations. The full list of NGS analysis tools is available here.

New NGS analysis tools
- RNA-seq / Map aligned reads to exons for DEXSeq. This tool counts the reads that fall into each non-overlapping exonic part.
- RNA-seq / Differential exon expression using DEXSeq. This Bioconductor-based tool infers differential exon usage.
- Variants / Call SNPs and short INDELs. This SAMtools and bcftools -based tool combines the old variant calling tools, allowing you to call variants from one or multiple individuals.
- Variants / Filter and analyze variants. This VCFtools-based tool allows you to filter and analyze variants in a VCF file.
- Alignment / TopHat2 for paired end reads. This tool uses TopHat 2.0.4 and Bowtie 2. We will keep TopHat 1.3.0 available for a while, but encourage you to use TopHat 2 for new analysis sessions.
- Alignment / TopHat2 for single end reads. As above.
Updated NGS analysis tools
- RNA-seq / Differential expression with Cufflinks. Added mouse genome mm10, updated GTFs to Ensembl 68. Note that Cufflinks version is still 1.0.3, but we are working on updating it to 2.0.2.
- RNA-seq / Map aligned reads to genes using HTSeq. Added mouse genome mm10, updated GTFs to Ensembl 68.
- Alignment / TopHat for paired end reads and single end reads. Updated GTFs to Ensembl 68.

Version update 12.9.2012: What is new in Chipster 2.1.0

Chipster 2.1.0 contains major improvements in the genome browser and important new NGS data analysis functionality (see below). The full list of NGS analysis tools is available here.

Genome browser improvements
- Several new genomes added.
- Support for VCF files: You can view variant positions in the browser as a separate track. You can also use a VCF file for rapidly navigating through a list of variants in the browser: Open VCF file as a spreadsheet and detach it. Clicking on chromosome position in the spreadsheet navigates the browser to that position in the genome.
- Improved visualization of INDELs.
- Improved visualization of automatically calculated coverage.
- Links to Ensembl and UCSC genome browser.
New NGS analysis tools
- RNA-seq / Differential expression using edgeR for multivariate experiments. This tool complements the existing edgeR by allowing you to analyze data from more complex experimental designs.
- Utilities / Annotate variants. This R/Bioconductor-based tool allows you to annotate human variants in a VCF file.
Updated NGS analysis tools
- PRINSEQ QC, filtering and trimming tools have been updated to use PRINSEQ version 0.19.3, which makes them run faster.
- Paired-end support of the tool "Filter reads for several criteria" has been updated to support HiSeq FASTQ format.
- Bowtie: Mouse genome mm10 has been added.
- TopHat: Mouse genome mm10 has been added.

Version update 6.7.2012: What is new in Chipster 2.0.2

Chipster 2.0.2 contains new analysis tools for variant calling (SAMtools) and quality control, trimming and filtering of reads (PRINSEQ). The full list of NGS analysis tools is available here.

New NGS analysis tools
- Call SNPs and short INDELs from one diploid individual
- Read quality statistics with PRINSEQ
- Filter reads for length (PRINSEQ)
- Filter reads for Ns (PRINSEQ)
- Filter reads for low complexity (PRINSEQ)
- Filter reads for duplicates (PRINSEQ)
- Filter reads for several criteria (PRINSEQ). This tool combines the above mentioned filters, and also includes a possibility to filter paired reads so that the filtered files contain the paired reads in matching order.
- Trim reads by quality (PRINSEQ)
- Trim reads for poly-A/T tails (PRINSEQ)
- Trim reads for several criteria (PRINSEQ)
Updated NGS analysis tools
- All aligners: Unzipping has been added, so you can give zipped FASTQ as input.
- Bowtie: Genomes for Dog (UCSC canFam2) and Gasterosteus aculeatus (BROADS1.67) have been added.
- TopHat: Parameter for the standard deviation of inner distance has been added.
New microarray analysis tools
- Annotation / Add genomic location to data. Adds chr, start and end location to the data.
Updated microarray analysis tools
- Pathways / Hypergeometric test for GO. If the data was normalized using custom chiptype (altCDF) for chips hgu133a or hgu133a2, the same mapping is used in testing.

Version update 24.1.2012: What is new in Chipster 2.0

Chipster 2.0 contains a comprehensive collection of analysis tools for next generation sequencing (NGS) data. Visualization options now include a built-in genome browser, allowing you to view reads and results in their genomic context. Importantly, also the analysis session handling has been improved.

NGS functionality
The following analysis packages and several R-based tools are now available under the NGS tab in the Analysis tool panel. For more information about the individual tools, please see the NGS tool manual.
- Quality control: FastQC and FASTX.
- Utilities and genomic region matching: BEDTools and SAMtools.
- Alignment: Bowtie, BWA and TopHat.
- RNA-seq: edgeR, Cufflinks, HTSeq.
- miRNA-seq: edgeR, pathway analysis for target genes, correlate with target expression.
- ChIP-seq: MACS, motif detection and matching against JASPAR, retrieval of nearby genes and pathway analysis for them.
- CNA-seq: Count and plot copy number profile, compare to reference.
- MeDIP-seq: MEDIPS.
Built-in genome browser
- Allows you to view reads and results in their genomic context using Ensembl annotations.
- Zooms into nucleotide level and highlights SNPs.
- Calculates coverage automatically.
Improved analysis session handling
- Analysis sessions are automatically saved in the background, so if your session stops unexpectedly, you can always retrieve your data from the server.
- Analysis sessions use a new format so that saved sessions are functional even if the analysis tools that were used for creating them have changed meanwhile. The new sessions have ending .zip. The old sessions (.cs) won't work in Chipster 2.0, but we will keep the Chipster version 1.4.7 running so that people can access also their old sessions. Note that Chipster 2.0 RC was the release candidate version of Chipster 2.0, so those sessions are compatible. We strongly encourage all users to move to Chipster 2.0 now, and use Chipster 1.4.7 only when working with old .cs sessions.

Version update 16.5.2011: What is new in Chipster 1.4.7

The main change is new R/Bioconductor and annotation packages: All the R/Bioconductor-based analysis tools have been updated and now run under R 2.12.1 and Bioconductor 2.7. Please note that while Chipster 1.4.7 is focused on microarray and proteomics data analysis, new tools for next generation sequencing data are constantly added to Chipster 2.0.

New analysis tools
- Utilities / Extract data for miRNA targets. This tool extracts gene expression data from a gene expression data set, based on the result list from one of the miRNA to gene expression correlation tools.
Changes to analysis tools
- Utilities / Search by gene name: A new parameter to control whether to perform exact matching for the query term has been added.
- Utilities / Calculate fold change: New parameters have been added to give users the choice to use either arithmetic or geometric mean and whether to output the results in linear or base 2 logarithmic scale.
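The two choices offered by the new Calculate fold change parameters can be illustrated with a minimal Python sketch. This is not the Chipster tool script (the Chipster tools are R-based); the function name `fold_change`, its arguments and the assumption that the input values are on a linear scale are all hypothetical and serve only to show how the averaging method and the output scale interact.

```python
import math
from statistics import geometric_mean, mean


def fold_change(control, treatment, use_geometric=False, log2_output=True):
    """Fold change of treatment over control for one gene.

    Assumption: the input values are on a linear (not log) scale.
    """
    # Choose the averaging method: arithmetic or geometric mean.
    average = geometric_mean if use_geometric else mean
    ratio = average(treatment) / average(control)
    # Report the result either in base 2 logarithmic or in linear scale.
    return math.log2(ratio) if log2_output else ratio


control = [100.0, 400.0]
treatment = [400.0, 1600.0]
print(fold_change(control, treatment))                      # arithmetic mean, log2 scale
print(fold_change(control, treatment, use_geometric=True))  # geometric mean, log2 scale
print(fold_change(control, treatment, log2_output=False))   # arithmetic mean, linear scale
```

The geometric mean is less sensitive to a single very high value than the arithmetic mean, which is one reason a user might prefer it for expression data.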
Obsolete analysis tools
- Utilities / Extract genes from clustering: This tool has been dropped since its functionality is now provided in the tool Preprocessing / Filter using a column value.
- Utilities / Extract genes using a p-value: This tool has been dropped since its functionality is now provided in the tool Preprocessing / Filter using a column value.
- Utilities / Filter by column: This tool has been dropped since its functionality is now provided in the tool Preprocessing / Filter using a column value.
- Utilities / Merge tables: This tool has been dropped since its functionality is now provided in the tool Utilities / Merge datasets.
Visualization changes
- All tools that generate static graphical visualizations have been modified to yield output in PDF format, allowing for high-quality and scalable plots.

Version update 7.12.2010: What is new in Chipster 1.4.6

New analysis tools
- Utilities / Extract genes from KEGG pathway. This tool can be used to retrieve the genes that map to a pathway determined significant by pathway analysis tools such as Hypergeometric test for KEGG and Gene set test. Note that there is a similar tool also for GO analysis results (Utilities / Extract genes from GO term).
Changes to analysis tools
- Clustering / Hierarchical: The maximum number of genes/samples to be clustered is increased to 20 000.
- Visualization / Dendrogram: The maximum number of samples to be clustered is increased to 20 000.
- Visualization / Heatmap: The maximum number of genes/samples to be clustered is increased to 20 000.
- Utilities / Sort samples: This tool was originally intended for ordering samples in a desired way for publication images of result gene lists, and as such it was not suitable for being used prior to statistical testing. It has been modified so that it now creates a new phenodata file, making it suitable to be used at any stage of analysis.

Version update 14.10.2010: What is new in Chipster 1.4.5

This release completes the aCGH analysis functionality in Chipster. The aCGH functionality, kindly contributed by Ilari Scheinin (University of Helsinki), has now passed the beta testing phase, and the tools also have their manual pages. Please note that as "beta testing" has been removed from the category name, the workflows created with the test version won't be functional.

New analysis tools
Changes to analysis tools
- Pathways / Hypergeometric test for GO: Now includes parameters for p-value adjustment method, GO category type (BP, MF, CC), minimum size of category, and conditional testing. Results are given both as html and as a table which can be used for further filtering.
- Pathways / GO enrichment for miRNA targets: Now includes parameters for p-value adjustment method, GO category type (BP, MF, CC), minimum size of category, and conditional testing. Results are given both as html and as a table which can be used for further filtering.
- Annotations / Find miRNA targets: Modified the output to allow use of downstream tools, such as Venn Diagram visualization to compare output from different gene target databases and select consensus genes.
- Preprocessing / Filter using a column value: Removed the restrictive range from the cutoff parameter.
- Statistics / Calculate descriptive statistics: Modified the behavior for chips so that it now calculates the statistics for all columns.
- Statistics / Gene set test: Modified to exclude single-gene gene sets. Group labels have been added to plots, and the results table now contains more information columns.
- Utilities / Combine probes to genes: Modified to work on input that contains results from the Annotations / Add annotations to data tool.
- Utilities / Average replicate chips: Modified so that it generates a new phenodata file, enabling downstream analysis of the averaged data.
- Utilities / Extract samples from dataset: Modified to include annotations and to exclude samples with no class assigned.

Version update 20.4.2010: What is new in Chipster 1.4.4

Fixed the 3D scatter plot bug of Chipster 1.4.3

Version update 9.4.2010: What is new in Chipster 1.4.3

New analysis tools

This release completes the miRNA analysis functionality in Chipster and also includes a whole new set of tools for aCGH data. The aCGH functionality, kindly contributed by Ilari Scheinin (University of Helsinki), also allows you to integrate aCGH data with expression data.
- Annotation / Find miRNA targets: Fetches the predicted gene targets for a list of miRNA identifiers in miRanda, miRBase, miRtarget2, PicTar, TarBase and TargetScan.
- Statistics / Up-down analysis of miRNA targets: Given a miRNA expression dataset and a mRNA expression dataset for a two-group comparison experiment, this tool identifies the genes whose expression is down-regulated in response to an up-regulated miRNA, or vice-versa.
- aCGH tools: This category is labelled as beta testing because there might still be some changes to the tool names and the manual pages are not yet available. To read more please see recent course slides. The CanGEM database is used for data import, and probe position and cytoband information. Note that for analyzing your own aCGH data, you can normalize it with the Normalize / cDNA tool and then fetch the probe positions from CanGEM to allow the subsequent analysis. The following tools are available:
  - Import from CanGEM
  - Call copy number aberrations from aCGH data
  - Plot copy number profiles from called aCGH data
  - Identify common regions from called aCGH data
  - Test for DNA copy number induced differential expression
  - Plot combined profiles of copy number and expression
  - Plot copy-number-induced gene expression
  - Fetch probe positions from CanGEM
  - Add cytogenetic bands
  - Count overlapping CNVs: Compares the found CNVs to the Database of Genomic variants
  - Sample size calculations with an adapted BH method
- Preprocessing / Filter using a column value: Filters the data based on values in the specified column. This can be used for filtering e.g. by fold change after statistical testing: Selecting "outside" and a cutoff of 1 would give all the genes that are two fold up- or down-regulated (as the FC column is in log2 scale).
- Preprocessing / Filter using a column term: Filters data based on terms in a specified text column. This can be used for example for retrieving genes belonging to a certain pathway or gene ontology category after running the "Add annotations to data" tool.
- Normalize / Process prenormalized Affy: Converts normalized Affymetrix data to Chipster format and creates a phenodata table for it.
- Utilities / Intersect lists: Identifies the intersection, or union, between two or three data tables that share one or more columns with common identifiers.
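The "outside" example given above for Preprocessing / Filter using a column value can be sketched in a few lines of Python. This is only an illustration, not the tool's own script: the function name `filter_outside`, the row layout and the choice to keep values exactly at the cutoff are assumptions.

```python
def filter_outside(rows, column, cutoff):
    """Keep rows whose value in `column` is at or beyond -cutoff or +cutoff.

    Assumption: values equal to the cutoff are kept (inclusive boundaries).
    """
    return [row for row in rows if row[column] <= -cutoff or row[column] >= cutoff]


# Hypothetical result table; FC is in log2 scale, so 1 means two-fold.
genes = [
    {"identifier": "gene_a", "FC": 1.8},
    {"identifier": "gene_b", "FC": 0.4},
    {"identifier": "gene_c", "FC": -1.2},
    {"identifier": "gene_d", "FC": -0.9},
]

for row in filter_outside(genes, "FC", 1):
    print(row["identifier"], row["FC"])
```

Because the FC column is in log2 scale, a cutoff of 1 corresponds to a two-fold change in either direction, and a cutoff of 2 would correspond to a four-fold change.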
Changes to analysis tools
- Normalization / Agilent: Flags can be handled in normalization and subsequently used in analysis.
- Utilities / Search by gene name: Option to include or exclude the specified genes.
- Utilities / Sort samples: Gene symbols and descriptions are kept in the result file.
- Utilities / Combine probes to genes: Gene symbols and descriptions are kept in the result file.
Changes to visualizations
- Venn diagram: Possibility to combine datasets using the gene symbol column, instead of the identifier column. This allows you to intersect gene lists from different array platforms
- 3D scatter for PCA: Possibility to color genes according to cluster, p-value, fold change etc.
- Volcano plot: The y-axis -log(p) has been changed to use log10 instead of ln.
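The effect of the volcano plot change from ln to log10 can be checked with a short Python sketch. The function names `volcano_y` and `volcano_y_old` are hypothetical and only illustrate the two scalings of the y-axis.

```python
import math


def volcano_y(p_value):
    """Current y-axis value: -log10(p)."""
    return -math.log10(p_value)


def volcano_y_old(p_value):
    """Previous y-axis value: -ln(p)."""
    return -math.log(p_value)


for p in (0.05, 0.01, 0.001):
    print(p, round(volcano_y(p), 3), round(volcano_y_old(p), 3))
```

The two scales differ only by the constant factor ln(10), so the shape of the plot is unchanged. With log10, each unit on the y-axis corresponds to a ten-fold decrease in the p-value.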

Version update 11.2.2010: What is new in Chipster 1.4.2

New functionality
- Ready-made analysis workflows for miRNA and proteomics data. They can be started from the top panel menu Workflow/ Run from Chipster repository. More information about the content of the workflows is available here
- Statistics / Correlate miRNA with target expression. Read more....
Changes to analysis tools
- Clustering / KNN classification: Improved validation of classifiers with a test set.
Changes to visualizations
- Hierarchical clustering: Color scheme has been changed to blue-red in order to cater for color blind users. If you prefer the old green-red scheme, please note that the colors can be easily changed by right-clicking on the heatmap and selecting Properties /Plot / Heatmap coloring.
- Possibility to add gene / protein annotations to visualizations. If you are analyzing custom chip or proteomics data, Chipster cannot automatically generate the gene symbols displayed in visualizations. However, you can now mark any column as annotation in the Import tool, and these annotations will be displayed together with identifiers in the interactive visualizations.

Version update 20.1.2010: What is new in Chipster 1.4.1

New analysis tools
- Pathways / GO enrichment for miRNA targets. Read more...
- Pathways / KEGG enrichment for miRNA targets. Read more...
- Utilities / Import from ArrayExpress. This tool imports Affymetrix raw data directly from ArrayExpress and normalizes it using the RMA algorithm.
- Quality control / Affymetrix exon arrays - using RLE and NUSE
- Normalization / Normalize to chip average
- Normalization / Normalize to gene average
Changes to analysis tools
- Pathways / Association to Reactome pathways. Support for UniProt identifiers added to enable analysis of proteomics data.
- Pathways / Protein interactions from IntAct. Support for UniProt identifiers added to enable analysis of proteomics data.
- Pathways / Hypergeometric test for ConsensusPathDB. Support for UniProt identifiers added to enable analysis of proteomics data.
- Pathways / Hypergeometric test for GO. In order to avoid identification of directly related GO terms with considerable overlap of genes, this tool now uses a conditional method which removes the genes of the significant child categories before testing their parents.
- Pathways / Hypergeometric test for cytobands. In order to avoid identification of related regions with considerable overlap of genes, this tool now uses a conditional method which removes the significant genes of subregions before testing the larger regions.
- Quality control / Affymetrix basic. Spike-in performance plot added.
- Annotation / Agilent, Affymetrix or Illumina gene list. Possibility to annotate lists of identifiers (with no expression data)
- Statistics / Linear modelling. Parameter called "significance" has been removed, as p-values are always calculated for interactions if they are taken into account in the model.
- Fixes to promoter analysis tools so that they cope better with R2.9.
Visualization news:
- 3D scatter plot for PCA: Samples can be colored based on phenodata columns which contain text (in addition to numbers).

Version update 11.11.2009: What is new in Chipster 1.4.0

New R/Bioconductor version, updated annotation packages

All the analysis tool scripts have been updated to use R 2.9. Consequently the annotation and pathway tools have been updated to use annotation packages in the .db -format. The following versions of the annotation packages are currently in use:
New analysis tools
- Pathways / Hypergeometric test for ConsensusPathDB. Read more...
- Pathways / Association to Reactome pathways. Read more...
- Pathways / Protein interactions from IntAct. Read more...
- Normalization / Agilent miRNA. Read more...
Changes to analysis tools
- Normalization / Affymetrix exon arrays: Gene symbol and description columns are added, when the data is normalized at gene level. Larger data sets can be analyzed as the normalization is now running on a cluster node with 16 GB of memory.
- Normalization / Affymetrix gene arrays: Larger data sets can be analyzed as the normalization is now running on a cluster node with 16 GB of memory.
- Normalization / Process prenormalized: Additional annotation columns are retained in the data.
Visualization news:
- 3D scatter plot for PCA: After running principal component analysis (PCA) for samples as a quality control, this visualization allows you to color the data points according to experimental group (or another column in the phenodata table, such as batch).
Import tool can be used to convert decimal separator (e.g. from comma to dot) in the files to be imported to Chipster
Bug fixes to session and workflow functionality.

Version update 17.7.2009: What is new in Chipster 1.3.0

Support for new chip types
- Affymetrix Human, Mouse and Rat Gene 1.0 ST arrays. These arrays can be normalized and annotated (see below). Of the Affymetrix quality control tools only the RLE - NUSE tool is currently suitable for them.
New analysis tools
- Normalization / Affymetrix gene arrays
- Annotation / Affymetrix gene ST genelist
Visualization news:
- Hierarchical / heatmap:
  - Possibility to select genes (and create a new dataset out of them) by drawing a box on the heatmap
  - The image includes gene symbols in addition to the probe names
  - Bigger image with scroll bars to view large heatmaps. Ticking the box "Fit to the screen" shows the whole heatmap.
  - Changing the heatmap colors (right click and select properties/ plot/ heatmap coloring) has been made clearer
- Expression profiles: Possibility to select genes (and create a new dataset out of them) by drawing a box on the image

Version update 27.3.2009: What is new in Chipster 1.2.4

New analysis tools
- Statistics/ Ordination-ca: Performs a detrended correspondence analysis. This ordination method can be used for example in quality control and time series analysis.
- Clustering/ Classification: Adds many new classification methods to Chipster, including varieties of discriminant analysis, neural nets, support vector machines and naive Bayes. The tool does not yet implement validation using a separate test set, only cross-validation.
- Annotation/ miRNA target annotation: Provides miRBase IDs and target predictions (by TargetScan and MIRANDA) for probes from Agilent miRNA arrays.
- Annotation/ Add annotations to the data: Provides the same annotations (chromosome location, pathway involvement etc.) as the Annotate gene list -tool, but the annotation columns are appended to the expression data file.
- Normalization/ Normalize to specific genes: Normalizes the data to specific genes given in a separate gene identifier list. The identifier list must have a title row with text "identifier" and contain gene identifiers used in the data file. An average of these genes is calculated, and the expression values of all genes are adjusted using this average.
- Normalization/ Process prenormalized: Allows an easy import of data which has been normalized by some other software. This tool converts the data to Chipster format (by adding the text "chip." in front of the expression value columns) and generates the phenodata file. The normalized data file needs to be imported to Chipster using the Import tool.
- Pathways/ Hypergeometric test for cytobands: Performs a hypergeometric test for enrichment of genes to certain chromosomal positions (cytobands).
- Utilities/ Sort genes: Sorts genes based on a specified column. (As before, you can also sort genes in the spreadsheet view of the visualization panel by clicking on the column names. However, as the visualization panel only shows a maximum of 20 000 rows at a time, it is not able to sort larger data sets.)
- Utilities/ Extract genes: Extracts the specified number of genes from the top of the data table. For example, one can first run the Calculate descriptive statistics -tool, sort genes based on the standard deviation column, and then extract the 50 top-most genes.
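The workflow described for Utilities/ Extract genes (calculate descriptive statistics, sort by standard deviation, extract the top genes) can be sketched in Python as follows. This is an illustration only, not the Chipster scripts: the function name `top_variable_genes` and the table layout (identifier mapped to a list of expression values) are assumptions.

```python
from statistics import stdev


def top_variable_genes(table, n):
    """Return the identifiers of the n genes with the largest standard deviation.

    `table` maps a gene identifier to its expression values across chips.
    """
    # Sort identifiers by standard deviation, largest first.
    ranked = sorted(table, key=lambda gene: stdev(table[gene]), reverse=True)
    # Extract the specified number of genes from the top of the sorted list.
    return ranked[:n]


# Hypothetical expression table with three chips per gene.
expression = {
    "gene_a": [7.0, 7.1, 6.9],
    "gene_b": [5.0, 9.0, 13.0],
    "gene_c": [8.0, 9.0, 10.0],
}
print(top_variable_genes(expression, 2))
```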
Support for new chip types
- Illumina Human HT-12 (when importing these files to Chipster with the Import tool, select the PROBE_ID column as an identifier)
- (Agilent miRNA arrays, see the annotation tool described above)
Solved problems
- In some institutions Chipster started normally but no tools could be used (empty error message). This was caused by a proxy server restricting network traffic. This problem has now been solved by a proxy bypass feature.
- Venn diagram didn't allow creating new datasets if one of the files contained only gene identifiers (and no expression columns). This has now been fixed, allowing users to filter datasets with their own gene lists. The column containing the identifiers has to be named "identifier".

Version update 5.2.2009: What is new in Chipster 1.2.3

Chipster is updated automatically also when you start it using the desktop icon (but we still recommend you to launch it from the web page, as this way you can read the latest announcements as well)
Normalization/ Affymetrix exon arrays: Updated the script to use Bioconductor 2.2 in order to overcome the affyio bug.

Version update 8.1.2009: What is new in Chipster 1.2.2

New analysis tools
- Statistics/ Adjust P-values: Adjusts raw p-values in the selected column for multiple testing using a specified method.
- Normalization/ Normalize to specific samples: Normalizes data to specific samples. The samples to be normalized are coded with 1 in one column of the phenodata. The samples to be normalized to are coded with 0 in the same column.
- Normalization/ Illumina SNP arrays: Illumina SNP array preprocessing. Input should be a tab-delimited text file with genotype calls. Typically such a file is created using GenCall software from Illumina.
- Statistics/ Association analysis: Association tests for normalized SNP array data. Runs a Chi-square test for every SNP. Hardy-Weinberg equilibrium is tested in controls only. Association tests use the sample grouping information in the group column of the phenodata. Association tests are run for genotype frequencies and for dominant and recessive models.
- Utilities/ Combine probes to genes: Calculates an average for probes or probesets for each gene in the dataset. The data file has to have a symbol column for this to work correctly. After running this tool, only expression values and gene symbols are retained in the data, all other columns and information are lost.
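The averaging performed by Utilities/ Combine probes to genes can be illustrated with a minimal Python sketch. It is not the tool's own script: the function name `combine_probes_to_genes`, the row layout and the column names are hypothetical. As in the tool description, only gene symbols and expression values survive in the result.

```python
from collections import defaultdict
from statistics import mean


def combine_probes_to_genes(rows, chip_columns):
    """Average the expression values of all probes that share a gene symbol."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["symbol"]].append(row)
    # For each gene, average every chip column over its probes.
    return {
        symbol: {chip: mean(r[chip] for r in members) for chip in chip_columns}
        for symbol, members in grouped.items()
    }


# Hypothetical data: two probes for TP53, one for BRCA1.
probes = [
    {"identifier": "p1", "symbol": "TP53", "chip.a": 2.0, "chip.b": 6.0},
    {"identifier": "p2", "symbol": "TP53", "chip.a": 4.0, "chip.b": 8.0},
    {"identifier": "p3", "symbol": "BRCA1", "chip.a": 5.0, "chip.b": 5.5},
]
print(combine_probes_to_genes(probes, ["chip.a", "chip.b"]))
```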
Changes to analysis tools
- Preprocessing/ Filter by CV: The genes whose CV is bigger than the median CV are kept.
- Normalization/ Illumina: The normalization method parameter has been renamed to "normalize.chips".
- Normalization/ Illumina -lumi pipeline: The normalization method parameter has been renamed to "normalize.chips".
- Promoter analysis/ Weeder: Filtering added so that if there are several probe sets for the same transcript (RefSeq id), the promoter sequence is used only once.
- Promoter analysis/ Cosmo: Filtering added so that if there are several probe sets for the same transcript (RefSeq id), the promoter sequence is used only once.
- Promoter analysis/ ClusterBuster: Filtering added so that if there are several probe sets for the same transcript (RefSeq id), the promoter sequence is used only once.
- Promoter analysis/ Retrieve promoters: Filtering added so that if there are several probe sets for the same transcript (RefSeq id), the promoter sequence is retrieved only once.
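The rule used by Preprocessing/ Filter by CV (keep the genes whose coefficient of variation is bigger than the median CV) can be sketched in Python. This is an illustration under stated assumptions, not the tool's script: the function name `filter_by_cv` and the table layout are hypothetical, the comparison is assumed to be strict, and the mean expression of every gene is assumed to be positive.

```python
from statistics import mean, median, stdev


def filter_by_cv(table):
    """Keep genes whose CV (standard deviation / mean) exceeds the median CV.

    Assumption: every gene has a positive mean expression value.
    """
    cv = {gene: stdev(values) / mean(values) for gene, values in table.items()}
    threshold = median(cv.values())
    return {gene: values for gene, values in table.items() if cv[gene] > threshold}


# Hypothetical expression table with three chips per gene.
expression = {
    "gene_a": [10.0, 10.0, 10.0],  # CV 0.0
    "gene_b": [8.0, 10.0, 12.0],   # CV 0.2
    "gene_c": [2.0, 10.0, 18.0],   # CV 0.8
}
print(sorted(filter_by_cv(expression)))
```

With a strict comparison, roughly the more variable half of the genes is retained, and a gene whose CV equals the median is dropped.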
Visualization news:
- Venn diagram: Gene symbols added to the gene list in the "selected" tab, better merging in data set creation
- 3D scatter plot: Gene symbols added to the gene list in the "selected" tab
- Volcano plot: Better scaling, gene symbols added to the gene list in the "selected" tab
- Phenodata editor: Possibility to copy the contents of one cell to many cells

Version update 11.11.2008: What is new in Chipster 1.2.1

New functionality and visualization:
- Interactive volcano plot
Changes to analysis tools
- Statistics/ Linear modelling: More descriptive parameter names. Phenodata column names are added to the result files for clarity. P-values and fold changes are given also as separate files, so that they can be used for box plots, clustering, etc.
- Statistics/ NMDS: The sample names are colored according to the group column from the phenodata. An additional result file is created where the sample names are taken from the description column of the phenodata (defaults to original file names, can be renamed in Chipster).
- Statistics/ One sample tests: By default the chips are scaled to the same mean before running the test, but this can be avoided by setting the new parameter scale.to.same.mean to "no".
- Utilities/ Calculate descriptive statistics: Writes the descriptives also in a separate file, which can be used for drawing histograms and boxplots.
- Preprocessing/ Filter by expression: By default the chips are scaled to the same mean before filtering, but this can be avoided by setting the new parameter scale.to.same.mean to "no".
- Normalization/ Agilent 1-color: The normalization method parameter has been renamed to "normalize.chips".
Annotation support for new chip types:
- Illumina Human V3
- Illumina Mouse V2
- Agilent Zebrafish V2
- Agilent Rice (not comprehensive annotations)

Version update 13.10.2008: What is new in Chipster 1.2.0

New analysis tools
Visualization news:
- Possibility to create gene lists by selecting datapoints from images
- Interactive Venn diagram
- Data points selected in one visualization stay selected in the next one
- Possibility to open visualizations in a separate window
- Possibility to change sample names in visualizations
- Gene symbols are automatically available in 2D scatter plot, SOM visualization and spreadsheets
- Expression profiles are colored according to the expression level
- Annotations for data points selected in images are more comprehensive and not limited to Affymetrix data
Support for new array types
Possibility to save multiple analysis sessions (workspaces)
More flexible analysis workflow saving
General improvements
- Improved Workflow view: automatic sizing, better layout
- The limit of concurrent analysis jobs per user is increased from 5 to 10
- For a complete list of changes, please see the release notes.

New analysis tools

ROTS (reproducibility-optimized test statistic) ranks genes in order of evidence for differential expression for two-group comparisons. This tool was kindly contributed by Dr Laura Elo (please cite Elo L, Filén S, Lahesmaa R and Aittokallio T. 2008 IEEE/ACM Transactions on Computational Biology and Bioinformatics 5: 423-431). Read more...
SAFE is a tool for analysis of over- or under-representation of genes in KEGG pathways. It takes both over-representation and expression into account, and the user can define the minimum pathway size to be considered. Read more...
lumi pipeline normalization for Illumina data. Lumi offers new normalization methods such as rsn and loess. It uses BeadSummaryData files as input, so your raw data must be in one file. The filename must end with ".txt", and you should NOT use the Import tool for bringing the data into Chipster (in the Import files window, change the action to "Import directly"). Read more...
Delete columns enables deletion of unwanted columns from the data, for example after using the Merge tables tool. Read more...
Random sampling enables random sampling of genes or chips from the data, for example before hierarchical clustering of large datasets.
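
As a rough illustration of the idea behind random sampling (not Chipster's own implementation), the following Python sketch draws a subset of gene identifiers without replacement. The function name random_sample, the seed parameter and the gene identifiers are hypothetical and chosen only for this example.

```python
import random


def random_sample(items, sample_size, seed=None):
    """Return sample_size distinct items drawn at random, without replacement.

    Hypothetical helper for illustration; a fixed seed makes the draw repeatable.
    """
    items = list(items)
    if not 0 <= sample_size <= len(items):
        raise ValueError("sample_size must be between 0 and the number of items")
    rng = random.Random(seed)
    return rng.sample(items, sample_size)


# Made-up gene identifiers standing in for the rows of a large dataset.
gene_ids = [f"gene_{i}" for i in range(1, 1001)]

# Keep 100 randomly chosen genes, for example before hierarchical clustering.
subset = random_sample(gene_ids, 100, seed=42)
print(len(subset))
```

Sampling chips instead of genes would work the same way, with the list of column names in place of the gene identifiers.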

Visualization news

Possibility to create gene lists by selecting datapoints from images. You can create new gene lists by selecting data points in 2D and 3D scatter plots, Venn diagram and spreadsheet. After selecting data points with the mouse, go to the Selected-tab and click on the "Create dataset" button (for spreadsheets highlight the gene identifiers, right click, and choose "Create dataset").
Interactive Venn diagram. If you select 2-3 datasets (by keeping the control key down) you can visualize them as an interactive Venn diagram and thus create new datasets based on the image.
Data points selected in one visualization stay selected in the next one. Selecting data points for example in a scatter plot will also highlight the same genes in a subsequent spreadsheet view and vice versa.
Possibility to open visualizations in a separate window. Clicking the "Detach" button will open your current visualization in a separate window. This allows you to have several visualizations open at the same time.
Possibility to change sample names in visualizations. The phenodata file has a new column called Description. This column is used for sample names in hierarchical clustering and expression profile visualizations. By default it contains the original chip names, but you can type in any description you like.
Gene symbols are automatically available in 2D scatter plot, SOM visualization and spreadsheets. During normalization both the gene symbol and the description are added to the spreadsheet.
Expression profiles are colored according to the expression level
Annotations for data points selected in images are more comprehensive and not limited to Affymetrix data. The Annotate button in the Selected-tab of 2D/3D scatter plots and Venn diagram now creates a gene list of the selected data points and annotates it using Chipster's annotation tool, which is based on Bioconductor (instead of launching the GeneCruiser service by Broad Institute).

Support for new array types

Human Genome U133 Plus 2.0 Array. Chipster's Affymetrix normalization and quality control scripts have been changed to use R 2.7.1 so that these functions are also available for Human Genome U133 Plus 2.0 Arrays.
Agilent drosophila
Agilent rhesus monkey
Note that even if your array is not listed in supported chip types, you can still analyze it with Chipster. For other Illumina and Agilent arrays (such as miRNA arrays) simply choose "empty" as the chiptype during normalization, and Chipster will automatically calculate the mean of all probes which have the same identifier. The only exception is Affymetrix chips, because they require the CDF and probe packages for the summarization to work.
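
The averaging of probes that share an identifier can be pictured with the short Python sketch below. It is only an illustration under assumed inputs, not the code Chipster runs: the function average_by_identifier, the probe names and the expression values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_identifier(rows):
    """Average expression values of probes that share an identifier.

    rows is an iterable of (identifier, [one value per sample]) pairs.
    Returns a dict mapping each identifier to its per-sample means.
    """
    grouped = defaultdict(list)
    for identifier, values in rows:
        grouped[identifier].append(values)
    return {
        identifier: [mean(column) for column in zip(*value_lists)]
        for identifier, value_lists in grouped.items()
    }


# Made-up data: two samples, with one identifier measured by two probes.
probes = [
    ("miR-21", [8.0, 9.0]),
    ("miR-21", [10.0, 11.0]),
    ("miR-155", [5.0, 6.0]),
]

averaged = average_by_identifier(probes)
print(averaged)
```

Here the two miR-21 probes are collapsed into a single row holding the per-sample means, while miR-155 is left unchanged because it has only one probe.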

Possibility to save multiple analysis sessions (workspaces)

In order to continue your work later on, you have to save your analysis session (workspace). Saving the session will save all the datasets and their relationships. In Chipster 1.2, a session is packed into a single compressed file with the extension .cs (for Chipster Session). This file is saved on your computer, but you can also take it with you and continue your work on another computer by copying the session file there. Session files also allow you to share your work with a colleague. Chipster 1.2 allows you to save multiple analysis sessions separately, and you can save the session files anywhere you like.

To save a session select File->Save session. A previously saved session can be loaded by selecting File->Open session. By default the current data is cleared before another session is loaded, but you can also combine sessions by selecting "Add to current session" from the session file dialog.

Note! Sessions are an extended version of the previous workspace system. If you have saved a workspace with an earlier Chipster version, you can open it by selecting File->Open workspace (session) saved with Chipster 1.1. Unlike the old workspace system, the new session system also allows you to create workflows from datasets that were loaded from a session, and you can view all the details for them (including the source code) in the analysis history.

More flexible analysis workflow saving

Workflows allow you to automate your analysis steps, and also share analysis pipelines with collaborators. A workflow is a description of the analysis steps that you have run on the currently selected dataset. If you have run a workflow that you would like to reuse or perhaps share with a colleague, you should save it by selecting its starting point dataset and choosing Workflow->Save starting from selected. In Chipster 1.2 you can save the workflow file anywhere you like. You can also change its name, but the file extension has to be .bsh.

You can apply the same workflow to another normalized dataset by selecting Workflow->Open and run, or Workflow->Run recent (if you saved the workflow during the same analysis session or if it is located under nami-workfiles in the chipster-scripts folder).

General improvements

Improved Workflow view: automatic sizing, better layout
The limit of concurrent analysis jobs per user is increased from 5 to 10