Normalization / Random effects (LME)

Description

Removes batch effects from the data.

Parameters

Phenodata column for group effects
Phenodata column for random effects
Error handling method
Standard deviation to be used in generating random noise

Details

This tool removes random (batch) effects, for example where samples cluster by preparation day rather than by the biological group under study, using linear mixed-effects modelling. To use this tool, first add a new phenodata column that encodes the random effect as numbers (e.g. 1 = first hybridization day, 2 = second hybridization day, etc.). Please note that this kind of random-effects modelling is recommended only when the random effect column contains at least four different values.

One linear mixed-effects model is fitted to each gene, and the residuals of the models are returned. These residuals are the corrected expression values, which can then be analysed further as usual. At the moment, only a random intercept is supported for the random effect.
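
As a rough illustration, a random-intercept model for a single gene could be written as below. The notation is an assumption made here for clarity and is not taken from the tool itself: y_{ij} is the expression of sample i in batch j, g(i) is the group of sample i from the group effects column, and b_j is the intercept of batch j from the random effects column. The second line assumes that the residual is taken after subtracting both the fitted group effect and the predicted batch intercept; the tool's exact definition of the residual may differ.

```latex
% Assumed random-intercept model for one gene (notation is illustrative)
y_{ij} = \mu + \beta_{g(i)} + b_j + \varepsilon_{ij},
\qquad b_j \sim \mathcal{N}(0, \sigma_b^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Residual under the assumption stated above
\hat{\varepsilon}_{ij} = y_{ij} - \hat{\mu} - \hat{\beta}_{g(i)} - \hat{b}_j
```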

For some genes, especially when there are only a few replicates per batch, fitting the linear mixed-effects model can fail. At present, genes causing failures can be handled in three ways. First, failing genes can be removed from the data. Second, the expression values of failing genes can be replaced with NAs in the output. Third, a small amount of random noise can be added to the original expression values of the failing genes prior to the fit. With the last option, the random numbers added to the original expression values are drawn from a normal distribution with mean zero and standard deviation equal to the random.noise parameter.
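
The three ways of handling a failing gene can be sketched as follows. This is a minimal Python illustration, not the tool's own code: the function names and the option labels "remove", "na" and "noise" are hypothetical, and the sd argument only plays the role of the random.noise parameter described above.

```python
import math
import random


def add_noise(values, sd, rng=None):
    """Return a copy of values with N(0, sd) noise added to each entry."""
    if rng is None:
        rng = random.Random()
    return [v + rng.gauss(0.0, sd) for v in values]


def handle_failed_gene(values, method, sd=0.01, rng=None):
    """Apply one failure-handling strategy to one gene (hypothetical helper).

    "remove" -> None, meaning the gene is dropped from the output
    "na"     -> a list of NaN values of the same length
    "noise"  -> the original values plus small normal noise, ready for a refit
    """
    if method == "remove":
        return None
    if method == "na":
        return [math.nan] * len(values)
    if method == "noise":
        return add_noise(values, sd, rng)
    raise ValueError("unknown method: " + repr(method))


# Example: one gene measured in four samples whose model fit failed.
gene = [7.91, 8.02, 7.88, 8.10]
noisy = handle_failed_gene(gene, "noise", sd=0.01, rng=random.Random(1))
```

In this sketch the noisy values would be passed back to the model fit in place of the original values; a small standard deviation keeps the change to the expression values minor.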

Output

A tab-delimited text file containing gene names, expression estimates and call values ("flags"). This file is suitable for all further analyses.