Find the marker genes for a specific cluster compared to another cluster or clusters.

- Number of the cluster of interest [1]
- Cluster to compare to [all others]
- Min.pct [0.25]
- Differential expression threshold for a cluster marker gene [0.25]
- Which test to use for finding marker genes [wilcox]

The Seurat function FindMarkers is used to identify positive and negative marker genes for the cluster of interest, which is chosen by the user. By default, differentially expressed genes are tested between the cluster of interest and all the other cells. You can also compare the cluster of interest to another cluster or clusters by typing the numbers of those clusters in the parameter field. When comparing to a group of clusters, separate the cluster numbers with a comma (,).

You can filter out genes prior to statistical testing by requiring that a gene is expressed in at least a certain fraction of cells in either of the two groups (min.pct=0.25). You can also require that the expression differs by at least a certain amount, on average, between the groups (thresh.test=0.25). Both of these parameters can be set to 0, but this dramatically increases the running time, because a large number of genes that are unlikely to be highly discriminatory will then be tested.
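The pre-filtering idea can be illustrated with a minimal Python sketch. This is not Seurat code: the function name `passes_prefilter`, its arguments, and the use of a plain difference of group means as the effect size are assumptions made for illustration, and Seurat's own calculation may differ.

```python
def passes_prefilter(expr_cluster, expr_other, min_pct=0.25, threshold=0.25):
    """Return True if a gene would be kept for statistical testing.

    expr_cluster / expr_other are hypothetical inputs: the expression values
    of one gene in the cells of the cluster of interest and in the
    comparison cells.
    """
    # Fraction of cells with detectable (non-zero) expression in each group
    pct_1 = sum(1 for x in expr_cluster if x > 0) / len(expr_cluster)
    pct_2 = sum(1 for x in expr_other if x > 0) / len(expr_other)

    # Keep the gene only if it is detected in enough cells of either group
    if max(pct_1, pct_2) < min_pct:
        return False

    # Simplified effect size (assumption): absolute difference of group means
    mean_1 = sum(expr_cluster) / len(expr_cluster)
    mean_2 = sum(expr_other) / len(expr_other)
    return abs(mean_1 - mean_2) >= threshold


# Detected in only 1 of 8 cluster cells and in no other cells: filtered out
rare_gene = passes_prefilter([0, 0, 0, 0, 0, 0, 0, 1.2], [0] * 10)

# Detected in 3 of 4 cluster cells with a clear difference in means: kept
marker_like_gene = passes_prefilter([2.0, 1.5, 0.0, 1.8], [0.0, 0.2, 0.0, 0.0])

print(rare_gene, marker_like_gene)
```

Setting both `min_pct` and `threshold` to 0 in this sketch lets every gene through, which mirrors why the running time grows when the filters are switched off.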

The marker genes for each cluster are written in the **markers.tsv** file.

Seurat currently implements the following tests:

- "wilcox": Wilcoxon rank sum test (default)
- "bimod": Likelihood-ratio test for single cell gene expression (McDavid et al., Bioinformatics, 2013)
- "roc": Standard AUC classifier
- "t": Student's t-test
- "tobit": Tobit-test for differential gene expression (Trapnell et al., Nature Biotech, 2014)
- "poisson": Likelihood ratio test assuming an underlying Poisson distribution. Use only for UMI-based datasets
- "negbinom": Likelihood ratio test assuming an underlying negative binomial distribution. Use only for UMI-based datasets
- "MAST": GLM-framework that treats cellular detection rate as a covariate (Finak et al., Genome Biology, 2015)
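To give a feel for what the default rank-based test and the AUC measure, here is a minimal Python sketch of the Mann-Whitney U statistic, which underlies the Wilcoxon rank sum test, and of the AUC derived from it. It only illustrates the statistic for one gene; it does not compute a p-value and is not how Seurat implements these tests. The function names and the example values are hypothetical.

```python
def mann_whitney_u(group_1, group_2):
    """Count the pairs in which a group_1 value exceeds a group_2 value.

    Ties contribute 0.5 each.
    """
    u = 0.0
    for x in group_1:
        for y in group_2:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc(group_1, group_2):
    """AUC of using this gene alone to separate group_1 from group_2."""
    return mann_whitney_u(group_1, group_2) / (len(group_1) * len(group_2))


# Hypothetical expression values of one gene
cluster_cells = [2.1, 1.7, 0.0, 3.0]
other_cells = [0.0, 0.4, 0.0]

u_statistic = mann_whitney_u(cluster_cells, other_cells)
gene_auc = auc(cluster_cells, other_cells)
print(u_statistic, round(gene_auc, 3))
```

An AUC close to 1 means the gene is almost always higher in the cluster of interest, a value close to 0 means it is almost always lower, and 0.5 means the gene does not separate the two groups.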

The **markers.tsv** result file contains marker genes and associated statistics **for all the clusters**:

- **p-val** = p-value for the differentially expressed gene (the larger the p-value, the higher the likelihood that the gene is in the list just by chance)
- **avg_logFC** = average log fold change (how much higher or lower the expression of this gene is in the particular cluster, compared to all the other cells)
- **pct.1** = what percentage of the cells in the particular cluster show some expression for this gene
- **pct.2** = what percentage of the cells **not** in the particular cluster (= all the other cells) show some expression for this gene
- **p-val_adj** = adjusted/corrected p-value. This value is multiple testing corrected: when we test thousands of genes, we would statistically start getting some significantly differentially expressed genes just by chance. There are different methods to correct for this; here a *Bonferroni* correction is used. When filtering the table and reporting your results, use this value.
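The effect of the Bonferroni correction can be sketched in a few lines of Python. The gene names, p-values, and the number of tests below are made-up illustration values, and the function name `bonferroni` is hypothetical; the correction itself multiplies each p-value by the number of tests and caps the result at 1.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical raw p-values for three genes, assuming 100 genes were tested
raw = {"GeneA": 1e-6, "GeneB": 0.001, "GeneC": 0.04}
adjusted = dict(zip(raw, bonferroni(list(raw.values()), n_tests=100)))

# Genes passing a 0.05 cutoff before and after the correction
significant_raw = [gene for gene, p in raw.items() if p < 0.05]
significant_adjusted = [gene for gene, p in adjusted.items() if p < 0.05]

print(significant_raw)
print(significant_adjusted)
```

In this made-up example all three genes pass the cutoff on the raw p-values, but only GeneA remains after the correction, which is why the adjusted value is the one to filter on.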

For more details, please check the Seurat tutorials.

- markers.tsv : Marker genes for the cluster of interest