Performs a statistical test for enrichment of GO terms in a query list of genes. The input file should come from the tool "Find unique and annotated genes".
This tool takes a list of Entrez Gene or Ensembl identifiers, such as the unique-genes-list.tsv output file produced by the "Find nearest genes for regions" tool. A hypergeometric test is used to check whether GO categories are over-represented (or under-represented) in the query list relative to the whole genome. The analysis can be restricted to GO terms that have at least a user-specified number of genes mapped to them (5 by default).
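The hypergeometric test described above can be illustrated with a small sketch. This is not the tool's own code; the function name and arguments are hypothetical, and it assumes the "universe" is the set of all annotated genes in the genome. It computes the probability of seeing at least (over-representation) or at most (under-representation) the observed number of query genes in a GO category.

```python
from math import comb

def hypergeom_pvalues(universe_size, category_size, query_size, hits):
    """Return (over, under) p-values for one GO category.

    universe_size: number of genes in the background (assumed: whole genome)
    category_size: background genes annotated to the category
    query_size:    genes in the query list
    hits:          query genes annotated to the category
    """
    total = comb(universe_size, query_size)

    def pmf(i):
        # Probability of exactly i annotated genes in a random query list.
        return comb(category_size, i) * comb(universe_size - category_size, query_size - i) / total

    lowest = max(0, query_size - (universe_size - category_size))
    highest = min(query_size, category_size)
    over = sum(pmf(i) for i in range(hits, highest + 1))   # P(X >= hits)
    under = sum(pmf(i) for i in range(lowest, hits + 1))   # P(X <= hits)
    return over, under

# Toy numbers: 10 genes in the universe, 4 in the category,
# 3 genes in the query list, 2 of which are in the category.
over, under = hypergeom_pvalues(10, 4, 3, 2)
```

A small "over" value suggests the category is enriched in the query list, while a small "under" value suggests it is depleted.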
To avoid reporting directly related GO terms whose gene sets overlap considerably, this tool uses conditional testing by default. First, the leaves of the GO graph (terms with no child terms) are tested. All genes annotated at significant children are then removed before the parent terms are tested. This continues until all terms have been tested. The tests can also be run unconditionally by setting the appropriate parameter. In that case, multiple testing correction can also be applied.
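The conditional procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's actual implementation: the function names, the input dictionaries, and the fixed significance threshold alpha are all hypothetical, and only over-representation is tested. Terms are visited children first, and genes annotated at significant children are removed from a parent before the parent is tested.

```python
from math import comb

def over_pvalue(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, K of which are annotated."""
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def conditional_enrichment(children, term_genes, query, universe, alpha=0.05):
    """children: term -> list of child terms (hypothetical input format)
    term_genes: term -> set of genes annotated to the term or its descendants
    """
    pvalues, significant = {}, set()
    query = query & universe

    def visit(term):
        if term in pvalues:
            return
        kids = children.get(term, ())
        for child in kids:          # test children before their parent
            visit(child)
        genes = set(term_genes[term]) & universe
        for child in kids:          # drop genes explained by significant children
            if child in significant:
                genes -= term_genes[child]
        p = over_pvalue(len(universe), len(genes), len(query), len(genes & query))
        pvalues[term] = p
        if p < alpha:
            significant.add(term)

    for term in term_genes:
        visit(term)
    return pvalues, significant

# Toy example: the parent term is enriched only because of its child.
universe = {f"g{i}" for i in range(1, 21)}
term_genes = {
    "parent": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "child": {"g1", "g2", "g3", "g4"},
}
children = {"parent": ["child"]}
query = {"g1", "g2", "g3", "g4"}
pvalues, significant = conditional_enrichment(children, term_genes, query, universe)
```

In this toy example, only the child term is reported as significant. Tested unconditionally, the parent would also appear enriched, because it inherits all four query genes from the child.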
The output consists of a text file and an HTML file listing the significant categories.