2022 May 31;23:207. doi: 10.1186/s12859-022-04739-2

Table 5.

Comparison of the data structures used to compute the goProfiles test and the test based on enrichment contingency tables.

	Non-enriched in both lists			Enriched only in list 1			Enriched only in list 2			Enriched in both lists
GO term number	1	$\dots$	$a = n_{00}$	$a + 1$	$\dots$	$b = a + n_{10} = n_{\cdot 0}$	$b + 1$	$\dots$	$c = b + n_{01}$	$c + 1$	$\dots$	$c + n_{11} = n$
Annotation frequency in gene list 1	$F_{11}$	$\dots$	$F_{1 a}$	$F_{1 (a + 1)}$	$\dots$	$F_{1 b}$	$F_{1 (b + 1)}$	$\dots$	$F_{1 c}$	$F_{1 (c + 1)}$	$\dots$	$F_{1 n}$
Annotation frequency in gene list 2	$F_{21}$	$\dots$	$F_{2 a}$	$F_{2 (a + 1)}$	$\dots$	$F_{2 b}$	$F_{2 (b + 1)}$	$\dots$	$F_{2 c}$	$F_{2 (c + 1)}$	$\dots$	$F_{2 n}$
Enrichment in list 1	0	$\dots$	0	1	$\dots$	1	0	$\dots$	0	1	$\dots$	1
Enrichment in list 2	0	$\dots$	0	0	$\dots$	0	1	$\dots$	1	1	$\dots$	1

In the latter test, the annotation frequencies are replaced by 0 and 1 (i.e., “non-enriched” and “enriched” GO term, respectively), and if the test is based on the Sorensen–Dice similarity, the first set of GO terms (non-enriched in both lists) is ignored. The GO terms are arbitrarily ordered: from left to right, first come all those non-enriched in both lists ($n_{00}$ in total), next those enriched in the first list but not in the second ($n_{10}$), then those enriched in the second list but not in the first ($n_{01}$), and finally those enriched in both lists ($n_{11}$).
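
As a minimal illustration of the enrichment-based data structure, the following Python sketch collapses two 0/1 enrichment rows into the counts $n_{00}$, $n_{10}$, $n_{01}$ and $n_{11}$ and evaluates the standard Sorensen–Dice similarity $2 n_{11} / (2 n_{11} + n_{10} + n_{01})$, in which $n_{00}$ does not appear. The function names and the example vectors are hypothetical, chosen only for illustration; this is not the goProfiles implementation and it does not perform the statistical test itself.

```python
from collections import Counter


def enrichment_contingency(enriched_1, enriched_2):
    """Count GO terms by joint enrichment status in two gene lists.

    Both arguments are 0/1 sequences indexed by the same GO terms
    (hypothetical inputs mirroring the two "Enrichment" rows of the table).
    """
    if len(enriched_1) != len(enriched_2):
        raise ValueError("both indicator vectors must cover the same GO terms")
    counts = Counter(zip(enriched_1, enriched_2))
    return {
        "n00": counts[(0, 0)],  # non-enriched in both lists
        "n10": counts[(1, 0)],  # enriched only in list 1
        "n01": counts[(0, 1)],  # enriched only in list 2
        "n11": counts[(1, 1)],  # enriched in both lists
    }


def sorensen_dice_similarity(table):
    """Standard Sorensen-Dice similarity; n00 plays no role in it."""
    denom = 2 * table["n11"] + table["n10"] + table["n01"]
    if denom == 0:
        raise ValueError("no GO term is enriched in either list")
    return 2 * table["n11"] / denom


# Made-up example with 10 GO terms, ordered as in the table:
# 3 non-enriched in both, 2 only in list 1, 3 only in list 2, 2 in both.
list_1 = [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]
list_2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

table = enrichment_contingency(list_1, list_2)
similarity = sorensen_dice_similarity(table)
```

Because only the four counts enter the computation, the left-to-right ordering of GO terms is irrelevant, and appending further terms that are non-enriched in both lists changes $n_{00}$ but leaves the similarity unchanged.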