Validation of our tumorigenesis-specific hypermethylated DMRs. Findings were validated against publicly accessible Illumina 450K microarray datasets. a. The GEO dataset GSE48684 published by Luo et al. [72]. Left: Heatmap of the 89 female colorectal tissue samples included in the dataset. Each row corresponds to one of our tumorigenesis-specific hypermethylated DMRs containing at least one Illumina microarray probe (total: 5329 of the 5521 hypermethylated DMRs we classified as highly tumorigenesis-specific; see Results). The beta value (from 0, blue, to 1, yellow) reported for each DMR is the average beta across all microarray probes located within the coordinates of that DMR (details in the Methods section). Our tumorigenesis-specific DMRs displayed hypermethylation in the tumours (vs. normal mucosa) studied by Luo et al. Metadata available for the GSE48684 dataset include tissue (normal-H: normal mucosa samples from patients with no history of CRC; normal-C: normal mucosa samples from patients with concurrent CRC; adenoma: cADN; cancer: CRC) and colon segment (Right, Proximal and Transverse: proximal colon; Left and Distal: distal colon). Donor ages were not reported. Right: ROC curves showing the high accuracy of tumour classification as cADN (AUC: 93.8%) or CRC (AUC: 93.2%) based on the median of DMR beta values (optimal cut-off: 0.2; specificity 100%, sensitivity 86.2% for cADNs; specificity 100%, sensitivity 85.4% for CRCs), as described in the Methods section. AUC, sensitivity (TPR) and specificity (1-FPR) for each tumorigenesis-specific DMR are also shown (colour annotation on the right side of the heatmap). b. The GEO dataset GSE131013 published by Díez-Villanueva et al. [73]. Left: Heatmap of the 78 female colorectal tissue samples in this dataset based on the beta values for 5322 of our highly tumorigenesis-specific hypermethylated DMRs (as described for panel a).
Our tumorigenesis-specific DMRs displayed hypermethylation in the tumours (vs. normal mucosa) studied by Díez-Villanueva et al. Metadata available for the GSE131013 dataset include age (young: <40 years, 1 donor; middle-aged: 40–70 years, 48 women; old: >70 years, 29 women), tissue (Mucosa: normal mucosa samples from healthy donors; Normal: normal mucosa samples adjacent to CRC; Tumour: CRC) and colon segment (Left: distal colon; Right: proximal colon). Right: ROC curve showing the accuracy of the median of DMR beta values (optimal cut-off: 0.2; specificity 92.3%, sensitivity 73.1%) in classifying samples as CRC (AUC: 84.5%). AUC, sensitivity (TPR) and specificity (1-FPR) for each tumorigenesis-specific DMR are also shown (colour annotation on the right side of the heatmap).
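
The scoring summarised in this legend (probe betas averaged per DMR, the median of DMR betas taken per sample, and samples classified against a cut-off) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is described in the Methods section. All function names and data structures here are hypothetical, and two details are assumptions: DMR coordinates are treated as inclusive, and a sample is called tumour when its score is strictly above the cut-off.

```python
from statistics import mean, median


def dmr_betas(probe_betas, probe_pos, dmrs):
    """Average the beta values of all probes lying inside each DMR.

    probe_betas: {probe_id: beta} for one sample
    probe_pos:   {probe_id: (chrom, position)}
    dmrs:        list of (chrom, start, end); inclusive bounds (assumption)
    DMRs containing no probe are skipped, as in the legend.
    """
    values = []
    for chrom, start, end in dmrs:
        inside = [
            probe_betas[pid]
            for pid, (c, pos) in probe_pos.items()
            if c == chrom and start <= pos <= end and pid in probe_betas
        ]
        if inside:
            values.append(mean(inside))
    return values


def sample_score(probe_betas, probe_pos, dmrs):
    """Per-sample score: median of the DMR-level beta values."""
    return median(dmr_betas(probe_betas, probe_pos, dmrs))


def sens_spec(scores, is_tumour, cutoff=0.2):
    """Sensitivity (TPR) and specificity (1 - FPR) at a fixed cut-off.

    A sample is called tumour when score > cutoff (strictness is an assumption).
    """
    calls = [s > cutoff for s in scores]
    tp = sum(1 for c, t in zip(calls, is_tumour) if c and t)
    fn = sum(1 for c, t in zip(calls, is_tumour) if not c and t)
    tn = sum(1 for c, t in zip(calls, is_tumour) if not c and not t)
    fp = sum(1 for c, t in zip(calls, is_tumour) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, is_tumour):
    """AUC as the probability that a tumour outscores a normal sample.

    Ties count as half a win (Mann-Whitney formulation of the ROC area).
    """
    pos = [s for s, t in zip(scores, is_tumour) if t]
    neg = [s for s, t in zip(scores, is_tumour) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In practice the per-sample scores would be computed with `sample_score` for every sample in a dataset and then passed, together with the tissue labels, to `sens_spec` and `auc`. The default cut-off of 0.2 mirrors the optimal cut-off reported in the legend.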