PLoS Comput Biol. 2014 Jul 10;10(7):e1003709. doi: 10.1371/journal.pcbi.1003709

Figure 2. Datasets and analysis strategy used.

A) Distribution of samples by dataset (LBC1, CC, ART, LBC2) and disease stage in cervical carcinogenesis. Datasets LBC1 and CC were used for training, i.e. the DNB algorithm was applied to these sets only, to infer candidate DNB modules. Datasets ART and LBC2 were used to test the predictions derived from the module scores obtained in the training data.

B) The overall analysis strategy was to use LBC1 and CC as training sets to infer candidate DNB modules across the 3 main stages of cervical carcinogenesis: normal, CIN2+ (cervical intraepithelial neoplasia of grade 2 or higher) and invasive cervical cancer, as shown. After computing the relevance scores of the inferred modules, which measure the strength of covariation in DNA methylation (DNAm), we identified as candidate DNB(s) the module(s) whose score reaches a maximum in the stage (CIN2+) preceding invasive cancer. Finally, for this DNB module, we computed its score in independent datasets profiling samples from a previously considered disease stage (i.e. LBC2 for CIN2+) or from other intermediate disease stages (e.g. ART for normal HPV−, normal HPV+, precursor CIN2+ HPV− and precursor CIN2+ HPV+ cells). The prediction is that the scores in the CIN2+ LBC2 samples should agree with those of the CIN2+ LBC1 samples, and that the scores in disease stages N(HPV+), pre-CIN2+(HPV−) and pre-CIN2+(HPV+) should be intermediate between those of N(HPV−) and CIN2+.
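
The selection and validation logic described in panel B can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a relevance score per module and stage has already been computed by some means, and the function names (`pick_candidate_dnb`, `is_intermediate`), the stage labels, and the "largest margin" tie-breaking rule are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional

# Hypothetical labels for the three main training stages.
STAGES = ("normal", "CIN2+", "cancer")


def pick_candidate_dnb(
    scores: Dict[str, Dict[str, float]], peak_stage: str = "CIN2+"
) -> Optional[str]:
    """Return the module whose score peaks at `peak_stage`.

    `scores` maps module name -> {stage -> precomputed relevance score}.
    Among modules whose score is strictly highest at `peak_stage`, the one
    with the largest margin over its other stages is returned (an assumed
    tie-breaking rule). Returns None if no module peaks at `peak_stage`.
    """
    best, best_margin = None, 0.0
    for module, by_stage in scores.items():
        others = [by_stage[s] for s in STAGES if s != peak_stage]
        margin = by_stage[peak_stage] - max(others)
        if margin > best_margin:
            best, best_margin = module, margin
    return best


def is_intermediate(value: float, end_a: float, end_b: float) -> bool:
    """Check that `value` lies between the two endpoint scores (inclusive)."""
    low, high = sorted((end_a, end_b))
    return low <= value <= high


# Toy numbers, invented purely to exercise the functions.
toy_scores = {
    "module_1": {"normal": 0.1, "CIN2+": 0.9, "cancer": 0.3},
    "module_2": {"normal": 0.2, "CIN2+": 0.4, "cancer": 0.8},
}
candidate = pick_candidate_dnb(toy_scores)
```

In this toy input, only `module_1` has its highest score at CIN2+, so it is the one selected. The `is_intermediate` helper mirrors the stated prediction that scores for N(HPV+), pre-CIN2+(HPV−) and pre-CIN2+(HPV+) should fall between those of N(HPV−) and CIN2+.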