2016 May 23;8:60. doi: 10.1186/s13148-016-0227-0

Fig. 4.


Pattern of differences between the DNA methylation measurements obtained with the LINE-1 and the LUMA based bioassays. Top: distribution of the differences observed in a total of n = 238 samples. Single differences are shown as colored dots matching the three data subsets (Table 1). The distribution is presented as a probability density function (PDF), estimated by means of the Pareto density estimation (PDE [63]; black line). A Gaussian mixture model (GMM; Eq. 1) with M = 3 components (blue dotted lines) was fit to the data (red line). The Bayesian boundaries between the three Gaussians are indicated as magenta vertical lines. Middle: mosaic plot showing the unequal distribution (χ² test: p < 2.2 × 10⁻¹⁶) of the data-subset-specific interassay differences (ordinate) among the three Gaussians (abscissa). The width of each cell is proportional to the number of measurements it comprises. Bottom: decision tree showing the hierarchical criteria for assigning an interassay difference to a Gaussian-based group according to the originating tissue, i.e., data subset. The derived algorithm associated the majority of data from MCF7 cells, SHSY5Y cells, or blood cells with different Gaussians, in the following form: “If the analyzed tissue consists not of cell lines (MCF7, SHSY5Y), then the LINE-1-LUMA differences belong to Gaussian 3 (counted from left to right; refer to Fig. 4), and else, if the cell line is MCF7, then the differences belong to Gaussian 1, else they belong to Gaussian 2.” The model provided correct assignment at a cross-validated accuracy of 83.6 %. The three numbers in the middle of each node display the proportions of single interassay differences in that node that actually belonged to Gaussian #1, #2, or #3. At the bottom of each node is the percentage of all data belonging to this node (rounded to integer). The tree was plotted using the “fancyRpartPlot” function of the R package “rattle” (G. Williams; https://cran.r-project.org/web/packages/rattle/index.html [92]).
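The rule quoted in the caption, and the idea of a Bayesian boundary between mixture components, can be illustrated with a short Python sketch. This is not the authors' R code. The function names are hypothetical, the mixture is assumed to have the standard weighted-sum-of-Gaussians form, and the parameter values at the bottom are invented placeholders for illustration only, not the fitted values behind Fig. 4.

```python
import math

# Cell lines named in the caption; any other tissue (e.g. blood) is "not a cell line".
CELL_LINES = {"MCF7", "SHSY5Y"}


def assign_gaussian_by_tissue(tissue: str) -> int:
    """Apply the decision rule quoted in the caption.

    Returns the Gaussian index (1..3, counted from left to right).
    """
    if tissue not in CELL_LINES:
        return 3  # not a cell line -> Gaussian 3
    if tissue == "MCF7":
        return 1  # MCF7 -> Gaussian 1
    return 2      # remaining cell line (SHSY5Y) -> Gaussian 2


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def gmm_pdf(x, weights, means, sds) -> float:
    """Mixture density, assumed here to be sum_i w_i * N(x | m_i, s_i)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def bayes_assign(x, weights, means, sds) -> int:
    """Assign x to the component with the largest weighted density.

    This is equivalent to picking the highest posterior probability; the
    points where the winning component changes are the Bayesian boundaries.
    """
    scores = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    return scores.index(max(scores)) + 1  # 1-based, left to right


# HYPOTHETICAL parameters, chosen only to make the example runnable.
HYPOTHETICAL_WEIGHTS = [0.3, 0.3, 0.4]
HYPOTHETICAL_MEANS = [-10.0, 0.0, 10.0]
HYPOTHETICAL_SDS = [2.0, 2.0, 2.0]

if __name__ == "__main__":
    for tissue in ("MCF7", "SHSY5Y", "blood"):
        print(tissue, "-> Gaussian", assign_gaussian_by_tissue(tissue))
    for diff in (-9.0, 0.5, 12.0):
        group = bayes_assign(diff, HYPOTHETICAL_WEIGHTS,
                             HYPOTHETICAL_MEANS, HYPOTHETICAL_SDS)
        print(diff, "-> Gaussian", group)
```

The tissue-based rule needs no fitted parameters, which is why it can be stated as a plain if/else chain. The posterior-based assignment, by contrast, depends entirely on the fitted weights, means, and standard deviations, so real use would require the values estimated from the data.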