PLOS Computational Biology. 2021 Sep 17;17(9):e1009327. doi: 10.1371/journal.pcbi.1009327

Hierarchy and control of ageing-related methylation networks

Gergely Palla 1,2, Péter Pollner 1,2,*, Judit Börcsök 3, András Major 4, Béla Molnár 5, István Csabai 4
Editor: Ilya Ioshikhes 6
PMCID: PMC8480875  PMID: 34534207

Abstract

DNA methylation provides one of the most widely studied biomarkers of ageing. Since the methylation of CpG dinucleotides functions as a switch in cellular mechanisms, it is plausible to assume that age may be tuned by proper adjustment of these switches. However, adjusting hundreds of CpG methylation levels coherently may never be feasible, and changing just a few positions may lead to a biologically unstable state.

A prominent example of methylation-based age estimators is provided by Horvath’s clock, based on 353 CpG dinucleotides, showing a high correlation (not necessarily causation) with chronological age across multiple tissue types. On this small subset of CpG dinucleotides we demonstrate how the adjustment of one methylation level leads to a cascade of changes at other sites. Among the studied subset, we locate the most important CpGs (and related genes) that may have a large influence on the rest of the sub-system. According to our analysis, the structure of this network is far more hierarchical than one would expect based on ensembles of uncorrelated connections. Therefore, only a handful of CpGs are enough to modify the system towards a desired state.

When propagation of the change over the network is taken into account, the resulting modification in the predicted age can be significantly larger compared to the effect of isolated CpG perturbations. By adjusting the most influential single CpG site and following the propagation of methylation level changes we can reach up to 5.74 years in virtual age reduction, significantly larger than without taking the network control into account. Extending our approach to the whole methylation network may identify key nodes that have a controller role in the ageing process.

Author summary

Ageing affects all living organisms. In humans, the chronological age correlates with the methylation level of some locations of the DNA. Here we extract an interaction network between these ageing related sites, which shows signs of hierarchical organisation. In addition, modifications in the methylation of single sites of the DNA can impose cascades of changes at other sites over this network. Based on “gedanken-experiments” in a small subset of CpG sites we show that by tuning appropriately selected methylation levels the estimated biological age can be changed. When modifying the most influential locations, the resulting cascades of changes can set back the estimated biological age by more than 5 years. Our study also shows that compared to single site methylation perturbations, the propagation of the change over the interaction network leads to methylation change profiles which are more aligned with the natural direction of ageing in a high dimensional representation of the methylation levels.

Introduction

An ancient desire of humanity is to understand, slow, or even halt and reverse ageing. In related studies it was soon realised that certain biomarkers can rather precisely predict the functional capability of tissues, organs and even patients [1, 2]. In addition, age-related biomarkers enable the introduction of biological age [3, 4], bringing additive information in the risk assessments for age-related conditions on top of chronological age. Individuals of the same chronological age can still show great heterogeneity in the tissue and organismal functions, and thus, could possess different risks for age-associated diseases as judged from their biological ages. However, the predictive value of biological age estimators is usually decreasing at old age due to the increased biological heterogeneity among elderly individuals [3].

Probably the most promising age-predictive biomarkers are the ones based on DNA-methylation [5–7], which can be used for basically any source of DNA from sorted cells through tissues to organs, and can predict the biological age across the whole life span from prenatal tissues to tissues obtained from centenarians [5]. DNA methylation-related markers are also important in endocrinology [8], cell biology [9], biodemography [10], lifestyle factors [11], and medicine [12].

The research of DNA methylation dates back to the 1960s, when it was first observed that the methylation level of the CpG dinucleotides in the DNA is changing genome-wide with the chronological age [13, 14]. Later on, thanks to the developments in methylation array technologies, specific CpG dinucleotides were located in the genome, based on which the age of the DNA source (e.g., a tissue, an organ, or a person) can be estimated [15–22]. In fact, age related change in the DNA methylome turns out to be quite common, with up to 15–30% of all CpG sites displaying changes of certain types related to ageing. The change involved can occur in a random fashion due to epigenomic drift [23], be directional, or show increased variability with age [24]. The research on DNA-methylation related biomarkers has been a huge success [25], with still some important challenges remaining, such as the dissection of the regulators and drivers of age-related changes in single-cell, tissue- and disease-specific models, the analysis of further epigenomic marks, the implementation of longitudinal and diverse population studies, and the exploration of non-human models [26].

In general, when the goal is to estimate the age based on the methylome, the usual framework couples a set of CpGs with a mathematical algorithm, where the observed methylation levels of the CpG dinucleotides are combined in some way to yield the estimated age in years [25]. The obtained estimated age is referred to as DNAm age, or epigenetic age, which is highly correlated with the chronological age, but is also affected by other biological factors such as the health status. The above mentioned DNA methylation-based age estimators are usually built using supervised machine learning techniques such as penalized regression models, which automatically select the most informative CpGs for the age estimation [25]. The first DNA methylation-based age estimators in the scientific literature were concentrating on a single tissue [27, 28], and therefore, were tailor-made for just one type of DNA source, leading to biased estimates for other tissues. However, the construction of multi-tissue DNA methylation-based age estimators is non-trivial, due to the significant differences between the DNA methylation patterns among different tissues [29, 30], the specific ways in which the DNA methylation patterns change with age across the different cell types [17, 31], the fact that distinct biological processes drive the observed age-related hypermethylation and hypomethylation, and that the baseline DNA methylation state is strongly driven by genetics, being highly CpG density dependent [32].

Nevertheless, after the discovery of CpGs that show age related changes in a diverse range of tissues [20], Steve Horvath proposed the first multi-tissue DNA methylation-based age estimator, which is often referred to as Horvath’s clock [5]. This estimator is based on elastic net regression, and was trained and validated using 8,000 microarray samples from over 30 different tissue and cell types collected from patients of age ranging from children to adult, selecting altogether N = 353 CpGs from the overall 27k CpG dinucleotides in the data. According to the results of the elastic net regression method, by taking the measured methylation levels mi(q) for patient q at the i-th CpG dinucleotide, the estimated age a(q) can be given as

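$$
a^{(q)} =
\begin{cases}
(a_t + 1)\sum_{i=1}^{N} H_i m_i^{(q)} + a_t & \text{if } \sum_{i=1}^{N} H_i m_i^{(q)} \geq 0, \\[4pt]
\exp\left(\sum_{i=1}^{N} H_i m_i^{(q)} + \ln(a_t + 1)\right) - 1 & \text{if } \sum_{i=1}^{N} H_i m_i^{(q)} < 0,
\end{cases}
\tag{1}
$$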

where the adult age threshold is at = 20, and Hi is the coefficient of the i-th CpG dinucleotide. Over the years, Horvath’s clock turned out to be a very successful age estimator, providing accurate results using various DNA sources across the entire human lifespan [25], although some caveats still remain [33, 34]. In a recent study, together with similar clocks derived by Hannum et al. [28], Levine et al. [35] and Lu et al. [36], Horvath’s clock was used to measure the impact of a protocol intended to regenerate the thymus, where the mean epigenetic age was 1.5 years less than baseline after 1 year of treatment [37].
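As a minimal sketch of how Eq (1) maps a methylation profile to an age, the two branches can be written as a small Python function. The function and argument names are illustrative choices and do not come from the paper; the code follows Eq (1) as written, with at = 20.

```python
import math

ADULT_AGE = 20.0  # a_t in Eq (1)


def dnam_age(weighted_sum, adult_age=ADULT_AGE):
    """Map the weighted methylation sum of Eq (1) to an age in years."""
    if weighted_sum >= 0:
        # Linear branch, used at or above the adult age threshold.
        return (adult_age + 1.0) * weighted_sum + adult_age
    # Exponential branch, used below the adult age threshold.
    return math.exp(weighted_sum + math.log(adult_age + 1.0)) - 1.0


def estimate_age(coefficients, methylation, adult_age=ADULT_AGE):
    """Compute a(q) from the coefficients H_i and the methylation levels m_i(q)."""
    weighted_sum = sum(h * m for h, m in zip(coefficients, methylation))
    return dnam_age(weighted_sum, adult_age)
```

Both branches give the value at when the weighted sum is zero, so the estimate is continuous at the adult age threshold.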

The estimation of age based on DNA methylation across multiple tissues is indeed a complex problem. E.g., in case of Horvath’s clock, for 193 of the included CpGs the methylation state is positively correlated with age, whereas for the remaining 160 CpGs we can observe a negative correlation [5]. Furthermore, when considered individually, the methylation state for most of the CpGs is only weakly correlated with age, e.g., the strongest and most robust individual CpG pan-tissue changes from the ELOVL2 locus [38, 39] were not included in the clock. Thus, the CpGs of the clock were selected not for their individual strength, but rather for their power to work collectively to parsimoniously capture ageing over the life-course [25]. This also means that we cannot really point out any of these CpGs as being more important than others for measuring the molecular age. These properties are fully consistent with the way the elastic net regression method selects among the feature variables by penalising the coefficients both in quadratic and absolute forms. In this approach, by combining the penalty on the summed absolute value of the coefficients from Lasso regression (which is known to turn the coefficients of unimportant variables to zero) with the quadratic penalty on the coefficients from Ridge regression, we obtain a convex loss function with a unique minimum, making the selection more stable.
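For reference, one common way of writing the combined penalty described above is the following objective. The response y_q, the intercept H_0 and the penalty strengths λ1 and λ2 are notation introduced here for illustration and do not appear in the original text.

```latex
\hat{H} = \arg\min_{H_0, H}\;
\sum_{q} \Bigl( y_q - H_0 - \sum_{i} H_i\, m_i^{(q)} \Bigr)^{2}
+ \lambda_1 \sum_{i} |H_i|
+ \lambda_2 \sum_{i} H_i^{2}
```

Setting λ2 = 0 recovers Lasso regression and setting λ1 = 0 recovers Ridge regression; for λ2 > 0 the objective is strictly convex in the coefficients Hi.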

When considering the modifications to the methylation profile during the life-course, the known ageing effects lead to coordinated changes across the entire DNA methylome, including those driven by cell-type specific epigenomics, where changes in cell proportion will lead to variation (including the age-related myeloid skew [40] and T cell exhaustion [41]), polycomb target hypermethylation [20], bivalent domain hypermethylation [42], etc. Such systemic effects can be seen as networks of age-related change, where the methylation level change of any CpG is accompanied by changes in the levels of other, related CpGs as well.

In the present paper we examine this network of connections between the CpG dinucleotides of Horvath’s clock. The tool we use to reveal the links is Lasso cross-validation (Lasso-CV) regression, which is a simple and robust approach to extract the most relevant interrelations between a given outcome (or response variable) and a larger set of regressors (or feature variables). Due to the complex nature of the problem, instead of focusing solely on the revealed pairwise interactions, we gather the obtained connections (links) into a methylation network, and analyse the properties of the system using techniques from complex network theory.

The network approach for studying the structure and dynamics of complex systems has become ubiquitous over the last two decades, and studies of networks ranging from gene interactions to the level of the society have shown that the statistical analysis of the underlying graph structure can highlight non-trivial properties and reveal previously unseen relations [43–46]. In the present work our network analysis is focusing on the hierarchical and control properties of the web of connections between the CpG dinucleotides.

Signs of hierarchical organization were observed in complex networks of diverse types [47], ranging from transcriptional regulatory networks within cells [48], through flocks of various animal species [49, 50], to the level of on-line news content [51], scientific journals [52], social interactions [53–55], ecological systems [56], and evolution [57]. In a hierarchical network, nodes close to the top of the hierarchy usually have a larger influence compared to nodes at the bottom levels. One of the related questions we address in this paper is whether we can detect signs of hierarchy in case of the methylation network too, and if so, which CpGs are on the top of the hierarchy?

Besides the hierarchical properties, another important aspect we investigate is given by the control properties of the network. The control theory of networks is based on the framework of structural controllability of linear dynamical systems [58], exploiting the connections between graph combinatorics and linear algebra. The importance of the nodes from the point of view of control can be characterised by the control centrality [59], corresponding to the number of nodes we can drive by controlling the given node using an external signal. By combining control centrality with the results from the hierarchy analysis, we can locate the CpGs having the largest influence on the rest of the system, which may also play a crucial role in the process of ageing.

In close relation to the above, we also introduce a simple framework, in which we can examine the effect of perturbing the methylation level of CpGs on the DNAm age obtained according to Horvath’s clock. The basic idea is to initiate a small change in the methylation of one CpG, and then propagate the effect over the methylation network according to the regression coefficients defining the link weights. This provides a minimal model for tracking the changes in the methylation profile, in which the complex interrelations between the CpGs are taken into account. In our view, treating the set of ageing-related CpGs as a complex, interacting system can provide more realistic change profiles in DNA methylation compared to isolated individual shifts in the methylation of single CpGs. Nevertheless, there can still be differences between the obtained shifts in the DNAm age depending on which CpG was chosen for the initiating perturbation, and naturally, one of the most interesting questions is for which initiating CpG do the methylation changes accumulate in such a way that the resulting shift in the DNAm age is maximal. A flow chart summarising the investigations carried out in the paper is given in Fig 1.

Fig 1. Flow chart of our analysis.

a) Our study is based on cytosine methylation, a phenomenon where a methyl group is attached to a CpG dinucleotide in the DNA. b) We focus on the methylation level of the 353 CpG dinucleotides appearing in Horvath’s clock, using the data from Ref. [28], listing altogether 656 patients. c) By plugging in the methylation levels of a given patient into Horvath’s clock, we obtain the DNAm age, which is in strong correlation with the chronological age, but is also affected by e.g., the health status. d) Using Lasso-regression, we construct a methylation network between the CpG dinucleotides. To search for key influential nodes in the system we analyse the hierarchical (panel e) and control properties (panel f) of the network. g) In addition, we also investigate how the change of the methylation levels would affect the estimated age when the perturbations are transmitted over the methylation network.

An important caveat to keep in mind when interpreting our results is that we demonstrate the hierarchy, control and perturbation of the methylation network only on a small, somewhat ‘artificially’ isolated subset of the orders of magnitude larger set of all CpG dinucleotides of the human genome. To uncover biologically relevant factors, the presented network control analysis should be extended to more sites and possibly to other mechanisms beyond DNA methylation.

Results

Methylation network based on Lasso-CV regression

We constructed the methylation network between the 353 CpG dinucleotides of Horvath’s clock by repeatedly applying Lasso-CV regression for predicting the methylation level of one of the CpG dinucleotides based on the methylation level of the 352 other positions in the data set from Ref. [28] (more details on the data set are given in Methods). In this approach, the methylation level mj of CpG dinucleotide j can be modelled as

$$
m_j = \sum_{\substack{i=1 \\ i \neq j}}^{N} \beta_{ji}\, m_i + \beta_0, \tag{2}
$$

where βji denote the Lasso regression coefficients (with β0 corresponding to the intercept). According to this model, the weight (or strength) wij of a directed link from CpG dinucleotide i pointing to CpG dinucleotide j is given by wij ≡ |βji|.

The advantage of Lasso regression is that due to a factor α penalizing the L1-norm of the vector of the regression coefficients, a fraction of the coefficients becomes exactly zero, resulting in a model that is easier to interpret. For choosing the right value of α at each CpG dinucleotide, we used the well known method of cross-validation (more details on the applied Lasso-CV method are given in Methods). In order to make the resulting network sparse, we further applied a weight threshold w* on the link weights wij, and neglected connections where wij was smaller than w*. The optimal value of w* was set according to a general link weight thresholding method for biological networks based on the concept of efficiency [60] (more details of this approach are given in Methods).
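The step from regression coefficients to network links can be sketched as follows. This is only an illustration under stated assumptions: the coefficient matrix `beta` is taken as given (in practice it could come from one Lasso fit per CpG, for instance with scikit-learn's `LassoCV`), the value of the threshold is supplied by the caller rather than derived from the efficiency-based method, and the function name is hypothetical.

```python
def build_network(beta, threshold):
    """Return directed weighted links {(i, j): w_ij} with w_ij = |beta[j][i]|.

    beta[j][i] is the coefficient of regressor i in the model of target j,
    so a non-zero entry defines a link pointing from i to j.
    Links with weight below `threshold` (w*) are dropped; self-links are skipped.
    """
    links = {}
    n = len(beta)
    for j in range(n):
        for i in range(n):
            if i == j:
                continue
            weight = abs(beta[j][i])
            if weight > 0.0 and weight >= threshold:
                links[(i, j)] = weight
    return links
```

For example, with `beta = [[0, 0.2, 0], [-0.5, 0, 0.01], [0, 0, 0]]` and a threshold of 0.05, the links 1 → 0 (weight 0.2) and 0 → 1 (weight 0.5) are kept, while the weak link 2 → 1 is neglected.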

Hierarchical properties according to m-reach

In order to study the hierarchical properties of the methylation network we use the concept of the Global Reaching Centrality [61], a hierarchy measure that has turned out to be successful in quantifying the extent of hierarchy in the structure of directed networks [52, 55, 61]. The basic idea behind this approach is that in a hierarchical network it should be easy for leaders (at the top of the hierarchy) to send orders or instructions to other nodes over relatively short paths, whereas this is not the case for nodes at the bottom of the hierarchy. Therefore, by comparing the fraction of nodes reachable in at most m steps from a given node (called the m-reach of the node) we can judge which nodes should be positioned high in the hierarchy, and which nodes are supposed to be at the lower levels.

In addition, based on the inhomogeneity of the m-reach distribution we can also quantify the extent of hierarchy in the organisation of the network. Technically this is done by using the Global Reaching Centrality denoted by GRC(m), corresponding to the average difference between the individual m-reach of the nodes and the maximal m-reach value in the network (more details are given in Methods). In Fig 2 we show the GRC(m) at m-values ranging from m = 2 to m = 5, where the value observed in the methylation network is compared to the empirical probability density of the GRC(m) values measured in random directed networks with the same degree sequence. According to the results, the GRC(m) distribution for the random network ensemble shows a bell shaped curve with a well defined average for all m parameter values. The GRC(m) measured for the original methylation network is far larger than this average, and the relative difference (measured in the units of the standard deviation σ) seems to be increasing with m. This means that the organisation of the network is far more hierarchical compared to what we would expect in a random network with the same degree distribution.
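The m-reach and the GRC(m) described above can be sketched with a breadth-first search limited to m steps. The normalisation by N − 1 in both functions follows the usual definition of the Global Reaching Centrality and is an assumption here, since the exact formula is deferred to Methods; the function names and the adjacency-dictionary input format are illustrative.

```python
from collections import deque


def m_reach(adjacency, source, m):
    """Fraction of the other nodes reachable from `source` in at most m steps.

    `adjacency` maps every node to the list of its out-neighbours.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == m:
            continue  # do not expand beyond m steps
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return (len(dist) - 1) / (len(adjacency) - 1)


def global_reaching_centrality(adjacency, m):
    """Summed gap to the maximal m-reach, divided by N - 1."""
    reaches = [m_reach(adjacency, node, m) for node in adjacency]
    top = max(reaches)
    return sum(top - r for r in reaches) / (len(reaches) - 1)
```

On a directed star, where one centre points to all other nodes, the function returns 1, whereas on a directed cycle with m at least N − 1 every node has the same m-reach and the result is 0.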

Fig 2. Hierarchy of the methylation network.

We show the GRC(m) measured for the network (red) together with probability density ρ(GRC) of the corresponding values in a link randomised ensemble of 20,000 networks (blue) at m = 2 (panel a), m = 3 (panel b), m = 4 (panel c), and m = 5 (panel d).

Based on the m-reach, following the method suggested in Ref. [61], we can also apply a hierarchical layout to the network, as shown in Fig 3. Since higher m-reach means larger influence and therefore, higher position in the hierarchy, we can sort the nodes into levels according to their m-reach. In parallel, nodes in the same level are intuitively expected to have a similar influence, thus, it is quite natural to define the levels by grouping the nodes in such a way that the difference between the m-reach for any two members of a given level is below a certain threshold. The considerable number of links pointing upwards in the hierarchy highlights that the network is hierarchical in a non-trivial manner, i.e., its structure is far from a tree or a directed acyclic graph. However, at the same time, the m-reach of the top nodes is still far larger compared to the average.

Fig 3. Hierarchical layout of the network.

The nodes are sorted into the levels according to their m-reach at m = 3, where the levels are defined such that the difference between the m-reach for any pair of nodes in the same level is less than one-tenth of the standard deviation of the m-reach distribution. The shade and size indicate the m-reach, and the panel on the right shows the top of the hierarchy zoomed in. The m-reach of the two nodes at the top layer (CATSPERG and PAPOLG) is close to rm = 0.95 at m = 3, thus, they can reach about 95% of the other nodes in at most 3 steps.
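One simple way to realise the level definition above is a greedy pass over the nodes sorted by decreasing m-reach. This is a sketch of one possible grouping that satisfies the stated rule, not necessarily the procedure used for Fig 3; the function name and the use of the population standard deviation are assumptions.

```python
import statistics


def hierarchy_levels(reach, fraction=0.1):
    """Group nodes into levels (top level first) from a {node: m-reach} mapping.

    Within a level, any two m-reach values differ by less than
    `fraction` times the standard deviation of all m-reach values.
    """
    threshold = fraction * statistics.pstdev(list(reach.values()))
    ordered = sorted(reach, key=reach.get, reverse=True)
    levels = []
    for node in ordered:
        if levels:
            gap = reach[levels[-1][0]] - reach[node]
            # Stay in the current level while close enough to its top member.
            if gap < threshold or gap == 0:
                levels[-1].append(node)
                continue
        levels.append([node])
    return levels
```

Because the nodes are processed in decreasing order and each node is compared with the first (highest) member of the current level, every pair inside a level respects the threshold.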

Control centrality analysis

The control properties of the network can also be of high interest. Here we study the relative control centrality c(i), which for any node i is given by the maximal fraction of nodes we can drive by controlling i. (More details are given in Methods). In Fig 4 we provide a scatter plot of the m-reach at m = 3 and c(i). The top panel of the figure displays the probability density ρ(c) for c, consisting of basically two narrow peaks for the methylation network at the optimal link weight threshold, shown by blue colour. In order to allow a finer distinction between the control abilities of the nodes, we calculated the control centrality for a larger number of networks obtained at link weight thresholds w* different from the optimal one, ranging in an interval between w* = 0.04 and w* = 0.1. By taking the average of c(i) for a given node i over these networks we obtain a quantity characterising its potential for driving other nodes in not a single instance of the methylation network, but rather, across the entire collection of networks. According to the top panel of Fig 4, the probability density ρ(c) for the averaged control centrality (orange colour) has a far more complex structure compared to the ρ(c) of the methylation network at the optimal w*, and therefore, allows a finer ranking between the nodes.

Fig 4. Control centrality and reach.

The main panel shows the reaching centrality rm at m = 3 as a function of the relative control centrality c. Each symbol in the plot corresponds to an individual CpG dinucleotide (node in the methylation network). In blue we show the results for the methylation network at the optimal weight threshold w*, whereas in case of the orange symbols c(i) was averaged for the individual nodes over 60 different networks obtained by changing the w* parameter in the [w* = 0.04, w* = 0.1] interval. The Pearson correlation coefficient between 〈c〉 and rm is 0.41 for the optimal network, and 0.70 in case of the averaging scenario. The top panel displays the density of the normalised control centrality c for the two cases.

Interestingly, the nodes that are important from the point of view of control, are usually also in high position in the hierarchy, as indicated by the scatter plot in the main panel of Fig 4. In case of the network at the optimal weight threshold w*, the nodes with zero control centrality have also zero m-reach (set of blue points at the origin), whereas rm for nodes with high c value can range practically between zero and one (set of blue points forming a vertical line on the right). In case of the c averaged over the networks obtained at different weight thresholds (orange colour), the bulk of the point cloud shows an increasing tendency, whereas we can also see a narrow line of points with zero reach in the low c regime.

Modifying the predicted age by perturbing the methylation network

A natural question arising during the analysis of the methylation network is how the perturbations of the methylation levels affect the estimated age, and which are the CpG dinucleotides where the observed sensitivity in the predicted age is maximal for small perturbations. In the simplest scenario we can consider perturbing the methylation levels of the individual CpG dinucleotides one by one, without taking into account any possible propagation of such effect over the methylation network. Since the vast majority of the patients in the used data are older than the adult age threshold at appearing in Eq (1), for simplicity let us assume that the estimated age is calculated according to the first equation in (1), which is linear in both Hi and mi. Thus, if we can change one and only one mi value, then the largest effect for a unit change in mi is expected for the CpG dinucleotide i where |da/dmi| = (at + 1)|Hi| is maximal.

However, perturbing the methylation level of a single CpG is quite likely to induce changes in the methylation of other CpGs as well. A living tissue reacts to outside perturbations, and these reactions plausibly affect the methylation of the related CpGs. Since the links of the methylation network encapsulate the most relevant linear connections between the methylation levels, we can use them to examine the expected propagation of the change in the overall methylation pattern. To keep our framework simple, we assume that initially the methylation level of a single CpG dinucleotide is changed as mi → mi + Δmi due to some outside perturbation, and that this change then triggers further modifications in the methylation of other CpG dinucleotides. In general, we may take into account chains of interactions up to some maximal length ℓmax. When ℓmax = 0, we have only a single node perturbation; ℓmax = 1 corresponds to taking into account the first neighbour interactions, ℓmax = 2 also includes the 2nd neighbour interactions, etc.

According to Eq (2), a Δmi perturbation of the methylation level of CpG dinucleotide i induces a change in the methylation level of node j given by Δmj = βjiΔmi. Therefore, if we take into account first neighbour interactions (ℓmax = 1), for the derivative of the estimated age with respect to mi we can write

$$
\frac{1}{a_t + 1}\frac{da}{dm_i} = H_i + \sum_{j \in L_i^{\mathrm{out}}} H_j \beta_{ji}, \tag{3}
$$

where L_i^out denotes the set of out-neighbours of i. When including longer chains of interactions in the network (ℓmax > 1), due to the linear nature of the problem, the only modification to the above is that we have to multiply the β coefficients along the considered paths, yielding

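$$
\begin{aligned}
\frac{1}{a_t + 1}\frac{da}{dm_i}
&= H_i
+ \underbrace{\sum_{j} H_j \beta_{ji}}_{\text{1st neighs.}}
+ \underbrace{\sum_{j}\sum_{k} H_j \beta_{jk}\beta_{ki}}_{\text{2nd neighs.}}
+ \underbrace{\sum_{j}\sum_{k}\sum_{s} H_j \beta_{js}\beta_{sk}\beta_{ki}}_{\text{3rd neighs.}}
+ \cdots \\
&= H_i + \sum_{j} H_j \Bigl( \beta_{ji} + \sum_{k} \beta_{jk}\beta_{ki} + \sum_{k}\sum_{s} \beta_{js}\beta_{sk}\beta_{ki} + \cdots \Bigr) \\
&= H_i + \sum_{j} H_j \underbrace{\sum_{u \in D(i,j)} \prod_{q=i}^{\ell(u)+i} \beta_{q+1,q}}_{\text{effective } \beta_{ji}},
\end{aligned}
\tag{4}
$$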

where D(i,j) denotes the set of allowed paths between i and j, the length of a path u ∈ D(i,j) is given by ℓ(u), and the term ∏_{q=i}^{ℓ(u)+i} β_{q+1,q} is simply the product of the Lasso regression coefficients along u. The result has a form very similar to (3), and can be interpreted as the longer chains of interactions introducing an effective βji, corresponding to the sum of the products of the original β values along the considered paths between i and j.
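The path sum in Eq (4) can be sketched as a repeated matrix-vector product. A loud assumption is made here: all walks up to the maximal chain length are treated as allowed paths, since the precise definition of D(i,j) is not given in this section. The function name and the dense list-of-lists layout of the coefficients are illustrative.

```python
def age_derivative(H, beta, i, l_max, adult_age=20.0):
    """da/dm_i following Eq (4), truncated at chains of length l_max.

    beta[j][k] is the coefficient of CpG k in the model of CpG j (link k -> j).
    Assumption: every walk of length <= l_max counts as an allowed path.
    """
    n = len(H)
    # reach[j]: summed coefficient products over walks of the current length from i to j
    reach = [0.0] * n
    reach[i] = 1.0
    total = H[i]  # l_max = 0 term: the isolated perturbation
    for _ in range(l_max):
        reach = [sum(beta[j][k] * reach[k] for k in range(n)) for j in range(n)]
        total += sum(H[j] * reach[j] for j in range(n))
    return (adult_age + 1.0) * total
```

For a three-node chain 0 → 1 → 2 with coefficients 0.5 and 2, and Hi = 1 for all nodes, perturbing node 0 gives 21 × 1 without propagation, 21 × 1.5 with first neighbours, and 21 × 2.5 once the second neighbour is included.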

Let us now move from the derivative of the estimated age to the expected actual change in a, which is given by the product of the age derivative given in (4), and Δmi as

$$
\Delta a = \frac{da}{dm_i}\,\Delta m_i. \tag{5}
$$

In order to keep the framework realistic, let us assume that Δmi is of the order of magnitude given by the deviations we see in the data, thus, we set Δmi ≡ 2〈σ(m)〉, where 〈σ(m)〉 denotes the average of the standard deviation of the methylation levels among the different CpGs.
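The choice of the perturbation size and the resulting age shift of Eq (5) can be sketched as below. Whether the population or the sample standard deviation is meant is not stated, so the population form is an assumption, as are the function names and the row-per-patient data layout.

```python
import statistics


def perturbation_size(samples):
    """Return 2 * <sigma(m)>: twice the per-CpG standard deviation averaged over CpGs.

    samples[q][i] is the methylation level of CpG i in patient q.
    Assumption: the population standard deviation is used.
    """
    n_cpg = len(samples[0])
    sigmas = [statistics.pstdev([row[i] for row in samples]) for i in range(n_cpg)]
    return 2.0 * statistics.mean(sigmas)


def age_shift(derivative, delta_m):
    """Eq (5): expected change in the estimated age for a perturbation delta_m."""
    return derivative * delta_m
```

The same Δmi is then applied to every CpG, so differences in Δa between nodes come only from the age derivative.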

We have calculated the derivative of the estimated age and the corresponding Δa for each node in the methylation network using ℓmax parameters between ℓmax = 0 and ℓmax = 4 (the distribution of |Δa| is shown in Fig A in S1 Text). The average for |Δa| was observed to be 〈|Δa|〉 = 0.40 for the ℓmax = 0 case, and 〈|Δa|〉 ≃ 0.63 for the ℓmax = 4 scenario, meaning that the perturbation of the individual methylation levels has a larger effect on the estimated age if the chains of interactions between the CpGs are taken into account.

Besides leading to a larger effect in the change of the estimated age, the scenario where we propagate the perturbation of the methylation levels over the network seems to induce change patterns that are more aligned with plausible changes in the methylation profiles due to ageing. To illustrate this, we define a high dimensional representation of the methylation profiles given by a Euclidean space where each dimension is corresponding to a CpG dinucleotide, and the methylation levels define the coordinates according to the corresponding axes. In our study the complete ‘methylation space’ is narrowed down to 353 dimensions, corresponding to the CpGs in Horvath’s clock selected by elastic net regression, and the patients in the data set form a cloud of 656 points in the space.

Let us denote the vector pointing from the origin to patient q as m(q), where the component i of m(q) is simply given by mi(q). We can also define a vector H for which the component i is equal to the coefficient Hi appearing in (1). This way, the estimated age a(q) for patients above the adult age threshold (top line in Eq (1)) can also be written as a(q) = (at + 1)m(q) ⋅ H + at, where m(q) ⋅ H is the scalar product (inner product) of the corresponding vectors. In Fig 5 we show a 2 dimensional projection of the point cloud of the patients, where the horizontal axis corresponds to the component of m(q) pointing in the direction of H (given by m(q) ⋅ H / |H|), and the vertical axis displays the distance d(q) from the centre of mass of the point cloud in the hyper-plane perpendicular to H. The colouring of the points indicates the chronological age of the patients, and according to the figure, we can observe a clear correspondence between the chronological age and the estimated age based on Horvath’s clock (which depends linearly on the horizontal coordinate of the point).
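The two coordinates used for the projection can be sketched in plain Python as follows. The function name is illustrative, and the perpendicular distance is computed by removing the component along H from the offset to the centre of mass, which is one straightforward reading of the description above.

```python
import math


def project_cloud(points, H):
    """Return (component along H, distance d from the centre of mass
    in the hyper-plane perpendicular to H) for each patient vector."""
    norm = math.sqrt(sum(h * h for h in H))
    unit = [h / norm for h in H]
    n = len(points)
    centre = [sum(p[i] for p in points) / n for i in range(len(H))]
    coords = []
    for p in points:
        along = sum(x * u for x, u in zip(p, unit))
        # Offset from the centre of mass, with its component along H removed.
        offset = [x - c for x, c in zip(p, centre)]
        offset_along = sum(o * u for o, u in zip(offset, unit))
        perp_sq = sum(o * o for o in offset) - offset_along ** 2
        coords.append((along, math.sqrt(max(perp_sq, 0.0))))
    return coords
```

The `max(..., 0.0)` guard only protects against tiny negative values caused by floating point rounding.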

Fig 5. The point cloud of patients in the space of methylation coordinates.

The horizontal axis shows the projection of the patient vectors m(q) onto H (corresponding to a vector composed of the coefficients of Horvath’s clock), whereas the vertical axis shows the distance d(q) from the centre of mass in the hyper-plane perpendicular to H. The colours represent the chronological age, with dark, colder shades indicating younger patients, and bright, warm shades signalling older patients. At the top of the figure we list the average age for the patients in the bins indicated by the dotted vertical lines. The left inset shows the displacement due to perturbing the methylation level of GAP43, with the purple node corresponding to the ℓmax = 0 case, and the green node indicating the ℓmax = 4 case. The right inset shows the analogous results for perturbing the methylation level of SCGN.

The perturbations of the methylation profile we considered earlier can also be interpreted as vectors pointing in a certain direction in the methylation space, e.g., perturbing the methylation level of just a single CpG dinucleotide can be represented by a vector pointing in the direction of the corresponding base vector. For example, in the insets of Fig 5 we show the results when modifying the methylation level of GAP43 (left inset) and of SCGN (right inset) by 2〈σ(m)〉 for two randomly chosen patients. The nodes marked by the red perimeter correspond to the chosen patients, and the purple nodes represent the new position when only the methylation level of GAP43 or SCGN is modified (the ℓmax = 0 case). According to the figure, the displacement for GAP43 (left inset) is rather small, and seems to be perpendicular to H, whereas in case of modifying the methylation level of SCGN (right inset), the displacement is larger, and is in good alignment with H. However, when we take into account the propagation of the perturbation over the methylation network, the length of the displacement for GAP43 is increased, and it is also much more aligned with H, as indicated by the green node in the left inset. In parallel, for SCGN the displacement remains aligned with H when switching from ℓmax = 0 to ℓmax = 4, and its length is nearly doubled (green node in right inset). Similar effects can be observed for the other CpGs as well, for example the angle between the direction of change obtained at ℓmax = 4 and H is smaller for 296 out of the 353 CpG dinucleotides compared to the angle between H and the corresponding single CpG change direction. These examples suggest that when the propagation of the perturbation over the methylation network is taken into account, the induced change is more aligned with the ‘natural direction of ageing’.
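The comparison of change directions with H can be sketched by first building the displacement vector of the methylation profile and then measuring its angle to H. As before, the sketch assumes that all walks up to the maximal chain length contribute to the propagation; the function names and the dense coefficient layout are illustrative.

```python
import math


def displacement(beta, i, delta_m, l_max):
    """Change in the methylation profile after perturbing CpG i by delta_m.

    beta[j][k] is the coefficient of CpG k in the model of CpG j (link k -> j).
    Assumption: every walk of length <= l_max propagates the perturbation.
    """
    n = len(beta)
    step = [0.0] * n
    step[i] = delta_m
    total = list(step)
    for _ in range(l_max):
        step = [sum(beta[j][k] * step[k] for k in range(n)) for j in range(n)]
        total = [t + s for t, s in zip(total, step)]
    return total


def angle_to(vector, H):
    """Angle in degrees between a displacement vector and H."""
    dot = sum(v * h for v, h in zip(vector, H))
    norms = math.sqrt(sum(v * v for v in vector)) * math.sqrt(sum(h * h for h in H))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norms))))
```

In a two-node toy case with a single link 0 → 1 of coefficient 1 and H = (0, 1), perturbing node 0 alone is perpendicular to H, while including the first neighbour reduces the angle to 45 degrees.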

In addition, let us also examine the relation between |Δa| and the previously analysed topological characteristics of the CpG dinucleotides. In Fig 6 we re-plot the hierarchy according to the m-reach (shown in Fig 3); this time, however, the size and colouring of the nodes indicate the expected change in the estimated age, |Δa|, if we perturb the methylation level of the given node according to (4) and (5) at max = 4. The nodes with higher |Δa| values tend to be placed higher in the hierarchy, although the hierarchy position varies strongly among CpGs with similar |Δa| values. E.g., BAZ2A, UCKL1 and AGBL5 from the top 5 nodes according to |Δa| are very high up in the hierarchy, whereas SCGN (having the largest |Δa| at max = 4 amongst all nodes) is at a relatively low level compared to them. Nevertheless, the position of a node in the hierarchy and its potential for inducing a large change in the estimated age are clearly interrelated.

Fig 6. Top levels of the hierarchy according to m-reach at m = 3.

The shading of the nodes indicates their estimated age reduction value |Δa| (with darker shades corresponding to higher values).

Finally, in Fig 7 we show the scatter plot of |Δa| at a maximal path length of max = 4 as a function of the m-reach rm and the average control centrality 〈c〉 of the node where the initial perturbation is applied. According to the figure, a moderate increasing tendency can be observed in the behaviour of |Δa|, which means that nodes with larger m-reach and/or higher control centrality are good candidates for achieving a notable change in the estimated age a when perturbing the methylation level of the corresponding CpG dinucleotide.

Fig 7. Scatter plot of expected change in the estimated age as a function of the m-reach and the control centrality.

The vertical axis corresponds to |Δa|, calculated according to (5), where the initial perturbation Δmi on the methylation level is equal to 2〈σ(m)〉 for all i, and the age derivative for node i is obtained from (4) at max = 4. On the x axis we display the m-reach rm at m = 3, whereas the y axis corresponds to the average control centrality 〈c〉. The colouring of the symbols follows their vertical coordinate, with bright colours corresponding to low |Δa| values, and darker shades representing a larger expected change in a.

Extension of the analysis to further methylation networks

Although the results shown so far indicate that the methylation network of Horvath’s clock displays interesting hierarchical and control properties, a question of key importance is how these features generalise to different sets of CpGs, or even to the entire 450k set of CpGs in the input data. To address this, we have examined methylation networks corresponding both to alternative epigenetic clocks and to sets of randomly chosen CpGs.

According to our results detailed in Section 2 in S1 Text, for the Skin-Blood clock (consisting of N = 391 CpGs) and for Hannum’s clock (having N = 71 CpGs) the networks obtained by applying the methodology presented here show a very similar behaviour compared to the methylation network of Horvath’s clock. First, the GRC of these two networks is significantly higher than the average GRC in random graph ensembles with the same degree distributions, as indicated by Figs B and C in S1 Text. Second, the control centrality of the nodes is in positive correlation with their hierarchy position determined by the m-reach in both systems, as shown by Figs D and E in S1 Text. Finally, when applying the framework of methylation level perturbations governed by Eqs (3)–(5), the expected change in the estimated age is usually higher than average if the perturbed node is chosen from the top part of the hierarchy with larger control centrality (Figs F and G in S1 Text), again, in a similar fashion to what we have demonstrated for the methylation network of Horvath’s clock here.

In parallel with the Skin-Blood clock and Hannum’s clock, we also examined the properties of methylation networks where random sets of CpGs (with equal size to the set of CpGs in Horvath’s clock) were selected from the available 450k CpGs in the data. Based on the results shown in Section 3 in S1 Text, these networks are also more hierarchical compared to their link randomised counterparts, although the difference between the average GRC value of the link randomised networks and the average GRC of the original networks is not as large (in the units of the standard deviation of the link randomised ensemble) as for the case of Horvath’s clock. The control centrality is again in positive correlation with the node position in the hierarchy determined by the m-reach, as indicated by Fig H in S1 Text.

The above results suggest that methylation networks obtained in our approach (based on regularised regression between the methylation levels) show hierarchical properties in general, which are positively correlated with the control centrality of the nodes. However, a remaining question of key interest is whether the nodes observed to be at the top of the hierarchy for small methylation networks could also play a leading role when considering e.g., the entire web of connections between the 450k CpGs in the data. Unfortunately, scaling up our approach to this size is computationally infeasible; nevertheless, we can still examine to what extent the hierarchy position of certain nodes is conserved when the set of CpGs defining the network is varied as follows. We have taken the top 10% of the nodes from the hierarchy seen for Horvath’s clock (Fig 3), and “mixed” them together with randomly chosen CpGs from the 450k methylation array, forming networks of the same size (353 nodes) as in the original case of Horvath’s clock. By applying the same methodology, we can examine the hierarchical properties of these mixed networks as well, with a special focus on the position of the nodes that were at the very top in case of the hierarchy corresponding to Horvath’s clock.

In Fig 8a we show the hierarchical layout of 50 mixed networks plotted on top of each other, where the CpGs from the original network based on Horvath’s clock are colored red. According to the figure, these nodes tend to be located at the higher levels in the mixed networks as well. In Fig 8b we display the analogous result when choosing the bottom 10% of the nodes from the hierarchy obtained for Horvath’s clock to be mixed with random CpGs. The layouts show that in such a case the CpGs from Horvath’s clock are located closer to the bottom of the hierarchy also for the mixed networks. These results indicate that the position of the nodes in the hierarchy is conserved to a considerable extent when the set of nodes defining the methylation network is varied.

Fig 8. Mixed hierarchies between randomly chosen CpGs where 10% of the nodes are from Horvath’s clock.

a) When choosing the top 10% of the nodes from the hierarchy for Horvath’s clock and mixing them with random CpGs, we obtain the hierarchies shown on the left, where 50 networks are projected on top of each other, with the nodes from Horvath’s clock coloured red. On the right we display the density distribution across the hierarchy levels in grey for all nodes and in red for the nodes from solely Horvath’s clock. b) The same as in panel a), but choosing the bottom 10% of the nodes from the hierarchy for Horvath’s clock.

Finally, we also examined the networks arising between the CpGs in Horvath’s clock when applying our pipeline to an alternative methylation input data set. As described in Section 4 in S1 Text, by analysing the data studied by Lehne et al. in Ref. [62] we arrived at qualitatively very similar results as in case of the present data set. Namely, the extracted network was significantly more hierarchical compared to its configuration model ensemble counterparts (Fig I in S1 Text), the control centrality and the m-reach were positively correlated (Fig J in S1 Text), and the best candidates for achieving a large change in the age predicted by Horvath’s clock via perturbing the methylation level were the CpGs with a high position in the hierarchy and a large control centrality (Fig K in S1 Text). Furthermore, in spite of the significant difference between the age distribution of the two patient cohorts (as shown by e.g., Fig L in S1 Text), the hierarchies obtained for the Lehne et al. data set and the Hannum et al. data set show a relatively high similarity (Figs M, N and O in S1 Text), marked by e.g., a C = 0.75 Pearson correlation coefficient between the m-reach calculated in the two methylation networks. This indicates yet again the robustness of our analysis framework, which is capable of extracting consistent hierarchies between the CpGs based on diverse input methylation data sets.

Discussion

According to our results, the methylation network between the 353 CpG dinucleotides of Horvath’s clock shows non-trivial hierarchical and control properties. Nodes with high standing in the hierarchy and/or a large control centrality are also likely to have more potential for inducing a large change in the estimated age when their methylation level is perturbed. In Table 1 we list the CpG dinucleotides (genes) that are in the top 20 according to either the change in the estimated age, the m-reach (position in the hierarchy), or the average control centrality. We can observe a high number of overlapping genes (11) among the top 20 genes of m-reach and the top 20 genes of control centrality, supporting the correlation between m-reach and control centrality in the network. In the subsection Biological role of top identified genes we briefly describe the role and function of a couple of the notable genes from the table, collected from the recent scientific literature.

Table 1. Top CpG dinucleotides according to our analysis.

We list the genes corresponding to the CpG dinucleotides that are in the top 20 according to either the change in the estimated age, |Δa|, the m-reach, rm, or the average control centrality, 〈c〉. The background colouring of the cells indicates the relative magnitude of the given value compared to the other CpG dinucleotides.

gene          |Δa|   rm     〈c〉     gene          |Δa|   rm     〈c〉
SCGN          5.74   0.48   0.71    PQLC1         1.56   0.84   0.8
BAZ2A         5.54   0.86   0.78    C16orf65-cg0  1.51   0.84   0.8
UCKL1         5.07   0.9    0.81    C3orf75-cg1   1.5    0.83   0.78
AGBL5         5.02   0.87   0.79    PAPOLG        1.5    0.94   0.8
CEBPD         4.09   0.62   0.79    KDM3A         1.3    0.71   0.8
RXRA          3.93   0.74   0.79    CXADR         1.29   0.67   0.8
C19orf30-cg0  3.93   0.54   0.71    GPR68         1.25   0.81   0.8
NHLRC1        3.78   0.29   0.67    DNASE2        0.98   0.88   0.8
PAWR-cg0      3.2    0.58   0.79    ABCA3         0.9    0.73   0.79
VGF           2.97   0.39   0.71    MN1           0.88   0.86   0.8
DPP8          2.87   0.5    0.71    MIB1          0.84   0.86   0.79
C21orf63      2.72   0.66   0.72    ELAC2         0.76   0.87   0.78
TNFRSF13C     2.45   0.66   0.79    C7orf44       0.66   0.84   0.79
TSSK6         2.43   0.71   0.78    LEPRE1        0.39   0.81   0.8
NR2F2         2.4    0.49   0.66    PPP1R16B      0.39   0.86   0.74
ABHD14A-cg0   2.32   0.37   0.71    ZHX1          0.31   0.81   0.79
CSNK1D        2.31   0.42   0.64    CYFIP1        0.24   0.84   0.81
AFF1          2.31   0.57   0.72    C7orf55       0.24   0.81   0.8
EIF3M         2.3    0.82   0.74    WFS1          0.21   0.76   0.8
C19orf30-cg1  2.07   0.39   0.71    BBS5          0.2    0.85   0.76
C14orf176     1.64   0.91   0.8     CATSPERG      0.16   0.95   0.8

The methylation sites and the related genes involved in Horvath’s clock bear remarkable power for age estimation, but that does not necessarily mean that they control the ageing process. Our network analysis identified a hierarchical control structure, but it should be kept in mind that this study was limited to the 353 CpG dinucleotides of Horvath’s clock, so the role of the top genes may be less important in an absolute sense, and more influential control genes may remain hidden. The system we examined can be viewed as a sub-graph of a much larger network: e.g., newer versions of methylation microarrays can measure methylation at 850,000 CpG dinucleotides, and whole genome bisulfite sequencing may observe millions. Thus, it is quite plausible that if more and more CpG dinucleotides are involved in a similar network based study, the additional interactions with the rest of this larger system might change the roles of the nodes in the focus of the present paper.

Due to the huge computational requirements, the repetition of our analysis on e.g., the network between the 850,000 CpG dinucleotides of current microarrays was not feasible; however, the extension of the work to larger CpG dinucleotide networks is the subject of further research. In the meantime, we have also examined the methylation networks corresponding to two further epigenetic clocks, the Skin-Blood clock [33] (with roughly the same number of CpGs as Horvath’s clock) and Hannum’s clock (smaller than Horvath’s clock), as well as methylation networks consisting of randomly chosen CpGs with the same size as Horvath’s clock, and the methylation network for the CpGs in Horvath’s clock based on the data set studied by Lehne et al. in Ref. [62]. According to the results detailed in the S1 Text, these networks display hierarchical and control properties quite similar to what we have observed for Horvath’s clock based on the data studied by Hannum et al. in Ref. [28]. In addition, the alternative hierarchy obtained from the data set studied by Lehne et al. showed a reasonably high similarity with the hierarchy described in Figs 2 and 3. These findings indicate that methylation networks obtained in our framework can be expected to be hierarchical (with coupled control properties) in general. Furthermore, our analysis of networks where the top (or bottom) 10% of the CpGs from the hierarchy of Horvath’s clock were mixed with randomly chosen CpGs revealed that the position of the nodes in the hierarchy shows considerable conservation when the constituents of the network are varied. Based on this it is quite plausible that the nodes found to be close to the top in small hierarchies may play a more important role than the average in larger methylation networks as well.

In conclusion, we analysed the methylation network between the CpG dinucleotides appearing in Horvath’s clock. The inference of the connections between the CpGs building up this network is based on the observation that many ageing effects (such as the age-related myeloid skew [40], T cell exhaustion [41], polycomb target hypermethylation [20], bivalent domain hypermethylation [42], etc.) result in coordinated changes in the methylation profile. The links obtained in our approach can be interpreted as a simple linear fit of the correlated differences between the methylation profiles in the data. According to our analysis based on the m-reach and the GRC, the studied network is substantially more hierarchical compared to a random graph with the same degree distribution. In addition, the network displays interesting control properties as well, e.g., the nodes at the top of the hierarchy tend to have larger control centrality. We also studied the effect of methylation level perturbations on the estimated age in a framework where the perturbations were propagated over the methylation network. According to our analysis, the resulting modifications to the overall methylation pattern seemed to be more aligned with the natural direction of ageing in a high dimensional representation of the methylation state compared to isolated methylation changes. Furthermore, the network framework scenario can also provide a significantly larger change in the predicted age. Finally, the nodes with large potential for achieving a notable change in the predicted age are more frequent in the top part of the hierarchy according to m-reach, and seem to have a large control centrality as well. Thus, when perturbing the methylation network at nodes having an important role from the topological point of view, the resulting effect in the predicted age is likely to be more intense compared to the response obtained by modifying the methylation of ordinary nodes. 
These findings indicate that the network approach can bring new insight into methylation-related studies, providing a very interesting direction for further research.

Methods

Methylation data

Our study is based on the publicly available methylation data published by G. Hannum et al. in Ref. [28]. The authors used the data to build a DNA methylation-based age estimator, so they performed genome-wide methylomic profiling of a large number of individuals spanning a wide age range. The data set consists of samples from a healthy population of 426 Hispanic and 230 Caucasian individuals, aged 19 to 101 with a median age of 65 years. The study included 338 female and 318 male participants. The samples were taken as whole blood and processed according to the standard protocol of the Illumina Infinium HumanMethylation450 BeadChip, which quantifies DNA methylation levels as a fraction between zero and one (also known as beta value). The BeadChip measures the methylation levels at over 480k CpG sites, including the 353 CpG dinucleotides of Horvath’s clock. In our study we only used this subset of the methylation data as a 353 × 656 table, and additionally we extracted the age of the individuals from the metadata as a list of 656 integers. The complete methylation profiles and the metadata can be found in NCBI’s Gene Expression Omnibus (GEO) under accession number GSE40279.

Lasso-CV regression

When carrying out Lasso regression in general, the objective is to solve

$$\min_{\beta}\left\{\frac{1}{N}\left\|y - X\beta\right\|_2^2 + \alpha\left\|\beta\right\|_1\right\}, \tag{6}$$

where y denotes the outcome (response variable), X is the matrix of the regressors (feature variables), β corresponds to the regression coefficients, N is the number of observations, α is a parameter, and ‖⋅‖1 and ‖⋅‖2 denote the L1 and L2 norms, respectively. The advantage of this approach is that, due to the second term, a large fraction of the regression coefficients become exactly zero, leaving only the most relevant predictor variables in the model.

In order to obtain more reliable results, the general technique of cross-validation can be combined with the idea of Lasso, hence the approach is usually referred to as Lasso cross-validation regression. The basic idea is to distribute the data into a number of ‘folds’ at random, and carry out the above regression on each fold separately. The results are then tested on the whole data set, and the best fit is chosen as the final solution to the regression problem. In our studies we used a 10 fold cross-validation when applying Lasso regression to infer connections between the CpG dinucleotides of Horvath’s clock, corresponding to a standard choice for the number of folds.
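To make the objective in (6) and the role of cross-validation more concrete, the following is a minimal pure-Python sketch, not the implementation used in this study. It assumes no intercept term (features and response are taken to be centred), solves (6) by cyclic coordinate descent with soft-thresholding, and selects α by the common k-fold scheme in which each fold is held out once for scoring while the model is fitted on the remaining folds; this scheme may differ in detail from the procedure summarised above. The function names, the candidate α values and the toy data are all hypothetical.

```python
import random

def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/N)*||y - X b||_2^2 + alpha*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that leaves feature j out
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n))
            new = soft_threshold(rho, alpha * n / 2.0) / col_sq[j]
            diff = new - beta[j]
            if diff != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * diff
                beta[j] = new
    return beta

def mse(X, y, beta):
    """Mean squared prediction error of the linear model beta."""
    errors = [(yi - sum(x * b for x, b in zip(row, beta))) ** 2 for row, yi in zip(X, y)]
    return sum(errors) / len(errors)

def lasso_cv(X, y, alphas, k=10, seed=0):
    """Choose alpha by k-fold held-out error, then refit on all data."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    best_alpha, best_err = None, float("inf")
    for alpha in alphas:
        err = 0.0
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], alpha)
            err += mse([X[i] for i in held_out], [y[i] for i in held_out], beta)
        if err / k < best_err:
            best_alpha, best_err = alpha, err / k
    return best_alpha, lasso_cd(X, y, best_alpha)

# Toy data (hypothetical): the response depends on the first of three features only.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * row[0] + rng.gauss(0, 0.1) for row in X]
alpha, beta = lasso_cv(X, y, alphas=[0.01, 0.1, 0.5], k=10)
print(alpha, beta)
```

In the setting of this paper, y would be the methylation level of one CpG and the columns of X the methylation levels of the other CpGs, so that the non-zero coefficients define the incoming links of that node; this mapping is our reading of the text rather than code from the authors.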

Finding the optimal weight threshold based on network efficiency

As mentioned in Results, the methylation network obtained from Lasso-CV is still relatively dense (although a considerable part of the regression coefficients is zero), with the average degree taking a value of 〈k〉 = 97.6. In order to make the network sparser, we followed a standard procedure in complex network theory and applied a link weight threshold w*, discarding the weak connections where wij < w* and keeping only the strongest, most relevant links. For choosing the optimal value for w*, we use the general method introduced in Ref. [60], designed to find the best link weight threshold in dense biological networks.

The key idea of this method is to monitor the changes in a quality function based on network efficiency as a function of the link density in the network, and choose the settings where this function is maximal. The quality function is given by

$$J = \frac{E_g + E_l}{\rho_L}, \tag{7}$$

where Eg and El denote the global and the local efficiency of the network, and ρL is equal to the link density (ρL = L/[N(N − 1)], with L denoting the total number of links). The global efficiency is defined as the average of the inverse distance between all node pairs in the network [63], whereas the local efficiency is the average of Eg over the neighbourhoods of the individual nodes. The global efficiency reads

$$E_g = \frac{1}{N(N-1)}\sum_{i \neq j}\frac{1}{d_{ij}}, \tag{8}$$

where dij stands for the length of the shortest path from i to j. Accordingly, Eg takes high values when the majority of the nodes are close to each other in the network sense. The local efficiency is given by

$$E_l = \frac{1}{N}\sum_{i=1}^{N} E_g(i), \tag{9}$$

where Eg(i) denotes the global efficiency calculated for the sub-graph between the neighbours of node i (where node i itself is absent from the sub-graph).
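The quantities in (7), (8) and (9) can be computed with breadth-first searches, as in the following minimal sketch. Several choices here are assumptions rather than details taken from the text: path lengths are counted in hops on the thresholded, unweighted directed network, unreachable pairs contribute zero to the sum in (8), and the neighbours of a node are taken to be the nodes linked to it in either direction. All function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source along directed links."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(nodes, adj):
    """Eq (8): mean of 1/d_ij over ordered pairs; unreachable pairs add 0."""
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for i in nodes:
        dist = bfs_distances(adj, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))

def induced(adj, keep):
    """Sub-graph restricted to the node set keep."""
    return {u: {v for v in adj.get(u, ()) if v in keep} for u in keep}

def local_efficiency(nodes, adj):
    """Eq (9): mean over i of E_g on the sub-graph of the neighbours of i."""
    total = 0.0
    for i in nodes:
        # Assumption: neighbours are nodes linked to i in either direction
        nbrs = set(adj.get(i, ())) | {u for u in nodes if i in adj.get(u, ())}
        nbrs.discard(i)
        total += global_efficiency(nbrs, induced(adj, nbrs))
    return total / len(nodes)

def quality(nodes, adj):
    """Eq (7): J = (E_g + E_l) / rho_L, with rho_L = L / [N (N - 1)]."""
    n = len(nodes)
    n_links = sum(len(adj.get(u, ())) for u in nodes)
    if n_links == 0:
        return 0.0
    rho = n_links / (n * (n - 1))
    return (global_efficiency(nodes, adj) + local_efficiency(nodes, adj)) / rho

def threshold(weighted_links, w_star):
    """Discard links with weight below w_star; weighted_links maps (i, j) to w_ij."""
    adj = {}
    for (i, j), w in weighted_links.items():
        if w >= w_star:
            adj.setdefault(i, set()).add(j)
    return adj

# Toy weights (hypothetical): scan two thresholds and compare J.
links = {(0, 1): 0.9, (1, 0): 0.8, (1, 2): 0.7, (2, 1): 0.7,
         (0, 2): 0.9, (2, 0): 0.6, (0, 3): 0.2}
for w_star in (0.1, 0.5):
    print(w_star, quality([0, 1, 2, 3], threshold(links, w_star)))
```

Scanning w* over a grid and keeping the value with the largest J would reproduce the selection logic described above, under the stated assumptions.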

In Fig 9 we show J as a function of w*, displaying a single maximum at w* = 0.641. According to that, we used this value for thresholding the connections in the methylation network.

Fig 9. Efficiency as a function of the weight threshold.

The efficiency J obtained from (7) as a function of the weight threshold w*. According to the plot, the optimal value of w* is at w* = 0.641.

Hierarchy and reach

To analyse the hierarchical properties of the obtained methylation network we can use the concept of the Global Reaching Centrality hierarchy measure [52, 55, 61], which is based on the reach of the nodes. The basic idea behind this approach is that for leaders at the top of a hierarchy it should be easy to reach the rest of the nodes via relatively short paths, whereas this is not the case for the bottom nodes. Thus, by comparing the fraction of nodes reachable in at most m steps from a given node (called the m-reach of the node) we can judge which nodes should be positioned high in the hierarchy (the ones with a large m-reach), and which nodes are supposed to be at the lower levels (the nodes with small m-reach values). In addition, based on the inhomogeneity of the m-reach distribution we can also quantify the extent of hierarchy in the organization of the network. The Global Reaching Centrality is defined along this idea as

$$\mathrm{GRC}(m) = \frac{1}{N-1}\sum_{i=1}^{N}\left[r_{m,\max} - r_m(i)\right], \tag{10}$$

where rm(i) is the m-reach of node i, and rm,max is the maximum value of the m-reach in the network. The original version of the GRC in Ref. [61] was defined using the total reach corresponding to m = ∞; however, it later turned out that lower values of m might work better in a number of networks [52, 55]. A large value of the GRC(m) is obtained for inhomogeneous m-reach distributions, indicating a strong hierarchical organization in the network structure, whereas low values of the GRC(m) usually correspond to non-hierarchical networks where the m-reach of the nodes is more or less the same throughout the system.
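A minimal sketch of the m-reach and of the GRC in (10) is given below. It assumes that the m-reach is normalised by N − 1, i.e. by the number of other nodes (the text only specifies "fraction of nodes"), and it uses a depth-limited breadth-first search on a directed adjacency dictionary. The function names and the two toy graphs are hypothetical.

```python
from collections import deque

def m_reach(adj, nodes, source, m):
    """Fraction of the other N - 1 nodes reachable from source in at most m steps."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if depth[u] == m:
            continue  # do not expand beyond m steps
        for v in adj.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return (len(depth) - 1) / (len(nodes) - 1)

def grc(adj, nodes, m):
    """Eq (10): average gap between the largest m-reach and each node's m-reach."""
    reach = {i: m_reach(adj, nodes, i, m) for i in nodes}
    r_max = max(reach.values())
    value = sum(r_max - r for r in reach.values()) / (len(nodes) - 1)
    return value, reach

# Toy graphs (hypothetical): a directed star and a directed chain on four nodes.
nodes = [0, 1, 2, 3]
star = {0: [1, 2, 3]}
chain = {0: [1], 1: [2], 2: [3]}
print(grc(star, nodes, m=3)[0])   # maximally hierarchical: one node reaches all others
print(grc(chain, nodes, m=3)[0])  # the reach decreases gradually along the chain
```

The star, where a single node reaches everyone and the others reach no one, attains the largest possible value of (10), whereas a network in which every node has the same m-reach gives zero.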

When comparing the GRC(m) measured in the original methylation network to random networks with the same degree distribution, the basic idea is to take the configuration model [64] as a sort of ‘base line’, and examine whether we see deviations from this random graph base line in the real network. In the configuration model a random graph ensemble is defined based on the degree sequence (corresponding to the list of degrees appearing in the network), and all possible realisations of graphs with the given degree sequence are considered equally probable. In practice, samples from the ensemble are generated by randomising the original network, where the degrees of the nodes are left intact. In our studies we used link randomisation, where in each step a pair of links is chosen at random, and one end of these links is swapped. The total number of randomisation steps was set such that the average number of rewirings per link was 10 for each random graph sample.
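The link randomisation step can be sketched as follows. This is an illustration under assumptions that the text does not spell out: the target ends of two randomly chosen directed links are exchanged, proposals that would create a self-loop or a duplicate link are rejected, and the budget is expressed as attempted swaps per link (so rejected proposals count towards it and the number of accepted rewirings is only approximately controlled). The function name and its parameters are hypothetical.

```python
import random

def randomise_links(links, attempts_per_link=10, seed=None):
    """Degree-preserving randomisation of a directed link list by swapping targets."""
    rng = random.Random(seed)
    links = list(links)
    present = set(links)
    for _ in range(attempts_per_link * len(links)):
        a, b = rng.sample(range(len(links)), 2)
        (u, v), (x, y) = links[a], links[b]
        # Proposed swap: u->v and x->y become u->y and x->v
        if u == y or x == v:
            continue  # would create a self-loop
        if (u, y) in present or (x, v) in present:
            continue  # would create a duplicate link (or change nothing)
        present -= {(u, v), (x, y)}
        present |= {(u, y), (x, v)}
        links[a], links[b] = (u, y), (x, v)
    return links

# Toy network (hypothetical)
original = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
print(randomise_links(original, seed=42))
```

Because each accepted swap only exchanges the targets of two links, every node keeps its in-degree and out-degree, which is the property required of samples from the configuration model ensemble.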

Network control and control centrality

In the linear control theory of networks we assume a time-dependent state variable xi(t) assigned to each node i, governed by the differential equation

$$\frac{dx_i}{dt} = \sum_{j=1}^{N} A_{ij}\,x_j(t) + \sum_{q=1}^{Q} B_{iq}\,u_q(t), \tag{11}$$

where Aij is the adjacency matrix of the network, and Biq is an N by Q external input matrix, where the input variables uq(t) can be chosen at will. The nodes actually receiving external input (for which at least one Biq is non-zero) are called driver nodes. The system is controllable if, by an appropriate choice of the input variables uq, we can drive the system from any initial state to any desired state [65–67]. One of the fundamental results of structural controllability is that a directed weighted network is controllable if and only if its unweighted counterpart is controllable [58] (except for ‘singular’ distributions of the link weights, which however can be considered as a zero measure set among all possible link weight assignments). Thus, when studying the control properties of a directed real network, we can assume all the non-zero weights to be equal to 1, thereby effectively turning the network into an un-weighted one, and concentrate solely on the network structure.
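As background (a textbook result, stated here for orientation rather than derived in this paper), controllability of a linear system of the form (11) with time-independent A and B is classically tested by Kalman's rank condition. Writing (11) in matrix form with the N × N matrix A and the N × Q matrix B, the condition reads

$$\dot{\mathbf{x}}(t) = A\,\mathbf{x}(t) + B\,\mathbf{u}(t), \qquad \operatorname{rank}\left[B,\ AB,\ A^{2}B,\ \dots,\ A^{N-1}B\right] = N.$$

Structural controllability asks whether this rank condition can be satisfied for some choice of the non-zero entries of A and B, which is why only the pattern of links, and not their weights, enters the analysis.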

Structural controllability is very closely related to the matching problem in networks [58]. Matching in its most intuitive and original form is defined for bipartite graphs, where we have two node sets (e.g., ‘top’ and ‘bottom’ or ‘left’ and ‘right’), and links can only connect nodes from different sets. A matching is a subset of non-adjacent links, where link adjacency means a common end point. This non-adjacency property means that these links provide a unique one-to-one correspondence between the involved nodes (hence the name matching). In a perfect matching we can find a match for every node (where obviously, the two sets of nodes have to be of equal size to make this possible). In general we can look for the maximal matching, where the number of links in the matching is maximal. This is usually not unique, i.e., multiple different non-adjacent link sets may have the same maximal number of links. An efficient way to find a possible maximal matching of a bipartite graph is given by the Hopcroft-Karp algorithm [68].

The concept of matching can be extended from bipartite graphs to a general directed network by taking the nodes with at least one out-link and treating them as the ‘top’ set, whereas the nodes with at least one in-link are considered as the ‘bottom’ set. Although the nodes with both in- and out-links appear in both sets, this is not a problem, since there is a one to one correspondence between the links in the original directed network and the links in the newly defined bipartite graph. Once we have found a maximal matching in the bipartite graph, we can map it back onto the original directed network, where the nodes with an incoming matching link are considered as matched nodes and nodes without any incoming matching link are considered un-matched.

According to Ref. [58], for any directed network with a perfect matching the minimum number of driver nodes nD needed to fully control the network is nD = 1 (with an arbitrary choice of the driver node), whereas if the network has no perfect matching, nD is equal to the number of unmatched nodes for any maximal matching, and the driver nodes are actually corresponding to the unmatched nodes. Therefore, the controllability of a network and the minimum number of driver nodes needed to fully control the system can be obtained by solving the matching problem on the same graph. However, since the maximal matching is usually not unique, it can easily happen that the same node is considered as a driver node for one particular maximal matching, and as a controlled node for another maximal matching.

In order to quantify the importance of a given node from the point of view of control, the concept of control centrality was introduced in Ref. [59], circumventing the above mentioned ambiguity in the actual role of the node over the different possible maximal matchings. To calculate the control centrality C(i) for a given node i, first we have to consider the sub-graph that is reachable from i (by following recursively the out-links), and add an incoming external control link to i, making it a driver node. The control centrality of i is then given by the maximum number of controlled nodes in the reachable sub-graph (the number of links in the maximum matching for this sub-graph). The intuitive meaning of C(i) is that it corresponds to the maximum number of nodes that we can drive in the system by controlling the given node using an external signal. The relative control centrality used in the present work is simply C(i) normalised by the number of nodes as c(i) = C(i)/N. In our studies of the control centrality we used the Hopcroft-Karp algorithm [68] for calculating the number of links in the maximum matching for the reachable sub-graphs of the CpG dinucleotides in the methylation network.
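The procedure described above can be sketched in a few lines. This is an illustration and not the code used in the study: for brevity it finds the maximum matching with simple augmenting paths (Kuhn's algorithm) instead of Hopcroft-Karp, which yields a matching of the same size but is slower on large graphs, and it assumes that the added external control link is counted among the matching links. The names `EXTERNAL`, `reachable`, `max_matching_size` and `control_centrality` are hypothetical.

```python
def reachable(adj, source):
    """Nodes reachable from source by following out-links (source included)."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def max_matching_size(out_links):
    """Maximum matching size in the bipartite graph of link tails versus link heads."""
    matched_tail = {}  # head node -> tail node currently matched to it

    def try_augment(u, visited):
        # Recursive search for an augmenting path starting at tail u
        for v in out_links.get(u, ()):
            if v in visited:
                continue
            visited.add(v)
            if v not in matched_tail or try_augment(matched_tail[v], visited):
                matched_tail[v] = u
                return True
        return False

    return sum(1 for u in out_links if try_augment(u, set()))

EXTERNAL = "__external_input__"  # hypothetical label for the added control signal

def control_centrality(adj, node, n_total):
    """Return (C(i), c(i) = C(i)/N) for the given node."""
    sub = reachable(adj, node)
    out_links = {u: [v for v in adj.get(u, ()) if v in sub] for u in sub}
    out_links[EXTERNAL] = [node]  # external control link pointing at the node
    size = max_matching_size(out_links)
    return size, size / n_total

# Toy graphs (hypothetical): a directed chain and a directed star.
chain = {0: [1], 1: [2]}
star = {0: [1, 2]}
print(control_centrality(chain, 0, 3))  # the head of a chain can drive all three nodes
print(control_centrality(star, 0, 3))   # the hub can drive itself and one leaf
```

The star illustrates why a high out-degree alone does not guarantee a large control centrality: both leaves compete for the single matching link leaving the hub. The recursive search is adequate for small graphs; for networks with long paths an iterative or Hopcroft-Karp implementation would be preferable.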

Biological role of top identified genes

One CpG dinucleotide (cg02047577), in the UCKL1 gene, was common in all three top 20 lists. The protein encoded by the UCKL1 gene is a uridine kinase and it is involved in the pyrimidine metabolism pathway. Methylation at this CpG site in the UCKL1 gene negatively correlates with age and it has been shown that the gene expression of UCKL1 is increasing during ageing in ovary [69] and skin [70].

Also worth highlighting is the BAZ2A gene, which is an essential component of the NoRC complex. This complex mediates the silencing of a fraction of ribosomal DNA [71] and heterochromatin formation at centromeres and telomeres [72]. In the complex, BAZ2A plays a central role by being involved in the recruitment of chromatin modifying enzymes, such as HDAC1, DNMTs and ISWI-ATPase nucleosomal constriction machinery, resulting in collaborative silencing [73].

A comprehensive methylation analysis of CpG sites in DNA from blood cells identified 102 age-associated CpGs and showed a positive correlation between hypermethylation of SCGN and age [74]. The SCGN gene is involved in the Ca, cAMP and Lipid Signaling pathway which regulates various cellular functions, including cell growth, cell differentiation, gene transcription and protein expression [75].

In a study examining DNA methylation of human brain tissue samples, a CpG site (cg14424579) in the AGBL5 gene showed a highly significant positive correlation with chronological age [76]. The protein, encoded by the AGBL5 gene, has a deglutamylase activity and it has been indicated to be involved in the regulation of the immune response to DNA viruses [77] which may be related to the well known age-associated changes of the immune system and the increased susceptibility of elderly individuals to infectious diseases. TNFRSF13C (Tumour Necrosis Factor Receptor Superfamily Member 13C) is the principal receptor required for BAFF-mediated B-cell maturation and survival which may be connected to the age-associated changes in the B-cell lineage [78]. The protein encoded by the TNFRSF13C gene plays a role in multiple essential pathways, such as the Cytokine-cytokine receptor interaction pathway and the NF-κB pathway. The latter is known to be one of the key mediators of ageing activated by genotoxic, oxidative and inflammatory stress [79].

The protein encoded by the CEBPD gene is an important transcription factor regulating the expression of genes involved in immune and inflammatory responses and may be involved in the regulation of genes associated with activation and/or differentiation of macrophages. Furthermore, the RXRA (Retinoid X Receptor Alpha) protein acts as a transcription factor involved in the regulation of gene expression in various biological processes; for example, it plays a role in the attenuation of the innate immune system in response to viral infections [80] and is involved in the regulation of calcium signalling and cellular senescence [81], which is a known hallmark of ageing. The protein encoded by the NR2F2 gene is a ligand-activated transcription factor that is involved in the regulation of many different genes within various pathways, such as the Oct4 in Mammalian ESC Pluripotency and the Regulation of Telomerase. Interestingly, CEBPD, RXRA and NR2F2 are super-enhancer-associated transcription factors identified in multiple cell types [82]. Super-enhancers are clusters of enhancers in the mammalian genome bound by a number of transcription factors and coactivators driving high-level expression of key regulators of cell identity, although showing exceeding vulnerability to perturbation of their components [83]. In a recent study, it has been shown that site-specific demethylation of CEBPB/D-dependent adipogenic super-enhancers mediated by the GADD45α-ING1 complex directly controls energy metabolism and ageing in mice [84]. A subset of super-enhancer-associated transcription factors, called the Yamanaka factors (Oct3/4, Sox2, Klf4, c-Myc), have a critical role in the regulation of the developmental signalling network [85] and the overexpression of these factors in mouse fibroblasts induces them to become pluripotent stem cells that can differentiate into almost any other cell type [86]. 
Reprogramming of cells with the Yamanaka factors causes epigenetic changes, and the molecular markers of ageing can be slowed down and even reversed by reprogramming [87]. In addition, the Horvath epigenetic clock is reset to zero in induced pluripotent stem cells, as in embryonic stem cells [5]. Taken together, we may speculate that DNA methylation changes in CEBPD, RXRA and NR2F2, as super-enhancer-associated transcription factors, may contribute to the dysregulation of developmental genes and the loss of cellular identity observed during ageing.

Supporting information

S1 Text

Figure A: Expected changes in the estimated age.
Figure B: Hierarchy of the methylation network based on the Skin-Blood clock.
Figure C: Hierarchy of the methylation network based on Hannum’s clock.
Figure D: Control centrality and reach in the network defined based on the Skin-Blood clock.
Figure E: Control centrality and reach in the network defined based on Hannum’s clock.
Figure F: Scatter plot of the expected change in the estimated age as a function of the m-reach and the control centrality for the network based on the Skin-Blood clock.
Figure G: Scatter plot of the expected change in the estimated age as a function of the m-reach and the control centrality for the network based on Hannum’s clock.
Figure H: Hierarchy of the methylation networks based on randomly chosen CpGs.
Figure I: Control centrality and reach in methylation networks based on randomly chosen CpGs.
Figure J: Hierarchy of the methylation networks based on the data set studied by Lehne et al.
Figure K: Control centrality and reach in methylation networks based on the data set studied by Lehne et al.
Figure L: Scatter plot of the expected change in the estimated age as a function of the m-reach and the control centrality for the network based on the data studied by Lehne et al.
Figure M: Correlation between the m-reach obtained in the networks based on the Hannum et al. and the Lehne et al. datasets.
Figure N: Distribution of the top 10% of the CpGs according to the Lehne et al. hierarchy in the hierarchy based on the Hannum et al. network.
Figure O: Distribution of the top 10% of the CpGs according to the Hannum et al. hierarchy in the hierarchy based on the Lehne et al. network.
Figure P: Age distribution of the patients in the studied data sets.

(PDF)

Data Availability

The complete methylation profiles and the metadata we use are publicly available in NCBI’s Gene Expression Omnibus (GEO) under accession number GSE40279.

Funding Statement

The research was partially supported by the Velux Foundation 00018310 (JB), by the Hungarian National Research, Development and Innovation Office K128780 (PG, PP) and NVKP_16-1-2016-0004 (JB, AM, BM, ICs), by the Research Excellence Programme of the Ministry for Innovation and Technology in Hungary, within the framework of the Digital Biomarker thematic programme of the Semmelweis University (GP, PP), and by the NRDI Office within the framework of the Artificial Intelligence National Laboratory Program (ICs). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Baker G, Sprott R. Biomarkers of aging. Exp Gerontol. 1988;23:223–239. doi: 10.1016/0531-5565(88)90025-3
  • 2. Warner HR. The future of aging interventions. J Gerontol A Biol Sci Med Sci. 2004;59:B692–B696. doi: 10.1093/gerona/59.7.B692
  • 3. Jylhävä J, Pedersen NL, Hägg S. Biological Age Predictors. EBioMedicine. 2017;21:29–36. doi: 10.1016/j.ebiom.2017.03.046
  • 4. Field AE, Wang T, Havas A, Ideker T, Adams PD. DNA Methylation Clocks in Aging: Categories, Causes, and Consequences. Mol Cell. 2018;71:882–895. doi: 10.1016/j.molcel.2018.08.008
  • 5. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:R115. doi: 10.1186/gb-2013-14-10-r115
  • 6. Jylhävä J, Pedersen NL, Hägg S. Biological age predictors. EBioMedicine. 2017;21:29–36. doi: 10.1016/j.ebiom.2017.03.046
  • 7. Lee HY, Lee SD, Shin KJ. Forensic DNA methylation profiling from evidence material for investigative leads. BMB Rep. 2016;49:359–369. doi: 10.5483/BMBRep.2016.49.7.070
  • 8. Levine ME, Lu AT, Chen BH, Hernandez DG, Singleton AB, Ferrucci L, et al. Menopause accelerates biological aging. Proc Natl Acad Sci USA. 2016;113:9327–9332. doi: 10.1073/pnas.1604558113
  • 9. Huh CJ, Zhang B, Victor MB, Dahiya S, Batista LF, Horvath S, et al. Maintenance of age in human neurons generated by microRNA-based neuronal conversion of fibroblasts. eLife. 2016;5:e18648. doi: 10.7554/eLife.18648
  • 10. Horvath S, Gurven M, Levine ME, Trumble BC, Kaplan H, Allayee H, et al. An epigenetic clock analysis of race/ethnicity, sex, and coronary heart disease. Genome Biology. 2016;17(1):171. doi: 10.1186/s13059-016-1030-0
  • 11. Quach A, Levine ME, Tanaka T, Lu AT, Chen BH, Ferrucci L, et al. Epigenetic clock analysis of diet, exercise, education, and lifestyle factors. Aging. 2017;9(2):419–446. doi: 10.18632/aging.101168
  • 12. Horvath S, Garagnani P, Bacalini MG, Pirazzini C, Salvioli S, Gentilini D, et al. Accelerated epigenetic aging in Down syndrome. Aging Cell. 2015;14(3):491–495. doi: 10.1111/acel.12325
  • 13. Berdyshev G, Korotaev G, Boiarskikh G, Vaniushin B. Nucleotide composition of DNA and RNA from somatic tissues of humpback and its changes during spawning. Biokhimiia. 1967;31:988–993.
  • 14. Ahuja N, Li Q, Mohan AL, Baylin SB, Issa JP. Aging and DNA methylation in colorectal mucosa and cancer. Cancer Res. 1998;58:5489–5494.
  • 15. Fraga MF, Esteller M. Epigenetics and aging: the targets and the marks. Trends in Genetics. 2007;23(8):413–418. doi: 10.1016/j.tig.2007.05.008
  • 16. Bollati V, Schwartz J, Wright R, Litonjua A, Tarantini L, Suh H, et al. Decline in genomic DNA methylation through aging in a cohort of elderly subjects. Mechanisms of Ageing and Development. 2009;130(4):234–239. doi: 10.1016/j.mad.2008.12.003
  • 17. Christensen BC, Houseman EA, Marsit CJ, Zheng S, Wrensch MR, Wiemels JL, et al. Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genet. 2009;5:e1000602. doi: 10.1371/journal.pgen.1000602
  • 18. Rodríguez-Rodero S, Fernández-Morera J, Fernandez A, Menéndez-Torre E, Fraga M. Epigenetic regulation of aging. Discov Med. 2010;10:225–233.
  • 19. Murgatroyd C, Wu Y, Bockmühl Y, Spengler D. The Janus face of DNA methylation in aging. Aging. 2010;2(2):107–110. doi: 10.18632/aging.100124
  • 20. Teschendorff AE, Menon U, Gentry-Maharaj A, Ramus SJ, Weisenberger DJ, Shen H, et al. Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Research. 2010;20(4):440–446. doi: 10.1101/gr.103606.109
  • 21. Bell JT, Tsai PC, Yang TP, Pidsley R, Nisbet J, Glass D, et al. Epigenome-wide scans identify differentially methylated regions for age and age-related phenotypes in a healthy ageing population. PLoS Genet. 2012;8:e1002629. doi: 10.1371/journal.pgen.1002629
  • 22. Zheng SC, Widschwendter M, Teschendorff AE. Epigenetic drift, epigenetic clocks and cancer risk. Epigenomics. 2016;8(5):705–719. doi: 10.2217/epi-2015-0017
  • 23. Feil R, Fraga MF. Epigenetics and the Environment: Emerging Patterns and Implications. Nat Rev Genet. 2012;13:97–109. doi: 10.1038/nrg3142
  • 24. Slieker RC, van Iterson M, Luijk R, Beekman M, Zhernakova DV, Moed MH, et al. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biol. 2016;17:191. doi: 10.1186/s13059-016-1053-6
  • 25. Horvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nature Reviews Genetics. 2018;19:371–384. doi: 10.1038/s41576-018-0004-3
  • 26. Bell CG, Lowe R, Adams PD, Baccarelli AA, Beck S, Bell JT, et al. DNA methylation aging clocks: challenges and recommendations. Genome Biol. 2019;20:249. doi: 10.1186/s13059-019-1824-y
  • 27. Bocklandt S, Lin W, Sehl ME, Sánchez FJ, Sinsheimer JS, Horvath S, et al. Epigenetic Predictor of Age. PLoS ONE. 2011;6:e14821. doi: 10.1371/journal.pone.0014821
  • 28. Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, et al. Genome-wide Methylation Profiles Reveal Quantitative Views of Human Aging Rates. Molecular Cell. 2013;49(2):359–367. doi: 10.1016/j.molcel.2012.10.016
  • 29. Bernstein BE, Stamatoyannopoulos JA, Costello JF, Ren B, Milosavljevic A, Meissner A, et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010;28:1045–1048. doi: 10.1038/nbt1010-1045
  • 30. Li Y, Zhu J, Tian G, Li N, Li Q, Ye M, et al. The DNA Methylome of Human Peripheral Blood Mononuclear Cells. PLOS Biology. 2010;8(11):1–9. doi: 10.1371/journal.pbio.1000533
  • 31. Thompson RF, Atzmon G, Gheorghe C, Liang HQ, Lowes C, Greally JM, et al. Tissue-specific dysregulation of DNA methylation in aging. Aging Cell. 2010;9(4):506–518. doi: 10.1111/j.1474-9726.2010.00577.x
  • 32. Baubec T, Schübeler D. Genomic patterns and context specific interpretation of DNA methylation. Current Opinion in Genetics & Development. 2014;25:85–92. doi: 10.1016/j.gde.2013.11.015
  • 33. Horvath S, Oshima J, Martin GM, Lu AT, Quach A, Cohen H, et al. Epigenetic clock for skin and blood cells applied to Hutchinson Gilford Progeria Syndrome and ex vivo studies. Aging. 2018;10:1758–1775. doi: 10.18632/aging.101508
  • 34. Zhang Q, Vallerga CL, Walker RM, Lin T, Henders AK, Montgomery GW, et al. Improved precision of epigenetic clock estimates across tissues and its implication for biological ageing. Genome Med. 2019;11:54. doi: 10.1186/s13073-019-0667-1
  • 35. Levine ME, Lu AT, Quach A, Chen BH, Assimes TL, Bandinelli S, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging. 2018;10:573–591. doi: 10.18632/aging.101414
  • 36. Lu AT, Quach A, Wilson JG, Reiner AP, Aviv A, Raj K, et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging. 2019;11:303–327. doi: 10.18632/aging.101684
  • 37. Fahy GM, Brooke RT, Watson JP, Good Z, Vasanawala SS, Maecker H, et al. Reversal of epigenetic aging and immunosenescent trends in humans. Aging Cell. 2019;18(6):e13028. doi: 10.1111/acel.13028
  • 38. Garagnani P, Bacalini MG, Pirazzini C, Gori D, Giuliani C, Mari D, et al. Methylation of ELOVL2 gene as a new epigenetic marker of age. Aging Cell. 2012;11(6):1132–1134. doi: 10.1111/acel.12005
  • 39. Slieker RC, Relton CL, Gaunt TR, Slagboom PE, Heijmans BT. Age-related DNA methylation changes are tissue-specific with ELOVL2 promoter methylation as exception. Epigenetics & Chromatin. 2018;11:25. doi: 10.1186/s13072-018-0191-3
  • 40. Rimmelé P, Bigarella CL, Liang R, Izac B, Dieguez-Gonzalez R, Barbet G, et al. Aging-like phenotype and defective lineage specification in SIRT1-deleted hematopoietic stem and progenitor cells. Stem Cell Reports. 2014;3:44–59. doi: 10.1016/j.stemcr.2014.04.015
  • 41. Lappalainen T, Greally JM. Associating cellular epigenetic models with human phenotypes. Nat Rev Genet. 2017;18:441–451. doi: 10.1038/nrg.2017.32
  • 42. Rakyan VK, Down TA, Maslau S, Andrew T, Yang TP, Beyan H, et al. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Research. 2010;20(4):434–439. doi: 10.1101/gr.103101.109
  • 43. Albert R, Barabási AL. Statistical mechanics of complex networks. Rev Mod Phys. 2002;74:47–97. doi: 10.1103/RevModPhys.74.47
  • 44. Newman MEJ, Barabási AL, Watts DJ, editors. The Structure and Dynamics of Networks. Princeton and Oxford: Princeton University Press; 2006.
  • 45. Holme P, Saramäki J, editors. Temporal Networks. Berlin: Springer; 2013.
  • 46. Barrat A, Barthelemy M, Vespignani A. Dynamical processes on complex networks. Cambridge: Cambridge University Press; 2008.
  • 47. Zafeiris A, Vicsek T. Why We Live in Hierarchies? A Quantitative Treatise. Berlin: Springer; 2018.
  • 48. Ma HW, Buer J, Zeng AP. Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach. BMC Bioinformatics. 2004;5:199. doi: 10.1186/1471-2105-5-199
  • 49. Nagy M, Akos Z, Biro D, Vicsek T. Hierarchical group dynamics in pigeon flocks. Nature. 2010;464:890–893. doi: 10.1038/nature08891
  • 50. Fushing H, McAssey MP, Beisner B, McCowan B. Ranking network of captive rhesus macaque society: A sophisticated corporative kingdom. PLoS ONE. 2011;6:e17817. doi: 10.1371/journal.pone.0017817
  • 51. Tibély G, Sousa-Rodrigues D, Pollner P, Palla G. Comparing the Hierarchy of Keywords in On-Line News Portals. PLoS ONE. 2016;11:e0165728. doi: 10.1371/journal.pone.0165728
  • 52. Palla G, Tibély G, Mones E, Pollner P, Vicsek T. Hierarchical networks of scientific journals. Palgrave Communications. 2015;1:15016. doi: 10.1057/palcomms.2015.16
  • 53. Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A. Self-similar community structure in a network of human interactions. Phys Rev E. 2003;68:065103. doi: 10.1103/PhysRevE.68.065103
  • 54. Pollner P, Palla G, Vicsek T. Preferential attachment of communities: The same principle, but a higher level. Europhys Lett. 2006;73:478–484. doi: 10.1209/epl/i2005-10414-6
  • 55. Tóth BJ, Palla G, Mones E, Havadi G, Páll N, Pollner P, et al. Emergence of Leader-Follower Hierarchy Among Players in an On-Line Experiment. In: 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM); 2018. p. 1184–1190.
  • 56. Wickens J, Ulanowicz R. On quantifying hierarchical connections in ecology. J Soc Biol Struct. 1988;11:369–378. doi: 10.1016/0140-1750(88)90066-8
  • 57. Eldredge N. Unfinished Synthesis: Biological Hierarchies and Modern Evolutionary Thought. New York: Oxford Univ. Press; 1985.
  • 58. Liu YY, Slotine JJ, Barabási AL. Controllability of complex networks. Nature. 2011;473:167–173. doi: 10.1038/nature10011
  • 59. Liu YY, Slotine JJ, Barabási AL. Control Centrality and Hierarchical Structure in Complex Networks. PLoS ONE. 2012;7:e44459. doi: 10.1371/journal.pone.0044459
  • 60. De Vico Fallani F, Latora V, Chavez M. A Topological Criterion for Filtering Information in Complex Brain Networks. PLoS Comput Biol. 2017;13:e1005305. doi: 10.1371/journal.pcbi.1005305
  • 61. Mones E, Vicsek L, Vicsek T. Hierarchy Measure for Complex Networks. PLoS ONE. 2012;7:e33799. doi: 10.1371/journal.pone.0033799
  • 62. Lehne B, Drong AW, Loh M, Zhang W, Scott WR, Tan ST, et al. A coherent approach for analysis of the Illumina HumanMethylation450 BeadChip improves data quality and performance in epigenome-wide association studies. Genome Biol. 2015;16:37. doi: 10.1186/s13059-015-0600-x
  • 63. Latora V, Marchiori M. Efficient Behavior of Small-World Networks. Phys Rev Lett. 2001;87:198701. doi: 10.1103/PhysRevLett.87.198701
  • 64. Molloy M, Reed B. A critical point for random graphs with a given degree sequence. Random Structures & Algorithms. 1995;6:161–180. doi: 10.1002/rsa.3240060204
  • 65. Kalman RE. Mathematical description of linear dynamical systems. J Soc Indus and Appl Math Ser A. 1963;1:152. doi: 10.1137/0301010
  • 66. Luenberger DG. Introduction to Dynamic Systems: Theory, Models, & Applications. New York: John Wiley & Sons; 1979.
  • 67. Slotine JJ, Li W. Applied Nonlinear Control. Prentice-Hall; 1991.
  • 68. Hopcroft J, Karp R. An n^{5/2} Algorithm for Maximum Matchings in Bipartite Graphs. SIAM Journal on Computing. 1973;2(4):225–231. doi: 10.1137/0202019
  • 69. Grøndahl ML, Yding Andersen C, Bogstad J, Nielsen FC, Meinertz H, Borup R. Gene expression profiles of single human mature oocytes in relation to age. Human Reproduction. 2010;25(4):957–968. doi: 10.1093/humrep/deq014
  • 70. Glass D, Viñuela A, Davies MN, Ramasamy A, Parts L, Knowles D, et al. Gene expression changes with age in skin, adipose tissue, blood and brain. Genome Biology. 2013;14(7). doi: 10.1186/gb-2013-14-7-r75
  • 71. Anosova I, Melnik S, Tripsianes K, Kateb F, Grummt I, Sattler M. A novel RNA binding surface of the TAM domain of TIP5/BAZ2A mediates epigenetic regulation of rRNA genes. Nucleic Acids Research. 2015;43(10):5208–5220. doi: 10.1093/nar/gkv365
  • 72. Postepska-Igielska A, Krunic D, Schmitt N, Greulich-Bode KM, Boukamp P, Grummt I. The chromatin remodelling complex NoRC safeguards genome stability by heterochromatin formation at telomeres and centromeres. EMBO reports. 2013;14(8):704–710. doi: 10.1038/embor.2013.87
  • 73. Mazzio EA, Soliman KFA. Basic concepts of epigenetics. Epigenetics. 2012;7(2):119–130. doi: 10.4161/epi.7.2.18764
  • 74. Weidner CI, Lin Q, Koch CM, Eisele L, Beier F, Ziegler P, et al. Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biology. 2014;15(2). doi: 10.1186/gb-2014-15-2-r24
  • 75. Yan K, Gao L, Cui Y, Zhang Y, Zhou X. The cyclic AMP signaling pathway: Exploring targets for successful drug discovery (Review). Molecular Medicine Reports. 2016;13(5):3715–3723. doi: 10.3892/mmr.2016.5005
  • 76. Hernandez DG, Nalls MA, Gibbs JR, Arepalli S, van der Brug M, Chong S, et al. Distinct DNA methylation changes highly correlated with chronological age in the human brain. Human Molecular Genetics. 2011;20:1164–72. doi: 10.1093/hmg/ddq561
  • 77. Xia P, Ye B, Wang S, Zhu X, Du Y, Xiong Z, et al. Glutamylation of the DNA sensor cGAS regulates its binding and synthase activity in antiviral immunity. Nature Immunology. 2016;17:369–378. doi: 10.1038/ni.3356
  • 78. Cancro MP, Hao Y, Scholz JL, Riley RL, Frasca D, Dunn-Walters DK, et al. B cells and aging: molecules and mechanisms. Trends in Immunology. 2009;30(7):313–318. doi: 10.1016/j.it.2009.04.005
  • 79. Kanigur Sultuybek G, Soydas T. NF-kappaB as the mediator of metformin’s effect on ageing and ageing-related diseases. Clin Exp Pharmacol Physiol. 2019;46(5):413–422. doi: 10.1111/1440-1681.13073
  • 80. Ma F, Liu SY, Razani B, Arora N, Li B, Kagechika H, et al. Retinoid X receptor alpha attenuates host antiviral response by suppressing type I interferon. Nature communications. 2014;5:5494. doi: 10.1038/ncomms6494
  • 81. Ma X, Warnier M, Raynard C, Ferrand M, Kirsh O, Defossez PA, et al. The nuclear receptor RXRA controls cellular senescence by regulating calcium signaling. Aging Cell. 2018;17(6):e12831. doi: 10.1111/acel.12831
  • 82. Hnisz D, Abraham BJ, Lee TI, Lau A, Saint-André V, Sigova AA, et al. Super-enhancers in the control of cell identity and disease. Cell. 2013;155(4):934–947. doi: 10.1016/j.cell.2013.09.053
  • 83. Whyte WA, Orlando DA, Hnisz D, Abraham BJ, Lin CY, Kagey MH, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153(2):307–319. doi: 10.1016/j.cell.2013.03.035
  • 84. Schäfer A, Mekker B, Mallick M, Vastolo V, Karaulanov E, Sebastian D, et al. Impaired DNA demethylation of C/EBP sites causes premature aging. Genes & Development. 2018;32(11-12):742–762. doi: 10.1101/gad.311969.118
  • 85. Liu X, Huang J, Chen T, Wang Y, Xin S, Li J, et al. Yamanaka factors critically regulate the developmental signaling network in mouse embryonic stem cells. Cell Research. 2008;18(12):1177–89. doi: 10.1038/cr.2008.309
  • 86. Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126(4). doi: 10.1016/j.cell.2006.07.024
  • 87. Rando TA, Chang HY. Aging, Rejuvenation, and Epigenetic Reprogramming: Resetting the Aging Clock. Cell. 2012;148(1):46–57. doi: 10.1016/j.cell.2012.01.003
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009327.r001

Decision Letter 0

Douglas A Lauffenburger, Ilya Ioshikhes

18 Jun 2020

Dear Dr Pollner,

Thank you very much for submitting your manuscript "Hierarchy and control of ageing-related methylation networks" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ilya Ioshikhes

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Please see attachment

Palla et al. have produced a manuscript entitled “Hierarchy and control of ageing-related methylation networks.” In this paper they have extracted an interaction network from the CpGs employed in the Horvath clock. Unsurprisingly, this shows that some hierarchical organisation is present. They then go on to discuss how modifying the clock will lead to age ‘reversal’. Unfortunately, there are significant issues in the design of this study and the conclusions drawn from it, due to an imprecise understanding of the ageing biology of the epigenome, as well as of the construction and interpretation of the Horvath clock. The researchers have performed a network analysis focused only on the 353 CpGs from this specific clock, without acknowledging that these CpGs themselves are not uniquely special in regard to their functionality. The discussion of age-reversal gives these methylation sites a definitively active role in the ageing process which they do not possess. My concerns are listed below.

Major

1. The statements in the Abstract that it is “plausible to assume that by proper adjustment of these switches age may be tuned” and that “biological clock can be changed or even reversed” are counter to the current understanding of the field and imply that the clock itself is driving ageing rather than being a ‘biomarker’ of the ageing process and the plethora of ageing-related changes it is capturing [1]. The clock itself is used to measure the impact of potential interventions [2].

2. Furthermore, the statement that “adjustment of one leads to a cascade of changes at other sites” is not surprising if one understands what biological and connected epigenetic changes will be represented, as in this case of blood-tissue-derived DNA [3].

3. The statement in the Abstract and elsewhere that ‘we locate the most important CpGs’ ignores the fact that they limit their analysis to only the 353 CpGs from the total DNA methylome of 28 million CpGs to begin with. As Horvath has stated, there is no evidence that the CpGs in the Horvath clock are especially functional over and above many other CpGs, and reasonable clocks can be constructed from even a random selection of CpGs; there are abundant potential CpGs that can be exploited in clocks [3]. The statements “largest influence” and “which may also play a crucial role in the process of ageing” (Introduction, Line 94) again imply that this small fraction of 353 CpGs is uniquely special [4].

4. Age-related change in the DNA methylome is in fact widespread, with up to 15–30% of all CpG sites in the genome associated with age-related changes, and these are not all called ‘clock CpGs’ (Introduction line 18). Change can occur in a random fashion due to epigenomic drift [5], be directional, or show increased variability with age [6]. Also, the statement regarding the directionality of “clock CpGs that are hypermethylated” (Introduction line 35) is an oversimplification. Teschendorff et al. identified, in an early promoter-focused array, an enrichment for age-related CpGs that were hypermethylating in Polycomb target gene promoters, but genome-wide hypomethylation predominates. Both hypo- and hypermethylated loci contribute to the various published clocks.

5. The statement in the Introduction that there are “connections between the CpGs themselves” (line 75) is as expected. Clearly all well-known ageing effects lead to co-ordinated changes across the entire DNA methylome. These include those driven by cell-type specific epigenomics, where changes in cell proportion will lead to variation (including the age-related myeloid skew [7] and T cell exhaustion) [8], polycomb target hypermethylation [9], bivalent domain hypermethylation [10], etc. These known systemic effects will be seen as networks of age-related change.

6. Distinct biological processes drive the observed age-related hypermethylation and hypomethylation. Furthermore, the baseline DNA methylation state is strongly driven by genetics, being highly CpG-density dependent [11].

7. The statement (line 53) that “we cannot really point out any of these CpGs as being more important than others” is completely expected given the way that the elastic net regression Horvath clock was designed. CpGs were selected not for their individual strength but chosen for their power to work collectively to parsimoniously capture ageing over the lifecourse. In fact, this is clearly demonstrated by the fact that the strongest and most robust individual CpG pan-tissue changes, from the ELOVL2 locus [12,13], were not included in the clock. Additionally, an accurate clock has been devised using just 3 CpGs [14].

8. The discussion of “control properties” of CpGs is consistent with the Elastic Net picking those CpGs that work well together. Thus, the results regarding network identification and properties have ignored this and the limited set of CpGs from which the network has been extracted, e.g. Results (line 112). Why were not all the ~850,000 CpGs from the EPIC array analysed in the network analysis rather than just 353? Conclusion statements regarding how a “network approach can bring new insight into methylation-related studies, providing a very interesting direction for further research” (Line 389) are clearly limited when restricted to only these 353 CpGs and when known biology is not taken into account.

9. The authors need to explain and understand more precisely what the concept of ‘biological age’ and predictors of this represent [15]. The initial Horvath clock was devised as an attempt at a ‘pan-tissue’ clock (in which it was highly successful, although caveats remain [16,17]). It is in fact a ‘composite’ clock [3], capturing both forensic and biological age but neither perfectly. The authors need to understand and integrate the current knowledge and issues regarding DNA methylation clocks, as discussed recently by the epigenomics community [4].

10. The statements regarding “Modifying the predicted age by perturbing the methylation network” need to be put in the context that they are interpreting a ‘biomarker’ of biological ageing.

11. It is unclear what “more aligned with the 'natural direction of ageing'” (Line 283) means biologically.

12. In the Discussion the statement ‘Horvath's clock is showing non-trivial hierarchical and control properties’ – how is this unexpected? Furthermore, how would that be different from a random selection of array-derived CpG probes?

13. The statements regarding the functional implications of individual CpGs in the Discussion need to be more clearly caveated [8].

14. In the Conclusion (line 374) the statement “substantially more hierarchical compared to a random Graph” does not take into consideration the biological nature of these data.

Minor

1. English needs correcting throughout manuscript

2. Abstract – Grammar - “…biomarkers of ageing”

3. “specific CpG pairs” line 20 – CpG ‘dinucleotides’ is usually stated as more precise

4. Spelling line 33 - DNA methylation

5. Gene names are by convention written in italics – e.g. UCKL1 gene (line 314) etc.

1. Horvath, S. & Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat Rev Genet 19, 371-384 (2018).

2. Fahy, G.M. et al. Reversal of epigenetic aging and immunosenescent trends in humans. Aging Cell 18, e13028 (2019).

3. Field, A.E. et al. DNA Methylation Clocks in Aging: Categories, Causes, and Consequences. Mol Cell 71, 882-895 (2018).

4. Bell, C.G. et al. DNA methylation Aging Clocks: Challenges & Recommendations. Genome Biology (2019).

5. Feil, R. & Fraga, M.F. Epigenetics and the environment: emerging patterns and implications. Nature reviews. Genetics 13, 97-109 (2011).

6. Slieker, R.C. et al. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biol 17, 191 (2016).

7. Rimmelé, P. et al. Aging-like Phenotype and Defective Lineage Specification in SIRT1-Deleted Hematopoietic Stem and Progenitor Cells. Stem Cell Reports 3, 44-59 (2014).

8. Lappalainen, T. & Greally, J.M. Associating cellular epigenetic models with human phenotypes. Nat Rev Genet 18, 441-451 (2017).

9. Teschendorff, A.E. et al. Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Res 20, 440-6 (2010).

10. Rakyan, V.K. et al. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Res 20, 434-9 (2010).

11. Baubec, T. & Schübeler, D. Genomic patterns and context specific interpretation of DNA methylation. Current Opinion in Genetics & Development 25, 85-92 (2014).

12. Garagnani, P. et al. Methylation of ELOVL2 gene as a new epigenetic marker of age. Aging Cell 11, 1132-4 (2012).

13. Slieker, R.C., Relton, C.L., Gaunt, T.R., Slagboom, P.E. & Heijmans, B.T. Age-related DNA methylation changes are tissue-specific with ELOVL2 promoter methylation as exception. Epigenetics & Chromatin 11, 25 (2018).

14. Weidner, C.I. et al. Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biol 15, R24 (2014).

15. Jylhävä, J., Pedersen, N.L. & Hägg, S. Biological Age Predictors. EBioMedicine 21, 29-36 (2017).

16. Horvath, S. et al. Epigenetic clock for skin and blood cells applied to Hutchinson Gilford Progeria Syndrome and ex vivo studies. Aging (Albany NY) 10, 1758-1775 (2018).

17. Zhang, Q. et al. Improved precision of epigenetic clock estimates across tissues and its implication for biological ageing. Genome Med 11, 54 (2019).

Reviewer #2: The study by Palla et al., entitled “Hierarchy and control age-related methylation networks”, revealed that the age-related CpGs are interconnected, with dynamic methylation change on one CpG probably leading to a cascade of changes at the other sites. It provided a framework to explore the key methylation sites during ageing process, which might be applied to other biomarkers/biological processes. This study is interesting but remains too preliminary, as the authors only focused on the 353 Horvath’s “clock CpGs”. To better understand the issue raised in the study, a comprehensive analysis of CpGs involved in ageing and age-related phenotypes/diseases should be considered by collecting more methylation data. In this case, the manuscript needs to be revised thoroughly before considering for publication.

Major concerns:

1) The Introduction section is poorly summarized. Authors need to simplify the content and clarify the background and purpose of the study.

2) Evidence supporting the leading roles of identified CpGs during ageing is insufficient. For example, the training model should be tested in multiple datasets. And, if possible, it will be great if some functional assays are performed.

3) How shall we view the key CpGs’ roles in ageing? It’s hard to determine whether the methylation status is the result or the cause of ageing process. Authors should discuss this in Discussion section.

4) Can the training model be used to scan the key CpGs that control various biological processes?

5) Is there any correlation between a certain CpG’s methylation status and its hierarchy level? (For example, sites located on higher levels may also have lower methylation values.)

6) The network is based on only 353 sites. When considering the CpGs across whole genome, the perturbation results may be different or even opposite. Authors may consider adding some data or results to demonstrate the robustness of the perturbation results. Generally, authors should, at least, provide evidence showing that the 353 “clock sites” are less affected by “non-clock sites”.

Minor concerns:

7) The Formula 4 doesn’t render properly in the ms for reviewers.

8) The Discussion section seems too long, it talks too much on genes’ functions. Authors may move and summarize these contents into the Results section.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

Attachment

Submitted filename: Palla et al.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009327.r003

Decision Letter 1

Douglas A Lauffenburger, Ilya Ioshikhes

21 Sep 2020

Dear Dr. Pollner,

Thank you very much for submitting your manuscript "Hierarchy and control of ageing-related methylation networks" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

It is mandatory that the reviewers will be satisfied by the revisions made in this round. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ilya Ioshikhes

Associate Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: See Attachment

Response to Reviewer 1:

We thank the Referee for the careful and detailed examination of the manuscript and the extremely valuable comments, which have indeed helped make our paper better. We are truly grateful for the 17 bibliographic references included in the report, which we now also cite in the revised version of the paper. Our detailed answers to the points raised are the following:

Palla et al. have produced a manuscript entitled “Hierarchy and control of ageing-related methylation networks.” In this paper they have extracted an interaction network from the CpGs employed in the Horvath clock. Unsurprisingly, this shows some hierarchical organisation is present. They then go on to discuss how modifying the clock will lead to age ‘reversal’. Unfortunately, there are significant issues in the design and conclusions drawn from this study, due to imprecise understanding of the ageing biology of the epigenome, as well as the construction and interpretation of the Horvath clock. The researchers have performed a network analysis focused only on the 353 CpGs from this specific clock, without acknowledging that these CpGs themselves are not uniquely special in regard to their functionality. The discussion of age-reversal gives these methylation sites a definitely active role in the ageing process which they do not possess.

My concerns are listed below.

Major

1. The statements in the Abstract that it is “plausible to assume that by proper adjustment of these switches age may be tuned” and that “biological clock can be changed or even reversed” are counter to the current understanding of the field and imply that the clock itself is driving ageing rather than a ‘biomarker’ of the ageing process and the plethora of ageing-related changes it is capturing [1]. The clock itself is used to measure the impact of potential interventions [2].

We thank the referee for pointing out this possible misunderstanding. It is widely accepted and demonstrated by various epigenome editing studies that DNA methylation is one of the most important factors that control gene expression, activation and splicing, hence many of the biological processes of living systems. We agree that methylation is only one of the possible factors that control the ageing process and also that the 353 CpGs in Horvath's clock give only a very small subset of them. We have reworded the abstract and explicitly stated that correlation is not equivalent to causation. We have also acknowledged that we demonstrate our approach only on a small set of CpG sites and that, to get biologically relevant control nodes, the analysis should be extended to all methylation sites. In the revised version of the manuscript we mention the use of epigenetic clocks for the measurement of the impact of a thymus regeneration protocol as described in Ref. [2], whereas Ref. [1] was cited already in the original submission.

• Good to acknowledge that causation differs from correlation. The point regarding ageing also specifically refers to the biological processes that are observed in blood as DNA methylation variation – as listed in regard to point 5 below.

2. Furthermore, the statement that “adjustment of one leads to a cascade of changes at other sites” is not surprising if one understands what biological and connected epigenetic changes will be represented, as in this case of blood tissue derived DNA [3].

We agree, living things are complex interconnected systems. One of our goals with this paper was to emphasise this fact and to make the first step from the widely used linear models toward a network model that may capture some of the complexities. We have reworded the cited sentence to avoid the false interpretation, and inserted a citation to Ref. [3] from the referee report into the Introduction.

• The point is not the complex system interconnectedness - but again what the multiple ageing-related changes in DNA methylation represent in blood (cell type changes etc.)

3. The statement in Abstract and elsewhere that “we locate the most important CpGs” ignores the fact that they limit their analysis to only the 353 CpGs from the total DNA methylome of 28 million CpGs to begin with. As Horvath has stated there is no evidence that the CpGs in the Horvath clock are especially functional over and above many other CpGs and reasonable clocks can be constructed from even a random selection of CpGs - there are abundant potential CpGs that can be exploited in clocks [3]. The statement “largest in influence” and “which may also play a crucial role in the process of ageing” (Introduction, Line 94) again implies that this small fraction of 353 CpGs is uniquely special [4].

We have refined the mentioned statements so that they refer specifically to the studied subset of CpGs in the revised version, and put a caveat at the end of the Introduction to remind the reader that the analysis should be extended to get relevant results. (Ref. [4] from the referee report has also been incorporated into the manuscript, as described in the answer to Major point no. 9.)

• Good to acknowledge this caveat to this analysis.

4. Age-related change in DNA methylome is in fact widespread with up to 15-30% of all CpG sites in the genome associated with age-related changes and these are not all called ‘clock CpGs’ (Introduction line 18). Change can occur in a random fashion due to epigenomic drift [5], be directional, or show increased variability with age [6]. Also, the statement regarding the directionality of “clock CpGs that are hypermethylated” (Introduction line 35) is an oversimplification. Teschendorff et al. identified an enrichment in an early promoter-focused array for age-related CpGs that were hypermethylating in the Targets of Polycomb Target gene promoters, but genome-wide hypomethylation predominates. Both hypo- and hypermethylated loci contribute to the various published clocks.

We have rephrased the part of the text introducing the clock CpGs, now mentioning that age-related CpGs are actually quite common, and that not all of them are called clock CpGs. The revised version of the manuscript now cites Refs. [5,6] from the referee report. We also replaced 'hypermethylation' by 'age related change' in the sentence referring to the work by Teschendorff et al.

• The term ‘clock CpGs’ implies that these specific CpGs possess special properties. Therefore, this term only leads to confusion and would be better to remove – there are multiple CpGs in the DNA methylome that could be included, or not, into differently constructed DNA methylation clocks [Field et al].

5. The statement in the Introduction that there are “connections between the CpGs themselves” (line 75) is as expected. Clearly all well-known ageing effects lead to co-ordinated changes across the entire DNA methylome; these include those driven by cell-type specific epigenomics where changes in cell proportion will lead to variation (including the age-related myeloid skew [7], T cell exhaustion) [8], polycomb target hypermethylation [9], bivalent domain hypermethylation [10], etc. These known systemic effects will be seen as networks of age-related change.

We are especially grateful for this comment, providing extra support for the networked approach we use to study DNA methylation and ageing. This is now incorporated into the text (together with the references), however at a somewhat earlier point, where we first mention connections between the CpGs.

• Good to now include this information.

6. Distinct biological processes drive the observed age-related hypermethylation and hypomethylation. Furthermore, the baseline DNA methylation state is strongly driven by genetics being highly CpG density dependent [11].

We included this important point (together with the reference) in the revised version where we list the difficulties of constructing multi-tissue DNA methylation-based age estimators.

• Good to now include this information.

7. The statement (line 53) that “we cannot really point out any of these CpGs as being more important than others” is completely as expected given the way that the elastic net regression Horvath clock was designed. CpGs were selected not for their individual strength but chosen for their power to work collectively to parsimoniously capture ageing over the lifecourse. In fact, this is clearly demonstrated by the fact the strongest and most robust individual CpG pan-tissue changes from the ELOVL2 locus [12,13] were not included in the clock. Additionally, an accurate clock has been devised using just 3 CpGs [14].

We agree that this statement is somewhat evident; nevertheless, we would like to keep it in the Introduction to help non-expert readers understand the basis of our study. The sentence before this statement already mentioned that the correlation between age and the methylation of individual CpGs from Horvath's clock is weak; we have rephrased this sentence based on this comment, now citing Refs. [12,13] from the referee report. Ref. [14] from the referee report was already cited in the original manuscript as Ref. [51] in the Discussion.

• However, the authors should include at this point why the specific methodology (elastic net) employed in the construction of the Horvath clock would contribute to this observation.

8. The discussion of “control properties” of CpGs is consistent with the Elastic Net picking those CpGs that work well together. Thus, the results regarding network identification and properties have ignored this and the limited CpGs this has been extracted from e.g. Results (line 112). Why were not all the 850,000 CpGs from the EPIC array analysed in the network analysis rather than just 353? Conclusion statements regarding how a “network approach can bring new insight into methylation-related studies, providing a very interesting direction for further research” (Line 389) are clearly limited when restricted to only these 353 CpGs and known biology not taken into account.

Analysing 850k (new EPIC array) or even 27k CpGs (older methylation array) is unfortunately not feasible computationally, due to the combinatorial explosion of the all-to-all nature of our analysis. This was the main reason why we have used only this limited set. In the updated version we call the readers' attention to this limitation. The network we analysed can be viewed as a small sub-graph from the several orders of magnitude larger system of the whole methylome. A relevant related question is how the interesting hierarchical and control properties we observed change when we scale up the network size. During the review process, as a first step, we have repeated our analysis on a network roughly 10 times larger, obtained as follows. We took the 353 CpG dinucleotides in Horvath's clock one by one as a response variable, and carried out Lasso regressions on the whole 450K CpG array appearing in the input data, where we marked the regressors (CpGs) obtaining a non-zero coefficient at least once. These marked CpGs along with the 353 CpGs in Horvath's clock defined an extended set of nodes, counting altogether 2036 CpGs. Among this larger set of nodes, the links were obtained based on LassoCV regression, following the network construction method described in the paper. We thresholded the links based on the absolute value of the regression coefficients to ensure that the average degree of the extended network becomes the same as in the case of the original network studied in the paper. The results of the hierarchy analysis on this extended network are shown in Fig. 1. As we can see, this network is again significantly more hierarchical compared to its random configuration model counterparts, similarly to the original network studied in the paper. Furthermore, the outcome of the control centrality analysis, shown in Fig. 2, also resembled the results we obtained for the network based solely on Horvath's clock.
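The thresholding step described in this answer can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `threshold_links`, the dictionary format for the regression coefficients, the toy CpG labels and the convention that average degree means links per node are all hypothetical.

```python
def threshold_links(coefs, n_nodes, target_avg_degree):
    """Keep the strongest links so that (#links / #nodes) does not exceed the target.

    coefs: dict mapping (source, target) -> regression coefficient
           (a hypothetical format chosen for this sketch).
    """
    # Number of links allowed by the target average degree.
    n_keep = int(target_avg_degree * n_nodes)
    # Rank candidate links by absolute coefficient, strongest first.
    ranked = sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    # Drop exact zeros: Lasso sets the coefficients of unused regressors to zero.
    ranked = [(edge, w) for edge, w in ranked if w != 0.0]
    return dict(ranked[:n_keep])


# Toy coefficients between four hypothetical CpGs.
toy = {
    ("cg_a", "cg_b"): 0.80,
    ("cg_b", "cg_c"): -0.55,
    ("cg_c", "cg_d"): 0.10,
    ("cg_a", "cg_d"): -0.02,
    ("cg_d", "cg_a"): 0.00,
}
links = threshold_links(toy, n_nodes=4, target_avg_degree=0.5)
print(sorted(links))
```

Ranking by absolute value treats strong negative and strong positive coefficients alike, which matches the description of thresholding on the absolute value of the regression coefficients.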

• It would be good to demonstrate these in other clocks such as Hannum et al., GrimAge, SkinBlood etc. to reinforce the biology.

9. The authors need to explain and understand more precisely what the concept of ‘biological age’ and predictors of this represent [15]. The initial Horvath clock was devised as an attempt at a ‘pan-tissue’ clock (which it was highly successful in although caveats remain [16,17]). It is in fact a ‘composite’ clock [3] capturing both forensic and biological age but neither perfectly. The authors need to understand and integrate the current knowledge and issues regarding DNA methylation clocks - as discussed recently by the epigenomics community [4].

We revised the part in the Introduction mentioning the 'biological age' according to Refs.[3,15] in the referee report, which are now also cited in the manuscript. In addition, beside the success of Horvath's clock, we now mention the existence of related caveats together with citing Refs[16,17] from the referee report. Finally, key challenges and issues discussed in Ref.[4] of the referee report are also listed in the revised version (together with a citation to the paper).

• Good to acknowledge and expand on this important point.

10. The statements regarding “Modifying the predicted age by perturbing the methylation network” need to be put in the context that they are interpreting a ‘biomarker’ of biological ageing.

We have checked that we always refer to the adjustment of the "estimated" or "predicted" age and not true biological age. As indicated in the answers for other questions, we have put caveats concerning the interpretation both into the Introduction and Discussion.

• Good to include these caveats.

11. Unclear what “more aligned with the 'natural direction of ageing'.” (Line 283) means biologically?

The methylation values can be considered as coordinates of a multidimensional vector space. E.g., if we consider the 353 CpGs, it will be a 353-dimensional space. Each patient's methylation measurement is a point in this space. Since methylation values are not random, the points do not cover the whole space; rather, they are constrained to a (potentially curved) subspace. Projection techniques like the linear PCA or the recently popular non-linear t-SNE can reveal the most extended directions and are widely used to visualise the most important features of a high-dimensional data set. The principal directions can often be interpreted as biological features. For example, the regression techniques used for age estimation identify such a linear subspace. Changing a few methylation values would move points along the vector spanned by the linear combination of the corresponding axes, but the resulting position of the point may not necessarily stay on the "biologically allowed" subspace. As methylation values are part of an interacting network, a change of one value cannot happen in isolation. In this part of the paper we describe this and show that taking into account the cascading changes on our control network leads to changes that keep the points on the "biologically allowed" subspace, in contrast to isolated changes (without following control cascades) that move points away from the subspace.
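A toy calculation may make the geometric argument concrete. The sketch below is an assumption-laden illustration, not the model used in the paper: the "biologically allowed" set is taken to be a straight line through the origin in three dimensions, and the names `distance_from_line`, `direction`, `sample`, `isolated` and `cascaded` are hypothetical.

```python
import math


def distance_from_line(point, direction):
    """Euclidean distance of `point` from the line through the origin along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    # Length of the projection of the point onto the line.
    proj = sum(p * u for p, u in zip(point, unit))
    # What is left after removing the projection is the off-line component.
    residual = [p - proj * u for p, u in zip(point, unit)]
    return math.sqrt(sum(r * r for r in residual))


direction = [1.0, 0.5, -0.5]  # hypothetical "allowed" direction (assumption)
sample = [0.4, 0.2, -0.2]     # a point lying on that line

# Isolated change: only the first coordinate is modified.
isolated = [sample[0] + 0.1, sample[1], sample[2]]
# Coordinated change: the other coordinates follow along the allowed direction.
cascaded = [s + 0.1 * d for s, d in zip(sample, direction)]

print(distance_from_line(isolated, direction))
print(distance_from_line(cascaded, direction))
```

In this toy setting the isolated change leaves the line while the coordinated change stays on it. A real methylation data set would have a higher-dimensional and possibly curved constraint set, so this only illustrates the direction of the argument.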

• The construction by elastic net will accentuate this interconnectedness.

12. In the Discussion the statement ‘Horvath's clock is showing non-trivial hierarchical and control properties’ - how is this unexpected? Furthermore, how would that be different from a random selection of array-derived CpG probes?

In this study we represent the system of CpG dinucleotides as a network, and although we do not expect this to behave as e.g., an Erdős-Rényi random graph, still, the non-trivial nature of the interrelations can in principle be manifested in several different ways. E.g., a network can be different from a random graph in terms of its degree distribution, can display a community structure (that is absent in random graphs), may show assortativity or disassortativity, etc. In our view, it is not straightforward that a network ought to have a hierarchic structure (accompanied by interesting control properties) just because it represents biological data. When considering a random baseline for comparison, we have to take into account that hierarchy measures are quite sensitive to the overall link density in networks. Based on that, we have chosen the configuration network ensemble to serve as the baseline, where the random graphs correspond to uniformly drawn samples from all possible graphs with the same degree sequence as the original network, as mentioned in the Results section related to Fig.2. In this way we cancel out any possible uncertainty in the GRC coming from either a change in the overall link density or from a difference in the degree distribution. Selecting random CpG probes is a very interesting idea, however, we would leave this to be the subject of further study, where also the size of the examined network might be increased (the first preliminary results of this analysis are described in the answer to Major comment no.8). Nevertheless, based on the results we have seen for the network of Horvath's clock, we expect both the entire network between all CpGs and randomly chosen sub-graphs from this to display hierarchical properties.
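One common way to obtain random graphs with the same degree sequence is repeated degree-preserving edge swapping. The sketch below illustrates that idea for a directed graph. It is an assumption that this resembles the sampling used for the baseline, since the response does not say how the ensemble was generated, and a finite number of swaps only approximates uniform sampling. The names `degree_preserving_shuffle` and `degrees` are hypothetical.

```python
import random


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomise a directed graph (no duplicate links) by swapping edge targets.

    Each accepted swap turns (a -> b), (c -> d) into (a -> d), (c -> b),
    which leaves every node's in-degree and out-degree unchanged.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Reject swaps that would create self-loops or duplicate links.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def degrees(edges):
    """Return (out-degree, in-degree) dictionaries for a list of directed edges."""
    out_deg, in_deg = {}, {}
    for s, t in edges:
        out_deg[s] = out_deg.get(s, 0) + 1
        in_deg[t] = in_deg.get(t, 0) + 1
    return out_deg, in_deg


original = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 0)]
shuffled = degree_preserving_shuffle(original, n_swaps=100)
print(degrees(original) == degrees(shuffled))
```

Because both the number of links and every node's degrees are fixed, any difference in a hierarchy measure between the original and the shuffled graphs cannot be attributed to link density or to the degree distribution.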

• A randomisation as well as analysis of the other clocks would help consolidate interpretation.

13. The statements regarding the functional implications of individual CpGs in the Discussion need to be more clearly caveated [8].

The description of the biological function of the genes was moved to the appendix (also because another referee found this part too long) and caveats were added.

• Good

14. In the Conclusion (line 374) the statement “substantially more hierarchical compared to a random Graph” does not take into consideration the biological nature of these data.

The concept of hierarchy in this work was introduced from a network theoretic point of view, e.g., the hierarchy measure we apply was used in social and technological networks as well in the literature. The random graph ensemble serving as a baseline preserves the degree distribution of the original network, thus, the most fundamental component of the network structure is not affected by the randomisation. In this light, the observation of a significantly higher GRC value in the original network compared to the random ensemble is already interesting from a pure network theoretic point of view. Nevertheless, we believe that this can be interesting for biologists as well, as it shows a non-trivial wiring between the CpG dinucleotides, where we can reach the majority of the network from nodes at the top of the hierarchy in just a few steps, whereas we cannot from bottom nodes.
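The reachability idea behind this answer can be sketched as follows, assuming that GRC denotes the global reaching centrality for unweighted directed graphs: the local reaching centrality of a node is the fraction of other nodes reachable from it, and the GRC is the sum of the differences from the maximum, divided by N - 1. The function names and toy graphs are illustrative and not taken from the manuscript.

```python
from collections import deque


def local_reaching_centrality(adj, node, n_nodes):
    """Fraction of the other nodes reachable from `node` along directed links."""
    seen = {node}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return (len(seen) - 1) / (n_nodes - 1)


def global_reaching_centrality(edges, nodes):
    """Sum of (max local reach - local reach) over all nodes, divided by N - 1."""
    nodes = list(nodes)
    adj = {}
    for s, t in edges:
        adj.setdefault(s, []).append(t)
    n = len(nodes)
    reach = [local_reaching_centrality(adj, v, n) for v in nodes]
    top = max(reach)
    return sum(top - r for r in reach) / (n - 1)


star = [(0, 1), (0, 2), (0, 3)]            # one source reaches everyone
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # every node reaches everyone
print(global_reaching_centrality(star, range(4)))
print(global_reaching_centrality(cycle, range(4)))
```

The star graph is maximally hierarchical under this measure and the directed cycle is not hierarchical at all, which mirrors the contrast between top and bottom nodes described in the answer.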

• The authors need to appreciate what these biological data represent to help explain why a hierarchical structure is observed.

Minor

• Good - all minor points have been corrected

Reviewer #2: What concerns me most is the biological significance and stability of the age-related methylation hierarchical relationships as described in this study. However, the authors still focused only on the 353 CpGs involved in Horvath’s clock, which is far from sufficient to represent the age-related CpG sites. More seriously, the authors still focused on and analyzed only one methylation dataset; this is far from enough to reach the conclusion: ‘the methylation hierarchical relationships really happen during ageing’. Although the authors stated that ‘Analysing 850k (new EPIC array) or even 27k CpGs (older methylation array) is unfortunately not feasible computationally, due to the combinatorial explosion of the all-to-all nature of our analysis’, this explanation is unacceptable to me. In any case, more independent methylation datasets (e.g., 450K) to replicate their results are indispensable. Overall, the stability of the findings is doubtful, which largely devalues the work.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Attachment

Submitted filename: pall_rereview_v1.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009327.r005

Decision Letter 2

Ilya Ioshikhes

20 Apr 2021

Dear Dr. Pollner,

Thank you very much for submitting your manuscript "Hierarchy and control of ageing-related methylation networks" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

In particular, Reviewer 2 again voiced substantial and very important critiques that must be addressed to this Reviewer's satisfaction, in order for your paper to be processed further.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Ilya Ioshikhes

Deputy Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Please see attachment

---

- Pollner et al. have now provided a second round of responses to reviewers’ queries, as below.

- The authors have, commendably, performed extensive additional work, including further analysis of other epigenetic ageing clocks and of random CpGs, and they have tempered statements regarding causation of these clock-related DNA methylation changes.

Response to Reviewer 1:

1. We hope that the amendments made in the previous and current version regarding points 1., 2., and 5., related to the relevant biological processes connected to methylation profile changes are now satisfactory.

2. We have made further changes to this part to avoid the possible interpretation that methylation level changes are the cause of ageing-related changes. In the current version we first mention the changes in cell proportion, including the age-related myeloid skew, T cell exhaustion, polycomb target hypermethylation, and bivalent domain hypermethylation, which lead to coordinated modifications of the entire methylome; these in turn can also be interpreted as a network of age-related changes in the methylation levels of the CpGs.

3. We thank the referee for accepting our response.

4. We have removed the term 'clock CpG' entirely from the paper.

5. We thank the referee for accepting our response.

6. We thank the referee for accepting our response.

7. In the new version we now explicitly mention when introducing Eq.(1) that Horvath's clock is based on elastic net regression. Furthermore, after the sentence referred to in this point we inserted a short description of the parameter selection in the elastic net approach.

8. We have applied our framework to the Skin-Blood clock and Hannum's clock, receiving very similar results as already shown for Horvath's clock. The corresponding Figures have been placed into the Supporting Information (Sect.S2, Figs.S2-S7.) now accompanying the paper. The analysis for both of these further epigenetic clocks has shown that the methylation network composed of their CpGs is hierarchical, where the control centrality of the nodes is in positive correlation with the position of the nodes in the hierarchy. In addition, the chance to achieve a larger expected change in the estimated age when perturbing the methylation levels seemed to be higher for nodes close to the top of the hierarchy with large control centrality. These results are in very clear analogy with the results we discuss for Horvath's clock in the main text.

9. We thank the referee for accepting our response.

10. We thank the referee for accepting our response.

11. We have added a short reminder for the readers about the fact that CpGs in Horvath's clock were selected using elastic net regression.

12. Besides the analysis of the Skin-Blood clock and Hannum's clock we have also studied methylation networks composed of randomly chosen CpGs with a fixed size equal to that of Horvath's clock (353 nodes). The corresponding results are presented in the Supporting Information (Sect.S3, Figs.S8-S9.), indicating that these networks display quite similar properties compared to the previously studied networks representing epigenetic clocks. On the one hand, the hierarchy measure (the GRC) in their link randomised counterparts is on average lower compared to the GRC value of the original network structure encoding the inferred relations between the methylation levels. On the other hand, the control centrality of the nodes is in positive correlation with their position in the hierarchy. When comparing the GRC value obtained for Horvath's clock with the GRC distribution of the random methylation network we can observe that the hierarchy measure for Horvath's clock is above the average at all studied m-parameters. However, its value is not an outlier: in units of the standard deviation σ of the random distribution, the difference is roughly between 1σ and 2σ, depending on m. Thus, the methylation network of Horvath's clock resembles a methylation network with random CpGs where the hierarchy of the system is somewhat larger than the average, but not outstandingly large. By putting together the results obtained for networks representing epigenetic clocks and for the networks based on CpGs chosen uniformly at random from the 450k methylation array, we can conclude that basically any methylation network constructed according to our framework can be expected to display a hierarchical structure accompanied by control centrality values in positive correlation with the node position in the hierarchy.
A remaining question of interest is whether the hierarchy rankings obtained for small networks have an indicative value for the importance of the nodes we would observe in larger methylation networks, where the size of the set of CpGs taken into consideration is extended. Relating to that, we have also examined mixed networks, where 10% of the CpGs were from the top of the hierarchy of Horvath's clock, and the rest of the nodes were chosen at random. The results (Fig.8. in the new version of the submission) show that hierarchy positions are conserved to a considerable extent across the different networks. This is promising for possible future research where the structure of larger parts of the methylome may be studied in small fragments analysed in a parallel fashion.

13. We thank the referee for accepting our response.

14. In regard to this point, this part of the Discussion in the new version of the manuscript now mentions the relevant ageing-related effects that are known to lead to coordinated methylation changes.
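The comparison described in point 12 (a hierarchy measure evaluated on a network and on its link randomised counterparts, with the difference expressed in units of σ) can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the GRC is the global reaching centrality computed from the m-reach (the number of nodes reachable in at most m directed steps), and that the null model simply redraws the same number of links between randomly chosen node pairs; the randomisation actually used in the paper may differ. All function names are hypothetical.

```python
import random
from collections import deque
from statistics import mean, pstdev


def m_reach(adj, source, m):
    """Number of nodes reachable from `source` in at most m directed steps."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == m:
            continue  # do not expand beyond m steps
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return len(seen) - 1  # exclude the source itself


def grc(nodes, edges, m):
    """Global reaching centrality based on the normalised m-reach (assumed definition)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    n = len(nodes)
    reach = [m_reach(adj, v, m) / (n - 1) for v in nodes]
    top = max(reach)
    return sum(top - r for r in reach) / (n - 1)


def randomised_edges(nodes, n_edges, rng):
    """Assumed null model: same number of links, endpoints drawn uniformly at random."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(nodes, 2)  # two distinct nodes, so no self-loops
        edges.add((u, v))
    return list(edges)


def grc_z_score(nodes, edges, m, n_random=200, seed=0):
    """Observed GRC and its distance from the null mean in units of sigma."""
    rng = random.Random(seed)
    observed = grc(nodes, edges, m)
    null = [grc(nodes, randomised_edges(nodes, len(edges), rng), m)
            for _ in range(n_random)]
    sigma = pstdev(null)
    z = (observed - mean(null)) / sigma if sigma > 0 else float("nan")
    return observed, z


if __name__ == "__main__":
    # Toy example: a directed star, the most hierarchical arrangement of 5 links.
    toy_nodes = list(range(6))
    toy_edges = [(0, i) for i in range(1, 6)]
    print(grc_z_score(toy_nodes, toy_edges, m=1))
```

In this reading, a z-value between 1 and 2 would correspond to the "above average, but not an outlier" situation reported for Horvath's clock relative to networks of random CpGs.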

- All my concerns have been sufficiently answered

Reviewer #2: In the new revision, the authors attempted to confirm the stability of the hierarchy of ageing-related networks by considering the CpG sites in the Skin-Blood clock and Hannum’s clock. However, these analyses are still performed using only one Human Methylation 450K dataset (GSE40279), even though the reviewer already pointed this out last time. This strategy does not support the view that the finding for the CpG sites of concern is general. The authors explained that “Analysing 850k (new EPIC array) or even 27k CpGs (older methylation array) is unfortunately not feasible computationally, due to the combinatorial explosion of the all-to-all nature of our analysis” and that “Scaling up the network size in our analysis to the level of the whole 450k array would take more than 150 years; thus, it is truly not feasible”. This, however, means that analysing the CpG sites of an ‘Epigenetic Clock’ (e.g., Horvath’s clock) in another independent methylation dataset (e.g., GSE55763, which contains over 2600 samples) is theoretically possible and worthwhile, even if I have perhaps underestimated the difficulty. Overall, I have to say the current revision is still inconclusive.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: Pollner et al_2ndRound.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009327.r007

Decision Letter 3

Ilya Ioshikhes

5 Aug 2021

Dear Dr. Pollner,

We are pleased to inform you that your manuscript 'Hierarchy and control of ageing-related methylation networks' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Ilya Ioshikhes

Deputy Editor

PLOS Computational Biology

Douglas Lauffenburger

Deputy Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #2: In the revised manuscript, the authors have explored the findings in an independent methylation dataset. They described the differences in the results between the two datasets and discussed the possible reasons. Overall, the study has some scientific significance, although the evidence is still not very strong. In this regard, the authors should tone down some of their statements. I have no further comments.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009327.r008

Acceptance letter

Ilya Ioshikhes

3 Sep 2021

PCOMPBIOL-D-20-00368R3

Hierarchy and control of ageing-related methylation networks

Dear Dr Pollner,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Andrea Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text

    Figure A: “Expected changes in the estimated age”
    Figure B: “Hierarchy of the methylation network based on the Skin-Blood clock”
    Figure C: “Hierarchy of the methylation network based on the Hannum’s clock”
    Figure D: “Control centrality and reach in the network defined based on the Skin-Blood clock”
    Figure E: “Control centrality and reach in the network defined based on the Hannum’s clock”
    Figure F: Scatter plot of expected change in the estimated age as a function of the m-reach and the control centrality for the network based on the Skin-Blood clock.
    Figure G: Scatter plot of expected change in the estimated age as a function of the m-reach and the control centrality for the network based on the Hannum’s clock.
    Figure H: Hierarchy of the methylation networks based on randomly chosen CpGs.
    Figure I: Control centrality and reach in methylation networks based on randomly chosen CpGs.
    Figure J: Hierarchy of the methylation networks based on the data set studied by Lehne et al.
    Figure K: Control centrality and reach in methylation networks based on the data set studied by Lehne et al.
    Figure L: Scatter plot of the expected change in the estimated age as a function of the m-reach and the control centrality for the network based on the data studied by Lehne et al.
    Figure M: Correlation between the m-reach obtained in the networks based on the Hannum et al. and on the Lehne et al. dataset.
    Figure N: Distribution of the top 10% of the CpGs according to the Lehne et al. hierarchy in the hierarchy based on the Hannum et al. network.
    Figure O: Distribution of the top 10% of the CpGs according to the Hannum et al. hierarchy in the hierarchy based on the Lehne et al. network.
    Figure P: Age distribution of the patients in the studied data sets.

    (PDF)

    Attachment

    Submitted filename: Palla et al.pdf

    Attachment

    Submitted filename: AnswersToReferees.pdf

    Attachment

    Submitted filename: pall_rereview_v1.pdf

    Attachment

    Submitted filename: rebuttal_letter_2.pdf

    Attachment

    Submitted filename: Pollner et al_2ndRound.pdf

    Attachment

    Submitted filename: rebuttal_letter_3.pdf

    Data Availability Statement

    The complete methylation profiles and the metadata we use are publicly available in NCBI’s Gene Expression Omnibus (GEO) under accession number GSE40279.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
