Abstract
Transcriptional regulatory networks (TRNs) are enriched for certain “motifs.” Motif usage is commonly interpreted in adaptationist terms, i.e., that the optimal motif evolves. But certain motifs can also evolve more easily than others. Here, we computationally evolved TRNs to produce a pulse of an effector protein. Two well-known motifs, type 1 incoherent feed-forward loops (I1FFLs) and negative feedback loops (NFBLs), evolved as the primary solutions. The relative rates at which these two motifs evolve depend on selection conditions, but under all conditions, either motif achieves similar performance. I1FFLs generally evolve more often than NFBLs. Selection for a tall pulse favors NFBLs, while selection for a fast response favors I1FFLs. I1FFLs are more evolutionarily accessible early on, before the effector protein evolves high expression; when NFBLs subsequently evolve, they tend to do so from a conjugated I1FFL-NFBL genotype. In the empirical S. cerevisiae TRN, output genes of NFBLs had higher expression levels than those of I1FFLs. These results suggest that evolutionary accessibility, and not relative functionality, shapes which motifs evolve in TRNs, and does so as a function of the expression levels of particular genes.
Keywords: pulse generation, adaptationism, mutation-biased adaptation, gene regulatory network, transcriptional regulation
Introduction
The topology of transcriptional regulatory networks (TRNs) is enriched for certain motifs (Lee et al. 2002; Milo et al. 2002; Shen-Orr et al. 2002; Mangan and Alon 2003). Many argue that a particular motif is seen in a particular setting because that motif’s dynamical behavior is optimal for carrying out particular beneficial functions (Alon 2007). However, adaptationist claims about TRN organization have been accused of being just-so stories, with adaptationist hypotheses about optimality still in need of testing against an appropriate null model of network evolution (Wagner 2003; Artzy-Randrup et al. 2004; Mazurie et al. 2005; Kuo et al. 2006; Solé and Valverde 2006; Lynch 2007; Knabe et al. 2008; Jenkins and Stekel 2010; Tsuda and Kawata 2010; Widder et al. 2012; Ruths and Nakhleh 2013; Payne and Wagner 2015). Note that such null models do not describe neutral evolution, but rather evolution under more generic selection that does not include selection for the specific, hypothesized motif “function.” We recently generated such a null model and used it to show that coherent type 1 feed-forward loops can, as hypothesized, evolve in response to selection specifically to filter out short spurious signals, by combining a fast signaling pathway and a slow signaling pathway with an AND gate (Xiong et al. 2019). Testing the hypothesis in this way was not merely confirmatory, but generated other insights about the existence and nature of alternative solutions, especially when slow transcriptional regulation is combined with faster response mechanisms such as post-translational regulation (Xiong et al. 2019). Other network motifs and properties have not yet received similar treatment.
At least three different motifs (Figure 1A) are all capable of producing a sharp pulse of expression in response to an increase in input signal (Figure 1B) (Basu et al. 2004; Camas et al. 2006; Çağatay et al. 2009). All depend on an effector first being rapidly activated by a signal, and later, at a slower timescale, being repressed by it. These three motifs are simple auto-repression (AR), negative feedback loops (NFBLs), and incoherent type 1 feed-forward loops (I1FFLs) (Figure 1A). The three motifs are topologically and functionally similar to each other, differing in whether the slow repression is effected via negative autoregulation by the effector E of itself, via negative feedback regulation of E using a specialized repressor R, or via a separate negative control pathway from the input to the repressor and then the effector.
Figure 1.
Three motifs (I1FFL, NFBL, and AR) all produce a pulse of effector E expression in response to increased signal S. (A) In all three cases, rapid and direct activation of the effector by the signal is eventually countered by a slower path of repression. The three motifs differ topologically in whether repression is by the effector itself (AR), by a specialized repressor (R) that is activated by the signal (I1FFL), or by a specialized repressor that is activated by the effector (NFBL). Regular arrow tips represent activation and ⊣ represents repression. (B) With appropriate parameters, and with a delay between transcriptional activation and protein production in the case of AR, all three motifs can induce a pulse, as the initial increase in expression as S activates E is eventually tamped down by a path of repression.
The high prevalence of I1FFLs and NFBLs in TRNs has been interpreted to occur because these two motifs are adaptations for pulse generation and closely related functions (Shoval and Alon 2010; Shoval et al. 2010; Ferrell 2016; Shi et al. 2017). Both I1FFLs and NFBLs allow the steady-state level of the effector, before and after the pulse, to be independent of the signal strength, a property known as chemical adaptation (Ferrell 2016; Shi et al. 2017). We note that AR is normally hypothesized to perform functions other than pulse generation (Wall et al. 2004; Alon 2007), but theoretical analysis and experiments show that AR can generate pulses (Rosenfeld et al. 2002; Camas et al. 2006; Amit et al. 2007). We, therefore, include AR for the completeness of the study, while focusing on I1FFLs and NFBLs.
Which of the motifs is likely to evolve is often explained by demands for specific properties of the pulse. For example, although both I1FFLs and NFBLs allow the amplitude of the pulse to be a function of the fold-change of the signal’s strength (Shoval et al. 2010), they do so with different functional forms (Adler et al. 2014). I1FFLs and NFBLs can also differ in their ability to filter noise in the signal (Buzi and Khammash 2016).
Alternative causes might be responsible for differences in the occurrence of the three motifs. An important consideration is that fitness landscapes tend to have many alternative local endpoints, which might take the form either of peaks (Whitlock et al. 1995) or of plateaus (van Nimwegen and Crutchfield 2000). Factors such as expression levels can change the relative accessibility of different local evolutionary endpoints, in ways that are independent of differences in their heights. Evolutionary accessibility encompasses both which mutations occur in a single step and which hill-climbing multi-step paths are possible. The emphasis of the evolutionary accessibility hypothesis on process is in contrast to optimality explanations that consider only the final evolutionary outcome. Whether evolutionary accessibility is a plausible cause of enrichment for particular motifs is a question that in silico evolution is ideally set up to explore. We note that I1FFLs and NFBLs differ by whether it is the signal or the effector that regulates the repressor (Figure 1A). Intuitively, the relative ease of evolving these two possible regulatory interactions with the repressor could depend on the relative expression levels of the candidate regulators.
Here, we simulate TRN evolution under selection to produce a pulse, and test how subtle differences between scenarios might alter both the maximal performance and the evolutionary accessibility of motifs. In particular, a highly expressed effector is more able to stimulate its repressor, and we therefore predict that this scenario should be more likely to evolve regulation via an NFBL and correspondingly less likely to evolve an I1FFL. Our simulations reject adaptationist explanations—I1FFLs and NFBLs achieve similar fitness—and confirm that NFBLs are evolutionarily more accessible than I1FFLs under this scenario, but that I1FFLs are more accessible under other scenarios where a highly expressed effector is not required. Data from real-world yeast TRNs agree with model predictions, showing that the effectors of NFBLs generally have higher expression levels than those of I1FFLs.
Materials and methods
Transcriptional regulation
Transcription factors (TFs) bind to a given TF binding site (TFBS) according to a formula based on the biophysics of the matching of the cis-regulatory sequence to the TF’s consensus binding sequence (see Supplementary materials/TF binding for details). Briefly, in isolation from all other TFs and TFBSs, the probability Pb that a TFBS is occupied is
$$P_b = \frac{C}{C + \hat{K}_d}$$
where C is the total concentration of the TF and $\hat{K}_d$ is a version of the binding affinity Kd of the TF, rescaled to account for the fact that focal TFBSs must compete for the TF with many nonspecific binding sites throughout the genome. From probabilities of this form, we calculate the probability that exactly A activators and R repressors are bound to a given cis-regulatory sequence, given the possibility of physical overlap among TFBSs (see Supplementary materials/TF occupancy). From this, we derive four probabilities that we assume determine gene expression: (1) the probability PA of having at least one activator bound to a gene, (2) the probability PR of having at least one repressor bound, (3) the probability PA_no_R of having no repressors and at least one activator bound, and (4) the probability PnotA_no_R of having no TFs bound.
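As a rough illustration of how the four probabilities relate to one another, the sketch below assumes the standard single-site occupancy form C / (C + K), where K stands in for the rescaled binding affinity, and treats binding sites as independent. It therefore ignores the physical overlap among TFBSs that the full model accounts for. The function names and the numerical values are hypothetical.

```python
from math import prod

def occupancy(conc, k_rescaled):
    """Probability that an isolated site is bound (assumed form C / (C + K))."""
    return conc / (conc + k_rescaled)

def at_least_one_bound(site_probs):
    """P(at least one site bound), assuming sites bind independently."""
    return 1.0 - prod(1.0 - p for p in site_probs)

# Hypothetical example: two activator sites and one repressor site.
activator_sites = [occupancy(1000, 500), occupancy(1000, 4000)]
repressor_sites = [occupancy(200, 800)]

p_a = at_least_one_bound(activator_sites)      # P_A
p_r = at_least_one_bound(repressor_sites)      # P_R
p_a_no_r = p_a * (1.0 - p_r)                   # P_A_no_R (independence assumed)
p_nota_no_r = (1.0 - p_a) * (1.0 - p_r)        # P_notA_no_R (independence assumed)
```

Under the independence assumption, P_A_no_R and P_notA_no_R sum to 1 - P_R, which is a convenient sanity check.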
We model transcriptional initiation as a two-step process whose rates depend on TF binding, and parameterize those rates with reference to nucleosome disassembly followed by transcription machinery assembly (Mao et al. 2010; Brown et al. 2013). We model a repressed state of nucleosome presence, an intermediate state of a nucleosome-free transcription start site that lacks transcription machinery, and an active state from which transcription takes place. The transition rates between states (except for the rate rAct_to_Int from the active state to the intermediate state) are linear functions of PA, PR, PA_no_R, and/or PnotA_no_R, and are parameterized from empirical estimates of the maximum and minimum rates (see Supplementary materials/Transcriptional regulation for details). We let rAct_to_Int evolve (see Model overview); the distributions of its initial value and of the initial value of Kd, which is also evolvable, are summarized in Supplementary Table S1. The maximum and minimum values of the other transition rates are constant, and are summarized in Supplementary Table S2.
Fitness
Our simulations of gene expression begin with a burn-in phase of random length, to ensure that TRNs respond to a change in the signal, rather than evolve a timer mechanism. The level of signal is low during stage one and burn-in, which last for 120 + x minutes, where x is a random number drawn from an exponential distribution truncated at 30, and with an un-truncated average of 10. Fitness is assessed only on the basis of stage two, which lasts for 240 min, plus the last 5 min of stage one. We sample the effector concentration at one-minute intervals. The highest effector concentration during stage two is denoted p.
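The random duration of the low-signal period can be sketched as follows. This is a minimal illustration, assuming that "truncated at 30" means values above 30 are redrawn (capping them at 30 would be another reading); the function and parameter names are hypothetical.

```python
import random

def sample_stage_one_minutes(rng, base=120.0, mean=10.0, cap=30.0):
    """Return base + x, with x exponential (un-truncated mean 10) and x <= 30.

    Assumption: truncation is implemented by redrawing values above the cap.
    """
    while True:
        x = rng.expovariate(1.0 / mean)  # rate parameter is 1 / mean
        if x <= cap:
            return base + x

rng = random.Random(0)
durations = [sample_stage_one_minutes(rng) for _ in range(1000)]
```

Every sampled duration lies between 120 and 150 min, so a TRN cannot achieve a well-timed pulse simply by counting a fixed time from the start of the simulation.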
The positive fitness of a TRN has four components: the peak level of effector, a low effector expression starting point, the speed with which effector expression rises, and the speed with which it falls. Together, these four components capture the core attributes of what it means to be a pulse, and in combination, they apply consistent selective pressure first to generate any pulse at all and later to produce a superior pulse. All four positive fitness components are based on the expression level of the effector. For the purpose of scoring effector concentration and hence fitness, we use the total protein level of all effector proteins, including those that have diverged, following duplication, to have different regulatory activities. At each point in the simulation, gene expression also incurs a cost that is proportional to the total rate of translation of all genes (see Supplementary materials/Cost of gene expression). The estimated fitness of a TRN from one gene expression simulation is the arithmetic mean of the four positive fitness components minus the cumulative cost of gene expression throughout the last 360 min of a simulation of gene expression. Fitness parameters are summarized in Supplementary Table S3.
Fitness component one scores the match to a pre-defined peak effector expression level:
| (1) |
We set the optimal peak expression level popt to 5,000 molecules per cell, 10,000 molecules per cell, or 20,000 molecules per cell, corresponding to selection for a low, medium, or high peak level, respectively. Under the assumption that the effector is a metabolism-related protein, we chose the number 10,000 based on the average number of PDC1 protein molecules per yeast cell (Ghaemmaghami et al. 2003). The effector also acts as a TF; this kind of dual functionality is not uncommon in yeast (Gancedo and Flores 2008). We set r so that when p = 0.5popt, f1 = 0.5.
We set fitness component two to reward low effector expression at the end of stage one:
| (2) |
where s1 is the arithmetic mean of the effector level across the last 5 min of stage one. This is chosen as a simple piecewise-linear function, which plateaus at a maximum of 1 for values of s1 below 10% of the peak level p.
We set fitness component three to reward rapid turn-on of effector:
| (3) |
where thalf_peak is the latest time in stage 2 at which the effector level is at 0.5(s1 + p) before the effector hits its peak, and tsaturate sets a time for which making the effector response still more rapid no longer increases fitness. We set tsaturate to 60 min by default, and also explore the outcomes of varying this parameter in the Supplementary materials.
To select for the downward slope of a pulse, fitness component four rewards falls in the effector to no more than 80% of the peak level by the end of stage two:
| (4) |
where s4 is the arithmetic mean of the effector level across the last 5 min of stage two. Again, we chose a piecewise-linear function. We chose the relatively high value of 80% in order to select for an inclusive category of pulses. We consider pulses that eventually return all the way down to the level that prevailed before the signal (i.e., biochemical adaptation) to be a special case.
In some simulations of gene expression, we observed a peak expression level that is smaller than or equal to the effector’s expression level observed right before the signal increases, or is even 0. In neither of these cases do fitness components Equation (2) and/or Equation (4) provide a useful selection gradient toward the evolution of a pulse. For simplicity, we set the fitness of these two cases to zero.
Evolution
We calculate the arithmetic mean fitness fresident of the current (“resident”) TRN across 1,000 replicate simulations of gene expression, and the arithmetic mean fitness fmutant of the mutant across 200 replicate simulations of gene expression. If fmutant satisfies
| (5) |
we replace the resident TRN with the mutant, and re-calculate the fitness of the new resident TRN to higher resolution using an additional 800 replicate simulations of gene expression.
Because gene expression is stochastic in our simulations, estimated fitness varies among replicates, and is subject to error even after averaging across many replicates. This means that our algorithm allows neutral or slightly deleterious mutations to fix. This is sometimes even explicit; the updated fitness that includes 800 additional simulations of the successful mutant can be lower than the fitness of the TRN it replaced. We discuss the details and rationale of our evolutionary simulation in Supplementary materials/Evolutionary simulation.
Counting network motifs
We count I1FFLs and NFBLs formed by the signal, an effector gene, and a repressor gene that is different from the effector gene, with interactions between them as shown in Figure 1A. We count ARs formed by the signal and an effector gene. We score gene A as potentially regulating gene B, if there is a TFBS for A in the cis-regulatory sequence of B. We allow genes in I1FFLs and NFBLs to self-regulate. An overlapping I1FFL in which the effector and the auxiliary TF repress each other is counted not as two I1FFLs, but rather as a different type of network motif. Overlapping I1FFLs evolve rarely.
Given that two mismatches to an 8-bp consensus sequence still yield above-background binding (Supplementary materials/TF binding), a random 8-bp sequence qualifies as a weak affinity TFBS with probability $\binom{8}{2} \times 0.25^{6} \times 0.75^{2} \approx 0.0038$. Each cis-regulatory sequence contains around 300 8-bp potential binding sites (including both orientations of a 150 bp cis-regulatory sequence), among which 1.14 will on average qualify by chance as a two-mismatch TFBS for a given TF. These two-mismatch TFBSs, occurring so often by chance, usually have low affinity, and therefore might have little regulatory effect. It is for this reason that we refer to them as potential regulatory interactions; our previous work has shown that motifs can appear more clearly when weak affinity TFBSs with little regulatory effect are excluded (Xiong et al. 2019). Four types of spurious two-mismatch TFBSs can create apparent but nonfunctional I1FFLs and NFBLs: S → TF, E → TF, TF → E, and E → E (Supplementary Figure S1), where “TF” refers here to a TF that is not an effector. Because it is computationally expensive to test whether each individual two-mismatch TFBS is spurious, we instead test all two-mismatch TFBSs of a given type at once, for each of the four types listed above. Specifically, we recalculate the fitness of the TRN while ignoring all two-mismatch TFBSs of that type, across 1,000 gene expression simulations, and deem the entire set of TFBSs spurious if the recalculated fitness is at least 99% of the original fitness (see Figure 2 legend for variations on this criterion). We ignore spurious connections while scoring network motifs.
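The chance-occurrence estimate above can be checked with a short calculation. The sketch assumes equal nucleotide frequencies and counts sequences with exactly two mismatches to the 8-bp consensus, an assumption that reproduces the quoted figure of roughly 1.14 sites per 300 candidates. The function name is hypothetical.

```python
from math import comb

def prob_exact_mismatches(length=8, mismatches=2):
    """P(a random sequence has exactly `mismatches` mismatches to a consensus),
    assuming each of the four nucleotides is equally likely at every position."""
    matches = length - mismatches
    return comb(length, mismatches) * 0.25 ** matches * 0.75 ** mismatches

p_two = prob_exact_mismatches()   # about 0.0038
expected_sites = 300 * p_two      # expected two-mismatch sites among ~300 candidates
```

Unrounded, the expectation is about 1.15; the value of 1.14 follows if the per-site probability is first rounded to 0.0038.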
Figure 2.
Selection for high peak effector expression levels promotes NFBLs. (A) TRNs are evolved under selection to generate pulses in response to an input signal. Under all three selection conditions, the input signal starts at 100 molecules per cell and increases to 1,000 molecules per cell to trigger a pulse. Three versions of the evolutionary simulations select for three different optimal peak levels of the effector: low (Popt = 5,000 molecules per cell), medium (Popt = 10,000 molecules per cell), and high (Popt = 20,000 molecules per cell). For each high-fitness genotype (Supplementary Figure S3), we calculate the proportion of evolutionary steps that contain at least one network motif of the specified type among the last 10,000 evolutionary steps (out of a total of 50,000 evolutionary steps). When scoring for motifs, nonfunctional spurious TFBSs were excluded (see Materials and Methods for details, and Supplementary Figure S5 for results using different TFBS exclusion criteria). R can be auto-regulating (not shown in circuit diagram). On rare occasions, AR co-occurred with I1FFLs or overlapping I1FFLs (labeled here I1FFL + I1FFL) (Supplementary Figure S6), and these few cases were included in the scoring of I1FFL and overlapping I1FFL frequencies. (B) Preventing either NFBLs or I1FFLs from evolving does not lower the final fitness within high-fitness evolutionary simulations. Instead, genotypes obtained equally high fitness by evolving the other common motif (Supplementary Figure S7). To prevent NFBLs from evolving, we remove the TF binding activity of effectors; this also prevents the evolution of the AR motif. To prevent I1FFLs from evolving, we ignore TFBSs for the signal in the cis-regulatory sequence of any repressors. Because this might have unintended consequences for mutations that convert repressors to/from activators, we set to zero the rate of mutations that effect this conversion. Data are shown as mean ± SE over replicates.
Mutations that create and destroy motifs
For each evolutionary replicate, we identified the evolutionary steps at which the number of instances of a given motif changes to or from zero, which we call “motif-destroying-mutations” and “motif-creating-mutations”, respectively. We removed spurious TFBSs before scoring motifs, as described in the last section, with one modification: to save computation related to mutations that were trialed and then rejected by selection, we used only 200 gene expression simulations to determine fitness without the TFBS in question, with a threshold of 98% of original fitness. Mutations that change the expression levels of a gene and/or the binding affinity of a TF can potentially change whether a two-mismatch TFBS is “spurious” in terms of fitness effects, effectively rewiring the TRNs even if they do not create or destroy core TFBSs of the motif in question.
Expression levels of TFs in yeast TRN
We used YeastMine (Balakrishnan et al. 2012) to retrieve TFs in S. cerevisiae and searched Yeastract (Teixeira et al. 2006) for interactions among these TFs. We compiled 203 yeast TFs and identified 46 NFBLs, 30 I1FFLs, and 7 I1FFL-NFBL conjugates from them. This includes only the 30 I1FFLs whose effectors are activators; another 33 I1FFLs with repressors as effectors were excluded, for better comparison with NFBLs, in which the effectors must be activators. See Supplementary materials/Searching for network motifs in yeast TRNs for more details.
To assess peak expression level, we used the data of Gasch et al. (2000), who applied multiple stimuli to yeast and measured the fold-change in RNA expression of all genes relative to pre-stimulus expression levels. We analyzed data on exposure to 10 stimuli: amino acid starvation, nitrogen depletion, sorbitol osmotic shock, temperature shift from 25°C to 37°C, diamide, hydrogen peroxide, menadione disulfate, diauxic shift, dithiothreitol, and transition to a stationary phase of growth. Following each stimulus, fold-change was recorded over several time points. We consider an effector gene to exhibit pulse-like expression if the maximum fold-increase in expression occurs prior to the last time point and has a larger magnitude than that of the maximum fold-decrease in expression; we excluded gene-stimulus combinations that do not meet this criterion from further analysis. For input and repressor genes, we did not require a pulse, but merely that the stimulus led to increased expression (measured as average fold-change across time points), and that the maximum fold-increase was larger than the maximum fold-decrease. We excluded repressor-stimulus and input-stimulus combinations that failed to meet both criteria.
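The pulse criterion for effector genes can be expressed compactly. The sketch below assumes that expression is supplied as log2 fold-changes relative to the pre-stimulus level, listed in time order, so that increases are positive and decreases are negative; that representation and the function name are assumptions made for illustration.

```python
def is_pulse_like(log2_fold_changes):
    """Apply the effector pulse criterion to one gene-stimulus time series.

    Assumes values are log2 fold-changes vs. pre-stimulus, in time order.
    """
    max_increase = max(0.0, max(log2_fold_changes))
    max_decrease = max(0.0, -min(log2_fold_changes))
    if max_increase == 0.0:
        return False  # expression never rises above the pre-stimulus level
    peak_index = log2_fold_changes.index(max(log2_fold_changes))
    peaks_before_last = peak_index < len(log2_fold_changes) - 1
    return peaks_before_last and max_increase > max_decrease
```

For example, a series that rises and then falls back, such as `[0.5, 2.0, 1.0, 0.2]`, passes, whereas a series that is still rising at the last time point does not.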
We note that the same gene can occupy the same position within multiple motifs. For example, GAT1 is the effector in 18 NFBLs and one I1FFL, suggesting that this gene might be particularly well-suited for function within NFBLs. To compare gene expression between I1FFLs and NFBLs, we weighted the fold-change in expression (or similarly, the time to reach the peak fold-change) of a given gene by the frequency with which that gene appears in the motif of interest, e.g., weights of 18/19 and 1/19 for GAT1’s appearance as an effector in NFBLs and I1FFLs, respectively. For I1FFL-NFBL conjugates, we assign half-weights to both I1FFLs and NFBLs.
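The weighting scheme can be sketched as below. This is one possible reading, in which each I1FFL-NFBL conjugate contributes half a count to each motif type before the per-gene weights are normalized; the function names are hypothetical.

```python
def motif_weights(n_nfbl, n_i1ffl, n_conjugate=0):
    """Return (NFBL weight, I1FFL weight) for one gene in one motif position.

    Assumption: each conjugate adds half a count to each motif type.
    """
    nfbl = n_nfbl + 0.5 * n_conjugate
    i1ffl = n_i1ffl + 0.5 * n_conjugate
    total = nfbl + i1ffl
    return nfbl / total, i1ffl / total

def weighted_mean(values, weights):
    """Weighted average of per-gene values (e.g., peak fold-change)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# GAT1 is the effector in 18 NFBLs and one I1FFL.
gat1_nfbl_weight, gat1_i1ffl_weight = motif_weights(18, 1)
```

With these counts, GAT1 receives weights of 18/19 and 1/19, matching the example in the text.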
We complemented this peak-RNA-expression analysis with an analysis of the average protein levels (i.e., not peak levels), taken from PaxDB (Wang et al. 2015). One analysis is restricted to a March 2013 data set originally compiled by PeptideAtlas (Desiere et al. 2006) to show the abundances of peptides in S. cerevisiae pooled across 90 experiments, which include normal growth conditions and perturbed growth conditions, e.g., cell cycle arrest and metabolic perturbation. We also used another data set “GPM, Aug, 2014” from PaxDB, which has more genes than “PeptideAtlas, March 2013” (5289 vs 4828). While we could not find a detailed description for this GPM (the Global Proteome Machine) (Craig et al. 2004) dataset, GPM generally includes data from PeptideAtlas (Craig et al. 2004), meaning that the GPM dataset similarly includes both normal growth conditions and perturbed growth conditions. Weighted average protein levels were calculated with the same weighting scheme as for fold-change of gene expression.
Statistical analysis
Student’s t-test was performed using MATLAB (R2020b). The Jonckheere-Terpstra test was performed with the R (4.1.0) package DescTools (0.99.42), with the argument “alternative” set to “increasing” and “nperm” set to 10,000. Reported P-values are uncorrected.
Results
Model overview
We used a previously described computational model to simulate the expression of genes in a TRN, parameterized by available Saccharomyces cerevisiae data (Xiong et al. 2019). Supplementary Figure S2 summarizes the model, the model variables (which evolve to produce the TRN output) are summarized in Supplementary Table S1, and the parameters are summarized in Supplementary Tables S2–S4. The TRN evolves under a realistic mutational spectrum including de novo appearance of weak-affinity TFBSs, and frequent gene duplication and deletion. Briefly, each gene in the TRN encodes either an activating or repressing TF, and each is regulated by a 150-bp cis-regulatory sequence accessible to TF binding. Each TF recognizes an 8-bp consensus binding sequence with a characteristic binding affinity. Binding sites with up to two mismatches are still recognized, with each mismatch reducing binding affinity according to a thermodynamic model (Supplementary materials/TF Binding). TFs can bind in either orientation. Each TF that binds to DNA occupies three extra base pairs upstream and downstream of the consensus sequence, making a total of 14 bp inaccessible to other TFs. The concentrations of TFs are used to calculate the probabilities that each cis-regulatory region is bound by a given number of activators and repressors (see Materials and Methods).
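To make the binding-site recognition rule concrete, the sketch below scans a cis-regulatory sequence in both orientations for windows within two mismatches of a consensus sequence. It is an illustration only: it ignores binding affinity and the 14-bp footprint, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def count_mismatches(a, b):
    return sum(x != y for x, y in zip(a, b))

def find_sites(cis_seq, consensus, max_mismatch=2):
    """Return (position, strand, n_mismatch) for every qualifying window.

    Positions on the "-" strand index into the reverse-complemented sequence.
    """
    k = len(consensus)
    hits = []
    for strand, seq in (("+", cis_seq), ("-", reverse_complement(cis_seq))):
        for i in range(len(seq) - k + 1):
            m = count_mismatches(seq[i:i + k], consensus)
            if m <= max_mismatch:
                hits.append((i, strand, m))
    return hits
```

For a 150-bp sequence and an 8-bp consensus, this examines 143 windows per orientation.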
To simulate gene expression, we assume that each gene transitions between an active chromatin state that can initiate transcription, an intermediate primed state capable of becoming either activated or repressed, and a repressed chromatin state. Most transition rates depend on whether activators and/or repressors are bound (see Materials and Methods), with the fastest transition rate to the active state occurring when at least one activator and no repressors are bound. The transcription initiation rates of mRNAs from active genes are gene-specific, and so are the degradation rates. Note that the above rates (including the transition rates between the states of genes) are expectations; exactly when a reaction (e.g., one of gene A’s mRNAs is degraded) happens is simulated stochastically using a Gillespie algorithm (Gillespie 1977). Conceptually, the algorithm allows one event to happen at a time, with the cellular state remaining unchanged between events. The waiting time between two events has an exponential distribution, with a mean equal to the reciprocal of the sum of all reaction rates. Once the time of an event is sampled, the algorithm randomly picks an event (e.g., degrading gene A’s mRNA) with probability proportional to that reaction’s rate, and changes the cellular state according to the event (e.g., there is one fewer mRNA of gene A in the cell). See Supplementary materials for details.
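The Gillespie procedure described above can be illustrated with a deliberately simplified system: a single gene that is always active, producing mRNA at a constant rate, with first-order mRNA degradation. Chromatin state transitions, regulation, and protein dynamics are omitted, and the function name and rate values are hypothetical.

```python
import random

def gillespie_mrna(rng, k_init, k_deg, t_end, n0=0):
    """Simulate mRNA copy number under constant initiation and first-order decay."""
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while True:
        rates = [k_init, k_deg * n]      # [transcription initiation, degradation]
        total = sum(rates)
        if total == 0:
            break                        # no reaction can occur
        t += rng.expovariate(total)      # waiting time with mean 1 / total rate
        if t > t_end:
            break
        # Choose which event happens, with probability proportional to its rate.
        if rng.random() * total < rates[0]:
            n += 1                       # one more mRNA
        else:
            n -= 1                       # one fewer mRNA
        trajectory.append((t, n))
    return trajectory

trajectory = gillespie_mrna(random.Random(1), k_init=2.0, k_deg=0.1, t_end=50.0)
```

The state changes only at event times, so the trajectory is a step function between successive entries.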
Each mRNA produces protein at a gene-specific translation rate. Once transcription is initiated, we simulate a delay before mRNA can be translated at full speed. The delay accounts for the completion of both transcription and the loading of ribosomes to mRNA, and is a function of gene length (Supplementary materials/Transcriptional delay and Translational delay). Because tracking the turnover of individual protein molecules with a Gillespie algorithm is computationally expensive, we calculate the turnover of proteins with ordinary differential equations (Supplementary materials/Simulation of gene expression).
To select for pulse generation, we designate an input signal to the TRN, which binds to cis-regulatory regions like any other TF, but whose concentration is set externally rather than being regulated by other TFs in the TRN. The input signal always activates gene expression. Signal concentration is low and constant during a burn-in phase, where genes are initialized with a repressed chromatin state, and begin with zero nonsignal mRNA and protein. Then in stage 2, the signal instantly switches to a high level, and selection is applied for a TF designated to be the “effector” to exhibit pulse-like expression. High fitness depends on having low effector expression at the end of stage 1, matching a pre-defined peak effector concentration during stage 2, rapidly increasing effector level after stage 2 begins, and having a low effector level at end of stage 2. Details of the signal and fitness calculation are given in the Materials and Methods.
We initialize an evolutionary simulation with a randomly generated genotype of 3 activator genes, 3 repressor genes, and an effector gene. The effector is initialized as an activator, which makes NFBLs more accessible than ARs (although below we will explore the effects of switching this). All quantitative gene-specific variables, such as transcriptional rates and gene length, are randomly initialized according to empirically estimated distributions (see Supplementary Table S1 and Supplementary materials).
We simulate five classes of mutations. Supplementary Table S4 lists the corresponding mutation rates, and details of the parameterization are provided in the Supplementary materials. A class-one mutation is a duplication or deletion of one gene along with its cis-regulatory sequence. The maximum number of genes is capped at four effector genes plus 20 noneffector genes (excluding the signal) to limit computational cost. Once this limit is reached, no duplication mutations are allowed. In addition, once any given gene is present in four copies, none of the copies are duplicated until one is again lost by deletion. Neither the last effector gene nor the last two noneffector genes are subject to deletion. The signal is subject neither to duplication nor to deletion.
Class-two mutations are single nucleotide substitutions in the cis-regulatory sequences, which can cause TFBS turnover. Mutations change one nucleotide to one of the other three nucleotides with equal probabilities.
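A class-two mutation can be sketched as follows. The choice of a uniformly random position is an assumption made for illustration, and the function name is hypothetical; the equal probability of the three alternative nucleotides follows the description above.

```python
import random

NUCLEOTIDES = "ACGT"

def point_mutation(seq, rng):
    """Replace one nucleotide with one of the other three, chosen with equal probability.

    Assumption: the mutated position is chosen uniformly along the sequence.
    """
    pos = rng.randrange(len(seq))
    alternatives = [n for n in NUCLEOTIDES if n != seq[pos]]
    return seq[:pos] + rng.choice(alternatives) + seq[pos + 1:]

original = "ACGT" * 10
mutant = point_mutation(original, random.Random(0))
```

A single substitution can create a new weak-affinity TFBS or destroy an existing one, which is how this mutation class drives TFBS turnover.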
Class-three mutations change the quantitative values of gene-specific variables, i.e., the rate at which transcriptional bursts end, gene length, mRNA degradation rate, protein synthesis rate, protein degradation rate, and the affinity of a TF to DNA. All quantitative gene-specific variables except length are subjected to mutational bias, e.g., mutation tends to reduce the affinity of TF binding. In case this is insufficient to ensure the values of the variables never go beyond reasonable limits, we also apply hard bounds (see Supplementary materials/Mutations for details).
Class-four mutations convert transcription activators to repressors (or the reverse). This mutation does not apply to the input signal, i.e., the input signal is always an activator.
Class-five mutations change a single nucleotide preference in a TF’s consensus binding sequence. One of the other three nucleotides is chosen for the consensus binding sequence with equal probabilities.
When gene duplicates differ due only to class-three mutations, the duplicates are considered as “copies” of the same gene, encoding “protein variants.” Once a class-four or class-five mutation is applied to a gene duplicate, the duplicate becomes a new gene encoding a new protein. When scoring motifs, we require that each node be a different protein.
Evolution is simulated using the revised origin-fixation model introduced by Xiong et al. (2019). Briefly, the resident genotype experiences one mutation, chosen according to the relative rates of all possible mutations. The fitness of the original resident TRN and of the mutant TRN is calculated by simulating gene expression in response to an input signal (see Materials and Methods for details). If the estimated fitness of the mutant is sufficiently high (see Materials and Methods for details), the mutant replaces the resident genotype. Note that estimated fitnesses include stochasticity from the simulation of gene expression, which serves to introduce a form of genetic drift. If no replacement occurs, we generate a new mutant and repeat the procedure until a replacement is found. We call a replacement an evolutionary step, and end each simulation after 50,000 evolutionary steps. We use the average fitness of the last 10,000 evolutionary steps to determine whether evolution has found a good solution.
High peak expression level increases NFBLs’ evolutionary accessibility but not fitness
We evolve TRNs under selection to generate a pulse of effector expression in response to a sudden 10-fold increase in input. While any of the three network motifs can solve this challenge, a highly expressed effector is more capable of stimulating its repressor, and thus this solution should be more likely to evolve regulation via an NFBL and correspondingly less likely to evolve an I1FFL. Note that this prediction is expected both on grounds of which solution might be superior, and on grounds of which solution is easier for evolution to find.
To test this prediction, we vary the optimal peak level of the effector (see Materials and Methods for details of fitness function). In silico evolution from a random starting point is not always successful at reaching the target phenotype, so we focus on the most evolutionarily successful simulations. We do this by dividing evolutionary replicates into three categories based on final fitness (Supplementary Figure S3). See Supplementary Figure S4 for examples of the phenotypes of the high-fitness replicates.
High-fitness solutions rarely involve AR under any of the three selection conditions, while both I1FFLs and NFBLs evolve often (Figure 2A). As predicted, selection for higher effector expression increases the prevalence of NFBLs relative to I1FFLs (Figure 2A). These NFBLs were absent from medium-fitness solutions, which instead employed I1FFLs or ARs (Supplementary Figure S8A), generally achieving lower peak effector expression than in the high-fitness solutions (Supplementary Figure S8B). While this seems to suggest that NFBLs might be superior, if we prevent one type of motif from evolving, similarly high fitness genotypes can be obtained via the other motif (Figure 2B and Supplementary Figure S7). The reason we get more NFBLs and fewer I1FFLs with selection for higher peak effector expression is therefore not straightforward superiority of the former, but rather the relative ease of finding high-fitness solutions.
Computational constraints prevent us from varying all parameter values, or even all six parameter values related to the fitness function (Supplementary Table S3). However, in Supplementary Figure S9, we vary the parameter tsaturate, complementing our exploration of the parameter popt described above. Reducing tsaturate, i.e., selecting for a fast response of the effector to the signal, makes it difficult to evolve NFBLs even under selection for high peak levels, but does not make NFBLs functionally inferior to I1FFLs (Supplementary Figure S9). Thus, selection on response speed and on the peak expression levels of the effector both alter the evolutionary accessibility of the two motifs.
Early bias toward I1FFLs can shift to later NFBL evolution via I1FFL-NFBL conjugates
The combined frequency of the two motifs rises throughout the long period of evolution, rather than topological solutions being found early and becoming locked in and only incrementally improved on (Figure 3). However, the frequency of I1FFLs in particular rises prominently during the first 10,000 evolutionary steps (Figure 3), even under selection for high peak effector levels, i.e., selection that ultimately leads to an evolutionary preference for NFBLs (Figure 3C).
Figure 3.
Evolution of I1FFLs and NFBLs follows different trajectories. We score motif occurrence during different time periods along the way to the evolution of the high-fitness replicates shown in Figure 2A. See Supplementary Figure S10 for the occurrence of other motifs during evolution. As in Figure 2A, we calculated the proportion of evolutionary steps that contain at least one network motif of the specified type. Note that because some evolutionary replicates oscillate between motif presence and absence, given the potential for slightly deleterious mutations in our evolutionary algorithm, the fraction of evolutionary replicates that frequently show the motif in question is higher than the probability of presence in one evolutionary step as shown here. Data are shown as mean ± SE over replicates.
To further test this point, we made the early evolution of NFBLs less accessible by initializing the effector as a repressor. While this reduced the frequency of NFBLs even under selection for a high peak, those NFBLs that still evolved reached similar performance to I1FFLs (Supplementary Figure S11). This further supports early evolutionary accessibility as a key factor.
The relative ease of I1FFL evolution could be because more mutations create I1FFLs and/or because mutations creating I1FFLs have higher acceptance rates. To explore this further, we characterize the mutations that create I1FFLs and/or NFBLs in TRNs that do not currently contain such a motif. I1FFL-creating mutations occur at a higher rate than NFBL-creating mutations under selection for low-peak and medium-peak expression, while NFBL-creating mutations are more common under selection for high-peak expression (Table 1). The rarity of NFBL-creating mutations becomes much more pronounced when we restrict our analysis to mutations that do not also destroy or create another motif—this tendency holds even under conditions that favor NFBLs, i.e., late in evolution under selection for high-peak expression (Table 1). Greater mutational accessibility of the I1FFL motif is clearly one of the factors favoring this motif.
Table 1.
Summary of mutations that create I1FFLs and/or NFBLs
| Peak level | Mutation type | Steps 1–10,000, all mutations: trialed | Steps 1–10,000, all mutations: acceptance rate | Steps 1–10,000, nondisruptive mutations: trialed | Steps 1–10,000, nondisruptive mutations: acceptance rate | Steps 10,001–30,000, all mutations: trialed | Steps 10,001–30,000, all mutations: acceptance rate | Steps 10,001–30,000, nondisruptive mutations: trialed | Steps 10,001–30,000, nondisruptive mutations: acceptance rate |
|---|---|---|---|---|---|---|---|---|---|
| Low | I1FFL-creating | 0.049 | 0.171 | 0.0038 | 0.606 | 0.077 | 0.142 | 0.0019 | 0.550 |
| Low | NFBL-creating | 0.026 | 0.078 | 0.00074 | 0.544 | 0.024 | 0.062 | 0.00028 | 0.504 |
| Medium | I1FFL-creating | 0.072 | 0.102 | 0.0031 | 0.607 | 0.088 | 0.083 | 0.00091 | 0.432 |
| Medium | NFBL-creating | 0.043 | 0.097 | 0.00084 | 0.694 | 0.049 | 0.078 | 0.00032 | 0.507 |
| High | I1FFL-creating | 0.036 | 0.114 | 0.0017 | 0.546 | 0.039 | 0.063 | 0.00027 | 0.411 |
| High | NFBL-creating | 0.097 | 0.072 | 0.0017 | 0.642 | 0.147 | 0.069 | 0.00026 | 0.476 |
We identify the accepted and rejected mutations that increase the number of I1FFLs and/or NFBLs in a TRN to above zero (see Materials and Methods for details). Among these mutations, “nondisruptive mutations” are those that create the given motif but do not otherwise alter the numbers of I1FFL (when NFBLs are created), NFBLs (when I1FFLs are created), I1FFL-NFBL conjugates, overlapping I1FFLs, and auto-repressors. For each selection condition and evolutionary stage, we pooled the qualified mutations from all high-fitness replicates shown in Figure 2A. The total numbers of mutations of the given type were normalized by dividing by the total number of mutations trialed in resident TRNs that did not already have the motif in question. The acceptance rate shown in the table is the number of accepted mutations across all replicates divided by the number of trialed mutations across all replicates. Pseudo-replication may be a concern here; if the initial TRN tends to create one motif over the other, this might be propagated at all subsequent time points for that evolutionary replicate. However, Supplementary Table S5 shows that the initial mutational bias of a TRN can flip at a later stage of evolution.
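The two quantities reported in Table 1 can be computed as in the following minimal Python sketch. The record type `Trial` and the function `summarize` are hypothetical names introduced for illustration; the sketch only assumes that each trialed mutation is logged with whether the resident TRN already had the motif, whether the mutation creates it, and whether it was accepted.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    resident_has_motif: bool  # resident TRN already contains the motif
    creates_motif: bool       # mutation raises the motif count above zero
    accepted: bool            # mutant replaced the resident

def summarize(replicates):
    """Pool trials across replicates and return (trialed, acceptance rate).

    trialed: motif-creating mutations divided by all mutations trialed
             in residents that lacked the motif.
    acceptance rate: accepted motif-creating mutations divided by all
             trialed motif-creating mutations.
    """
    eligible = creating = accepted = 0
    for trials in replicates:
        for t in trials:
            if t.resident_has_motif:
                continue  # only residents without the motif are counted
            eligible += 1
            if t.creates_motif:
                creating += 1
                accepted += int(t.accepted)
    trialed = creating / eligible if eligible else float("nan")
    acceptance = accepted / creating if creating else float("nan")
    return trialed, acceptance
```

Because counts are pooled before dividing, replicates that trial more mutations contribute more to both ratios, which is why the pseudo-replication caveat above matters.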
The early evolution of I1FFLs is also facilitated by the higher acceptance rate of I1FFL-creating mutations relative to NFBL-creating mutations, particularly during the first 10,000 evolutionary steps (Table 1). Similarly, I1FFL-destroying mutations are accepted less often than NFBL-destroying mutations are, in this case throughout the course of evolution and regardless of target peak expression (Supplementary Table S6). Note that mutations that create one motif frequently destroy another, with NFBL-creating mutations more prone to this problem than I1FFL-creating mutations (Table 2). While some such disruptive mutations are accepted by our evolutionary algorithm (Table 2), acceptance rates are higher for nondisruptive mutations (Table 1). If we restrict our analysis to nondisruptive mutations, we see stronger mutation bias toward I1FFLs, and more similar acceptance rates for I1FFLs vs NFBLs (Table 1). In other words, a shortage of nondisruptive NFBL-creating mutations is an obstacle to the evolution of NFBLs. NFBL-creating mutations that destroy I1FFL-NFBL conjugates are both more common and more likely to be accepted than NFBL-creating mutations that destroy I1FFLs (Table 2). This suggests that I1FFL-NFBL conjugates might be an important intermediate step in the evolution of NFBLs, rather than NFBLs evolving de novo. This makes sense; after early evolution of an I1FFL provides a partial solution to the selective challenge, the evolutionary path to an NFBL does not abandon that I1FFL solution, but instead passes through a combined I1FFL-NFBL intermediate. The evolutionary path from an early partial I1FFL solution might lead either to a superior I1FFL or to an NFBL, with the potential to achieve similarly high fitness in either case.
Table 2.
Most NFBL-creating mutations also destroy other motifs
| Peak level | Mutation type | Steps 1–10,000: destroys I-N conjugates | Steps 1–10,000: accept. rate | Steps 1–10,000: destroys I or N | Steps 1–10,000: accept. rate | Steps 10,001–30,000: destroys I-N conjugates | Steps 10,001–30,000: accept. rate | Steps 10,001–30,000: destroys I or N | Steps 10,001–30,000: accept. rate |
|---|---|---|---|---|---|---|---|---|---|
| Low | I1FFL-creating | 0.461 | 0.108 | 0.101 | 0.169 | 0.529 | 0.092 | 0.106 | 0.183 |
| Low | NFBL-creating | 0.924 | 0.060 | 0.443 | 0.037 | 0.883 | 0.048 | 0.443 | 0.034 |
| Medium | I1FFL-creating | 0.643 | 0.048 | 0.100 | 0.137 | 0.647 | 0.056 | 0.132 | 0.108 |
| Medium | NFBL-creating | 0.920 | 0.082 | 0.314 | 0.051 | 0.928 | 0.070 | 0.280 | 0.053 |
| High | I1FFL-creating | 0.466 | 0.047 | 0.255 | 0.148 | 0.611 | 0.021 | 0.274 | 0.090 |
| High | NFBL-creating | 0.962 | 0.060 | 0.307 | 0.051 | 0.962 | 0.064 | 0.234 | 0.047 |
A high fraction of trialed mutations that create a given motif also destroy I1FFL-NFBL conjugates, and many also destroy NFBLs (in the case of I1FFL-creating mutations) or I1FFLs (in the case of NFBL-creating mutations). Destructive mutations are accepted at significant rates. Qualified mutations are pooled across all evolutionary replicates. See Materials and Methods for details about the identification of mutations that create and/or destroy motifs.
Indeed, I1FFL-NFBL conjugates (and NFBLs) are also often converted by mutation into simple I1FFLs. However, under selection for high peak effector expression, the acceptance rate of such mutations decreases over evolutionary time (Table 2). Peak effector expression increases during evolution (Supplementary Figure S12); this could drive increased preference for the now more highly expressed effector rather than the signal to control the repressor. In medium-fitness evolutionary replicates, high peak effector expression is not achieved, and NFBLs rarely evolve (Supplementary Figure S8). By the same logic, we hypothesize that strengthening the input signal should promote I1FFLs even under selection for a high effector peak. This is indeed the case, with promotion in particular of the evolution of the I1FFL-NFBL conjugate (Supplementary Figure S13).
Highly expressed effectors tend to be regulated by NFBLs in yeast
Next, we tested our model predictions about when I1FFLs vs NFBLs tend to evolve. We consider our findings most robust when seen both in the perfectly observed but less realistic model and in the noisy but more realistic empirical data. We identified NFBLs, I1FFLs, and I1FFL-NFBL conjugates in the TRN of S. cerevisiae, using Yeastract annotations of regulatory interactions between TFs (see Materials and Methods). Using data from Gasch et al. (2000), we identified genes that display pulse-like expression in response to an environmental stimulus, and the peak heights of the pulses (measured as the fold-change of RNA expression levels relative to the expression level before the stimulus). In agreement with our model prediction that selection for high peak levels will promote NFBLs, the effectors of NFBLs reach higher peaks than those of I1FFLs following stimulus (Figure 4A). Other model predictions were not empirically supported. While our model also predicts that selection for rapid pulse generation will promote I1FFLs over NFBLs, we do not observe significantly slower pulse generation by the NFBLs in the same experimental data (Supplementary Figure S15). The input signals of NFBLs increase their expression more in response to stimuli than do those of I1FFLs (Figure 4B), which also disagrees with our model’s prediction that a strong input signal promotes I1FFLs. We note that the 46 NFBLs in our dataset involve 26 unique genes as the input signal and 8 as the effector, while the 30 I1FFLs involve 14 signals and 9 effectors. This raises the possibility that the more diverse signal inputs of the NFBLs might contain more false-positive hits.
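To make the motif search concrete, the following Python sketch enumerates three-node motifs in a signed regulatory network. It is a simplified illustration, not the procedure of Materials and Methods. It assumes the standard topologies: in all three motifs the signal S activates the effector E and the repressor R represses E; an I1FFL has S activating R, an NFBL has E activating R, and a conjugate has both. The function name `classify_triples` and the edge encoding (+1 for activation, -1 for repression) are hypothetical.

```python
from itertools import permutations

def classify_triples(edges):
    """Find I1FFLs, NFBLs, and I1FFL-NFBL conjugates.

    edges maps (regulator, target) to +1 (activation) or -1 (repression).
    Each motif uses three distinct nodes, assigned the roles
    signal (S), effector (E), and repressor (R).
    """
    nodes = sorted({n for pair in edges for n in pair})
    found = []
    for s, e, r in permutations(nodes, 3):
        # Shared backbone: S activates E, and R represses E.
        if edges.get((s, e)) != 1 or edges.get((r, e)) != -1:
            continue
        s_to_r = edges.get((s, r)) == 1  # feed-forward arm
        e_to_r = edges.get((e, r)) == 1  # feedback arm
        if s_to_r and e_to_r:
            found.append(("I1FFL-NFBL conjugate", s, e, r))
        elif s_to_r:
            found.append(("I1FFL", s, e, r))
        elif e_to_r:
            found.append(("NFBL", s, e, r))
    return found

# Two toy networks with placeholder gene names.
i1ffl_net = {("S", "E"): 1, ("S", "R"): 1, ("R", "E"): -1}
nfbl_net = {("S", "E"): 1, ("E", "R"): 1, ("R", "E"): -1}
```

Enumerating all ordered triples is cubic in the number of TFs, which is acceptable for a sketch but would need pruning for a genome-scale network.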
Figure 4.
Effector TFs in yeast NFBLs have higher expression than those in I1FFLs. (A) Peak height of pulses was measured as the maximum fold-increase in RNA expression in response to one of 10 stimuli (see Materials and Methods for details), for the subset of genes showing pulse-like RNA expression in response to that stimulus in the data of Gasch et al. (2000). (B) Average fold-change in signal and repressor RNA expression in response to stimuli, for the subset that showed an increase (see Materials and Methods). (C) Protein levels under normal conditions were taken from the “PeptideAtlas, March, 2013” dataset provided by PaxDB (Wang et al. 2015). A weaker result was obtained using a different dataset from PaxDB that includes a larger set of gene-environment combinations (Supplementary Figure S14). For fold-change in expression, data are shown as mean ± SE over each network position across all instances of the motif. The procedure is similar for protein abundance, except that the data are first log-transformed. For each motif, we list the numbers n of unique gene-stimulus combinations where pulse-like expression is observed at a signal node (S), effector node (E), or repressor node (R). P-values come from two-tailed t-tests.
We also analyzed yeast protein expression levels from PaxDB, averaged across multiple environmental conditions rather than measured in response to stimuli (see Materials and Methods). We found that effector TFs generally have higher expression in NFBLs than in I1FFLs (Figure 4C). Note that the direction of causation is not known from the empirical data alone: when an effector already has high expression this might prompt the evolution of an NFBL, or the presence of an NFBL might facilitate the evolution of high effector expression. The theoretical work presented here provides nonexclusive proof of principle in support of the former interpretation.
Discussion
We selected for a pulse generator in an evolutionary simulation model and observed which TRN motifs emerged. In the absence of strong selection for response speed, the critical factor is peak expression level: selecting for a high peak expression level of the effector promotes NFBLs over I1FFLs, while a strong input signal promotes I1FFLs. When a fast response to the input signal is required, I1FFLs are always the dominant solution, although selecting for high expression level still raises the proportion of NFBLs. However, regardless of the specific selection condition, if one motif is prevented from evolving, the other motif can evolve to take its place, with no loss of peak fitness. This suggests that the preference between motifs is about differences in evolutionary accessibility, not about which motif is optimal for generating a pulse. One pattern that was predicted by our model is confirmed in the actual TRN of S. cerevisiae, where effector expression levels are higher in NFBLs than in I1FFLs.
Our results do not contradict previous reports that differences exist in dynamic properties between the two motifs. For example, I1FFLs but not NFBLs allow the relative increase between the peak and steady-state expression levels to change proportionally to the logarithm of the relative increase in the input signal (Adler et al. 2014), and I1FFLs and NFBLs differ in their ability to filter out noise in the input signal (Buzi and Khammash 2016). If we were to design selection conditions that focused on these subtle differences, we might observe a different result, namely that the two motifs achieve different final fitness. Our results serve as a proof-of-concept for a simpler hypothesis, namely that selection might not be so focused on such subtle differences but instead simply require pulse-like behavior, and that in this case differences in evolutionary accessibility shape the topology of TRNs during evolution.
Both mutational accessibility (i.e., how often mutations create the given motif) and selective acceptance rates (Yampolsky and Stoltzfus 2001; Stoltzfus and McCandlish 2017; Gomez et al. 2020) contribute to patterns of relative evolutionary accessibility. Usually, mutational accessibility and selective acceptance rates point in the same direction, but not always: we observed conditions under which I1FFLs are less mutationally accessible under early selection, but have a relatively high mutation acceptance rate. The higher acceptance rates for I1FFL-creating mutations do not reflect functional superiority of I1FFLs, but rather the fact that creating NFBLs frequently involves destroying other, likely functional, motifs. Avoidance of damage to existing functions has been previously noted in other discussions of the evolutionary paths taken by TRNs (Wagner 2003; Carroll 2008; Stern and Orgogozo 2009; Sorrells and Johnson 2015). The mutational accessibility of different motifs is not static, but changes over the course of an evolutionary path (Supplementary Table S5).
Besides the optimal peak expression level of the effector, other factors can also change the evolutionary accessibilities of the two motifs. Our results show that a strong input signal, initializing the effector as a repressor, or selecting for rapid response of the effector to the input signal, all act to promote the evolution of I1FFLs rather than NFBLs, without enhancing the relative performance of I1FFLs. Indeed, in our model, these three factors have a larger influence on relative evolutionary accessibilities than does the optimal peak level. One possible explanation is that these factors influence evolutionary accessibility very early in evolution. Even with selection for a high peak, other factors are sufficient to put evolution on a trajectory toward I1FFLs by the time the effector evolves sufficiently high expression levels to facilitate the evolution of NFBLs.
We find that most NFBLs evolve not from connecting previously disconnected genes (e.g., S->E->R), but rather from uncoupling I1FFL-NFBL conjugates in favor of a pure NFBL. We simulate only relatively small TRNs, due to limitations in computational power, and this might restrict the evolutionary trajectories that are capable of generating network motifs. If simulation algorithms that scaled better with TRN size were devised, it would be interesting to explore whether network motifs would evolve via different trajectories in larger TRNs. For example, the use of the same TF for multiple regulatory purposes in real-world TRNs, which of course are larger, can constrain network evolution, requiring complex trajectories to achieve a new regulatory function (Sorrells et al. 2015).
We predicted via simulations that a highly expressed effector favors the evolution of NFBLs. Strikingly, this prediction was borne out in empirical data from yeast. A highly expressed TF can more strongly regulate its target, and/or reduce the amount of noise propagated downstream (Pedraza and van Oudenaarden 2005; Jothi et al. 2009). Once a highly expressed TF gains a TFBS in the target gene, the TFBS may also be easier to retain during evolution. Many studies on TRNs have noted a systematic difference among the expression levels of genes at topologically different positions (Herrgård et al. 2003; Yu et al. 2003; Jothi et al. 2009; Gerstein et al. 2012), and that highly expressed TFs are often regulators of multiple target genes (Jothi et al. 2009; Gerstein et al. 2012). Our findings also support the idea that the observed network motifs in TRNs are partially shaped by the expression levels of TFs, while suggesting that the reasons for this might involve evolutionary accessibility rather than optimality.
Data availability
The source code for our computational model is available at https://github.com/MaselLab/network-evolution-simulator/tree/I1_FFLs. Supplementary material is available at figshare: https://doi.org/10.25386/genetics.16388250.
Acknowledgments
The authors acknowledge the generous computational resources provided by the High Performance Computation Center of the University of Arizona.
Funding
This work was supported by the University of Arizona and by a National Science Foundation grant awarded to M.G. (#1660648). M.G. also acknowledges support from the A.L. Williams Professorship fund.
Conflicts of Interest
The authors declare that there is no conflict of interest.
Literature cited
- Adler M, Mayo A, Alon U. 2014. Logarithmic and power law input-output relations in sensory systems with fold-change detection. PLoS Comput Biol. 10:e1003781.
- Alon U. 2007. Network motifs: theory and experimental approaches. Nat Rev Genet. 8:450–461.
- Amit I, Citri A, Shay T, Lu Y, Katz M, et al. 2007. A module of negative feedback regulators defines growth factor signaling. Nat Genet. 39:503–512.
- Artzy-Randrup Y, Fleishman SJ, Ben-Tal N, Stone L. 2004. Comment on "Network Motifs: Simple Building Blocks of Complex Networks" and "Superfamilies of Evolved and Designed Networks". Science. 305:1107.
- Balakrishnan R, Park J, Karra K, Hitz BC, Binkley G, et al. 2012. YeastMine—an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit. Database (Oxford). 2012:bar062.
- Basu S, Mehreja R, Thiberge S, Chen M-T, Weiss R. 2004. Spatiotemporal control of gene expression with pulse-generating networks. Proc Natl Acad Sci USA. 101:6355–6360.
- Brown CR, Mao C, Falkovskaia E, Jurica MS, Boeger H. 2013. Linking stochastic fluctuations in chromatin structure and gene expression. PLoS Biol. 11:e1001621.
- Buzi G, Khammash M. 2016. Implementation considerations, not topological differences, are the main determinants of noise suppression properties in feedback and incoherent feedforward circuits. PLoS Comput Biol. 12:e1004958.
- Çağatay T, Turcotte M, Elowitz MB, Garcia-Ojalvo J, Süel GM. 2009. Architecture-dependent noise discriminates functionally analogous differentiation circuits. Cell. 139:512–522.
- Camas FM, Blázquez J, Poyatos JF. 2006. Autogenous and nonautogenous control of response in a genetic network. Proc Natl Acad Sci USA. 103:12718–12723.
- Carroll SB. 2008. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell. 134:25–36.
- Craig R, Cortens JP, Beavis RC. 2004. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res. 3:1234–1242.
- Desiere F, Deutsch EW, King NL, Nesvizhskii AI, Mallick P, et al. 2006. The PeptideAtlas project. Nucleic Acids Res. 34:D655–D658.
- Ferrell JE. 2016. Perfect and near-perfect adaptation in cell signaling. Cell Syst. 2:62–67.
- Gancedo C, Flores C-L. 2008. Moonlighting proteins in yeasts. Microbiol Mol Biol Rev. 72:197–210.
- Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, et al. 2000. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 11:4241–4257.
- Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan K-K, et al. 2012. Architecture of the human regulatory network derived from ENCODE data. Nature. 488:91–100.
- Ghaemmaghami S, Huh W-K, Bower K, Howson RW, Belle A, et al. 2003. Global analysis of protein expression in yeast. Nature. 425:737–741.
- Gillespie DT. 1977. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 81:2340–2361.
- Gomez K, Bertram J, Masel J. 2020. Mutation bias can shape adaptation in large asexual populations experiencing clonal interference. Proc Biol Sci. 287:20201503.
- Herrgård MJ, Covert MW, Palsson BØ. 2003. Reconciling gene expression data with known genome-scale regulatory network structures. Genome Res. 13:2423–2434.
- Jenkins D, Stekel D. 2010. De novo evolution of complex, global and hierarchical gene regulatory mechanisms. J Mol Evol. 71:128–140.
- Jothi R, Balaji S, Wuster A, Grochow JA, Gsponer J, et al. 2009. Genomic analysis reveals a tight link between transcription factor dynamics and regulatory network architecture. Mol Syst Biol. 5:294.
- Knabe JF, Nehaniv CL, Schilstra MJ. 2008. Do motifs reflect evolved function?—No convergent evolution of genetic regulatory network subgraph topologies. Biosystems. 94:68–74.
- Kuo PD, Banzhaf W, Leier A. 2006. Network topology and the evolution of dynamics in an artificial genetic regulatory network model created by whole genome duplication and divergence. Biosystems. 85:177–200.
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, et al. 2002. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 298:799–804.
- Lynch M. 2007. The evolution of genetic networks by non-adaptive processes. Nat Rev Genet. 8:803–813.
- Mangan S, Alon U. 2003. Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci USA. 100:11980–11985.
- Mao C, Brown CR, Falkovskaia E, Dong S, Hrabeta-Robinson E, et al. 2010. Quantitative analysis of the transcription control mechanism. Mol Syst Biol. 6:431.
- Mazurie A, Bottani S, Vergassola M. 2005. An evolutionary and functional assessment of regulatory network motifs. Genome Biol. 6:R35.
- Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, et al. 2002. Network motifs: simple building blocks of complex networks. Science. 298:824–827.
- Payne JL, Wagner A. 2015. Function does not follow form in gene regulatory circuits. Sci Rep. 5:13015.
- Pedraza JM, van Oudenaarden A. 2005. Noise propagation in gene networks. Science. 307:1965–1969.
- Rosenfeld N, Elowitz MB, Alon U. 2002. Negative autoregulation speeds the response times of transcription networks. J Mol Biol. 323:785–793.
- Ruths T, Nakhleh L. 2013. Neutral forces acting on intragenomic variability shape the Escherichia coli regulatory network topology. Proc Natl Acad Sci USA. 110:7754–7759.
- Shen-Orr SS, Milo R, Mangan S, Alon U. 2002. Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 31:64–68.
- Shi W, Ma W, Xiong L, Zhang M, Tang C. 2017. Adaptation with transcriptional regulation. Sci Rep. 7:42648.
- Shoval O, Alon U. 2010. SnapShot: network motifs. Cell. 143:326–.e1.
- Shoval O, Goentoro L, Hart Y, Mayo A, Sontag E, et al. 2010. Fold-change detection and scalar symmetry of sensory input fields. Proc Natl Acad Sci USA. 107:15995–16000.
- Solé RV, Valverde S. 2006. Are network motifs the spandrels of cellular complexity? Trends Ecol Evol. 21:419–422.
- Sorrells TR, Booth LN, Tuch BB, Johnson AD. 2015. Intersecting transcription networks constrain gene regulatory evolution. Nature. 523:361–365.
- Sorrells TR, Johnson AD. 2015. Making sense of transcription networks. Cell. 161:714–723.
- Stern DL, Orgogozo V. 2009. Is genetic evolution predictable? Science. 323:746–751.
- Stoltzfus A, McCandlish DM. 2017. Mutational biases influence parallel adaptation. Mol Biol Evol. 34:2163–2172.
- Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, et al. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34:D446–D451.
- Tsuda ME, Kawata M. 2010. Evolution of gene regulatory networks by fluctuating selection and intrinsic constraints. PLoS Comput Biol. 6:e1000873.
- van Nimwegen E, Crutchfield JP. 2000. Metastable evolutionary dynamics: crossing fitness barriers or escaping via neutral paths? Bull Math Biol. 62:799–848.
- Wagner A. 2003. Does selection mold molecular networks? Sci STKE. 2003:pe41.
- Wall ME, Hlavacek WS, Savageau MA. 2004. Design of gene circuits: lessons from bacteria. Nat Rev Genet. 5:34–42.
- Wang M, Herrmann CJ, Simonovic M, Szklarczyk D, von Mering C. 2015. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics. 15:3163–3168.
- Whitlock MC, Phillips PC, Moore FB-G, Tonsor SJ. 1995. Multiple fitness peaks and epistasis. Annu Rev Ecol Syst. 26:601–629.
- Widder S, Solé R, Macía J. 2012. Evolvability of feed-forward loop architecture biases its abundance in transcription networks. BMC Syst Biol. 6:7.
- Xiong K, Lancaster AK, Siegal ML, Masel J. 2019. Feed-forward regulation adaptively evolves via dynamics rather than topology when there is intrinsic noise. Nat Commun. 10:2418.
- Yampolsky LY, Stoltzfus A. 2001. Bias in the introduction of variation as an orienting factor in evolution. Evol Dev. 3:73–83.
- Yu H, Luscombe NM, Qian J, Gerstein M. 2003. Genomic analysis of gene expression relationships in transcriptional regulatory networks. Trends Genet. 19:422–427.