Abstract
The evolution of life on earth has been characterized by generalized long-term increases in phenotypic complexity. Although natural selection is a plausible cause for these trends, one alternative hypothesis—generative bias—has been proposed repeatedly based on theoretical considerations. Here, we introduce a computational model of a developmental system and use it to test the hypothesis that long-term increasing trends in phenotypic complexity are caused by a generative bias towards greater complexity. We use our model to generate random organisms with different levels of phenotypic complexity and analyse the distributions of mutational effects on complexity. We show that highly complex organisms are easy to generate but there are trade-offs between different measures of complexity. We also find that only the simplest possible phenotypes show a generative bias towards higher complexity, whereas phenotypes with high complexity display a generative bias towards lower complexity. These results suggest that generative biases alone are not sufficient to explain long-term evolutionary increases in phenotypic complexity. Rather, our finding of a generative bias towards average complexity argues for a critical role of selective biases in driving increases in phenotypic complexity and in maintaining high complexity once it has evolved.
Keywords: evolution of complexity, evolution of development, cell lineages, gene networks, mutational bias, developmental constraints
1. Introduction
The history of life on earth has been marked by dramatic evolutionary increases in complexity, such as the origins of the eukaryotic cell, multicellularity and sociality (Bonner 1988; Maynard Smith & Szathmáry 1995; but see McShea 1991). Many other positive long-term trends in phenotypic complexity have been identified, such as those in the number of cell types of metazoans (Valentine et al. 1994), the heterogeneity of limb pairs in free-living aquatic arthropods (Cisne 1974) and the complexity of septal sutures in ammonoid cephalopods during the Palaeozoic (Saunders et al. 1999). Were most of these long-term increases in complexity caused by a single evolutionary mechanism? Ever since Darwin, natural selection has been viewed as the most compelling ultimate explanation for these trends, even if the precise selective pressures involved remain unclear in most cases (reviewed by McShea 1991).
An alternative explanation for the widespread occurrence of increases in phenotypic complexity is that there is a generative bias towards the production of more complex phenotypes. Here, we follow Arthur (2004) in using the term bias to convey both the positive and negative meanings of developmental constraint (Maynard Smith et al. 1985; Gould 1989). Generative biases are ‘caused by the structure, character, composition, or dynamics of the developmental system’ (Maynard Smith et al. 1985), independently of the action of natural selection (Richardson & Chipman 2003). For example, a combination of mutagenesis, mutation accumulation and selection experiments in the nematode, Caenorhabditis elegans, suggests that there is a generative bias towards the reduction of body size in that species (Azevedo et al. 2002). If generative biases towards greater phenotypic complexity are common, they can, in principle, drive trends in complexity (Yampolsky & Stoltzfus 2001; Waxman & Peck 2003), even if these biases are overwhelmed by natural selection or genetic drift in most evolutionary lineages, for most of the time.
Several theoretical arguments have been invoked in favour of the generative bias hypothesis. An important generalization from the study of complex biological systems is that, as Murray Gell-Mann famously put it, ‘surface complexity’ can arise ‘out of deep simplicity’ (Kauffman 1993; Goodwin 1994). This has led some complexity theorists to argue that natural selection has been relatively unimportant to the evolution of complexity. For example, Goodwin (1994) put forward a generative bias hypothesis, although he left the mechanistic details vague. A more explicit proposal has been advanced recently by McShea (2005): any measure of the complexity of a biological system that can be expressed as the heterogeneity of its constituent parts will tend to increase spontaneously over time. Examples of such ‘internal variance’ measures of complexity (McShea 2005) include the number of cell types of an organism (Valentine et al. 1994), the number of different types of limb pairs in an arthropod (Cisne 1974) and the number of splice variants per gene (Kim et al. 2004). Some measures of complexity, such as the number of cells (Bonner 1988) or the number of interactions between genes (Szathmáry et al. 2001), are not included in the internal variance category because they do not take the heterogeneity of parts into account. Briefly, McShea (2005) argues that the accumulation of random variation in different parts of a system causes them to become more and more differentiated, thereby leading to an increase in the complexity of the system. Analogous arguments have been formulated in terms of entropy (Brooks et al. 1989).
Are these theoretical arguments supported by any empirical evidence? One prediction of the generative bias hypothesis is that mutagenesis and mutation accumulation experiments should uncover generalized increases in phenotypic complexity. Over a century of genetic research has failed to confirm this prediction for any organism (Drake et al. 1998), although the distribution of mutational effects on a complexity metric has never actually been measured directly. Here, we introduce a computational model of a developmental system and use it to test the hypothesis that long-term increasing trends in phenotypic complexity are caused by a generative bias towards higher complexity, as proposed by McShea (2005). We begin by generating random organisms under the model and investigating the distribution of different measures of their phenotypic complexity. This allows us to evaluate the range of phenotypic complexity that can be generated in our model. The generative bias hypothesis can apply only to the developmental systems capable of producing relatively complex organisms without substantial changes in genomic complexity. We then generate organisms with low, intermediate and high phenotypic complexity, and test for the presence of negative genetic correlations between different measures of complexity. Trade-offs would restrict the ability of generative biases to cause concurrent increases in the measures of complexity involved. Finally, we measure the distributions of the effects of mutations on complexity to test the central prediction of the generative bias hypothesis that mutations tend to increase complexity, independently of the complexity of the ancestral organism.
2. Model
(a) Gene expression
Our model extends a ‘unicellular’ gene network model that has been used to study the evolution of robustness (Siegal & Bergman 2002; Azevedo et al. 2006). Organisms are modelled as collections of cells, each expressing a number (N) of interacting transcription factors. The product of each gene can regulate its own expression as well as the expression of other genes through cis-regulatory elements. These interactions constitute a gene network, represented by an N×N matrix R, whose elements rij describe the regulatory effect of the product of gene j on the expression of gene i (figure 1a). The diagonal elements rii represent autoregulation of gene i by its own gene product.
Figure 1.
Developmental model used in this paper. (a) Artificial transcriptional regulatory network consisting of N=4 genes with K=2 incoming interactions per gene on average (a small network is used for clarity). Gene 1 acts as a cell-cycle regulator and gene 2 acts as an asymmetric cell division gene. Activations and repressions are denoted by arrows and bars, respectively. Numbers indicate the relative interaction strengths, i.e. the elements of the R matrix (e.g. r13=1.7). (b) Proliferative lineage specified by the network in (a) over four successive time-steps (Dmax=4), given the initial gene expression pattern, S(0)=(1, 1, 1, 1). Cells are represented by circles, and the expression states of the four genes in the network are represented by the filling of sectors in the same relative position as the genes in (a): OFF, white; ON, black. For example, all genes except gene 4 are ON in the left daughter of the mother cell of the lineage shown in (b) (1110 in binary form). This lineage contains three RR2: rule 1, 1111→{1110, 1010}; rule 2, 1110→{1111, 1111}; and rule 3, 1010→{1111, 1111}. Since MCT=1, the number of RR2 matches the number of different cell states in the lineage exactly. Since none of the cell states is differentiated, given more time to develop, the cell lineage would continue to expand indefinitely, iteratively generating the same cell states. (c, d) Examples of lineages generated by other random networks with N=4 and K=2. Lineage (c) is a simple stem cell lineage containing two RR2: rule 1, 1111→{0011, 1111}, a stem cell that would continue to divide indefinitely; rule 2, 0011, a differentiated cell that would not divide further (Dmax=6). Lineage (d) is an example of an irregular lineage with RR2=6, UCT=2 and MCT=1.33 (12 internal cells with a cell cycle time of 1, and 6 internal cells with a cell cycle time of 2; Dmax=5).
The gene expression pattern in a cell at time t during development is represented by a state vector S(t) whose elements si(t) describe the expression states of genes i=1, 2, …, N. The expression state of a gene can vary continuously between complete repression, si(t)=−1, and complete activation, si(t)=1. Gene expression states are updated synchronously, in discrete time-steps according to the following equation:
$$ s_i(t+1) = f\left( \sum_{j=1}^{N} r_{ij}\, s_j(t) \right) \qquad (2.1) $$
where f(x)=2/(1+e^(−ax))−1 is a sigmoidal filter function that determines how the total regulatory input from the network influences the expression of the gene (Siegal & Bergman 2002; Azevedo et al. 2006). The activation constant a determines the nature of the transition between expression states −1 and 1, from a step-like transition at high values to a gradual transition at low values. Gene i is considered to be ON at time t when 0<si(t)≤1, and OFF when −1≤si(t)≤0.
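The update rule can be illustrated with a minimal Python sketch. The function names (`sigmoid_filter`, `update_expression`, `is_on`) and the list-of-lists representation of R are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math

def sigmoid_filter(x, a=100.0):
    """Filter function f(x) = 2 / (1 + exp(-a*x)) - 1, with values in [-1, 1]."""
    z = -a * x
    if z > 700:  # guard against overflow in math.exp for strongly negative input
        return -1.0
    return 2.0 / (1.0 + math.exp(z)) - 1.0

def update_expression(R, s, a=100.0):
    """One synchronous time-step: s_i(t+1) = f(sum_j r_ij * s_j(t))."""
    return [
        sigmoid_filter(sum(r_ij * s_j for r_ij, s_j in zip(row, s)), a)
        for row in R
    ]

def is_on(s_i):
    """A gene is ON when 0 < s_i <= 1 and OFF when -1 <= s_i <= 0."""
    return s_i > 0
```

With the default a=100 the filter behaves almost like a step function, so expression levels sit very close to −1 or 1 unless the total regulatory input is near zero.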
(b) Cell division
The unicellular model mentioned in §2a can represent only the changes in the gene expression patterns of individual cells. The dynamics of gene expression can be treated as a model for cell fate specification (Britten & Davidson 1969; Kauffman 1993). Although such models have been fruitful for both developmental and evolutionary biology, they are not sufficient to model developmental and morphological complexity (Keränen 2004; Geard & Wiles 2005). Several developmental processes have been incorporated into developmental models based on gene networks, such as cell division and death (Furusawa & Kaneko 2001), differential cell adhesion (Hogeweg 2000), cell movement (Platzer & Meinzer 2004), cell–cell induction (Salazar-Ciudad et al. 2000; Keränen 2004) and diffusion of morphogens (Reinitz & Sharp 1995). Here, we model a basic cell-autonomous developmental mechanism: asymmetric cell division (Fichelson et al. 2005). This is accomplished by conferring special roles on two genes (1 and 2):
Gene 1 is a cell-cycle regulator. When gene 1 switches ON at time t, the cell divides instantaneously, and gene 1 is turned OFF in both daughter cells, s1L(t)=s1R(t)=−1, where the subscripts L and R refer to the left and right daughter cells, respectively.
Gene 2 is an asymmetric cell division gene. If it is ON when gene 1 switches ON (at time t), then the cell divides asymmetrically. The product of gene 2 segregates asymmetrically among the daughter cells: it is turned OFF in the left daughter, s2L(t)=−1, and stays ON in the right daughter cell, at the same expression level as in the mother cell, s2R(t)=s2(t). If gene 2 is OFF when gene 1 switches ON, then the cell divides symmetrically and both daughters retain the gene expression pattern of the mother cell. Regardless of the expression level of gene 2, the products of all other genes (3, 4, …, N) segregate symmetrically between daughter cells. The expression levels of all genes in both daughter cells at time t+1 are determined by equation (2.1).
The behaviour of gene 2 is reminiscent of that of the transcription factor Prospero in the Drosophila neuroblast lineage (Fichelson et al. 2005). A close association between the regulation of the cell cycle and asymmetric cell division during development has been observed in many systems (Fichelson et al. 2005).
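The two division rules can be sketched as follows. This is an illustrative sketch only: the function name `divide` and the convention that index 0 holds gene 1 and index 1 holds gene 2 are assumptions.

```python
def divide(s):
    """Return (left, right) daughter expression vectors at the time of division.

    s is the mother's expression vector at the time-step when gene 1 switches ON.
    Index 0 is gene 1 (cell-cycle regulator); index 1 is gene 2 (asymmetric
    division gene). All other genes are copied unchanged to both daughters.
    """
    left, right = list(s), list(s)
    left[0] = right[0] = -1.0   # gene 1 is turned OFF in both daughters
    if s[1] > 0:                # gene 2 ON: asymmetric division
        left[1] = -1.0          # gene 2 is turned OFF in the left daughter
        # the right daughter keeps the mother's gene 2 level (already copied)
    return left, right
```

For example, `divide([0.9, 0.8, -1.0, 1.0])` gives a left daughter `[-1.0, -1.0, -1.0, 1.0]` and a right daughter `[-1.0, 0.8, -1.0, 1.0]`; the daughters' states at the next time-step would then follow from equation (2.1).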
(c) Genotype, epigenetic state and development
The genotype of an organism is given by the matrix of regulatory interactions, R. The initial pattern of gene expression in the first cell before development begins, S(0), is determined by regulatory factors upstream of the network, such as the maternal, embryonic or external environments. Throughout the paper, we define S(0) as the epigenetic state of the organism (defined as the environment in Azevedo et al. 2006).
Many kinds of precursor cells in animals undergo a limited number of rounds of cell division before they stop dividing and terminally differentiate into specialized post-mitotic cells (Conlon & Raff 1999). In our model, a cell is said to differentiate when it reaches a pattern of expression where gene 1 is stably switched OFF (see below). Development proceeds until either all the cells have differentiated or the first undifferentiated cell has completed Dmax rounds of cell division. The cells present at the end of development (differentiated or not) are defined as terminal.
A terminal cell born at time θ is allowed to develop until time θ+tmax. If gene 1 switches ON at time θ<t≤θ+tmax, then the terminal cell is considered to be undifferentiated, and its gene expression pattern is defined as S(t). If gene 1 does not switch ON before or at time θ+tmax, then the terminal cell is said to have differentiated. The expression pattern of a differentiated cell is then evaluated for dynamic stability over the period [θ+tmax/2, θ+tmax]: if it reaches stability (see electronic supplementary material for details), we define its expression pattern as S(θ+tmax) and classify it as ‘stable’; if the cell does not reach dynamic stability, we define its expression pattern as the average expression pattern over the period [θ+tmax/2, θ+tmax] and classify it as ‘oscillatory’.
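The classification of a terminal cell can be sketched as below. The stability criterion is described only in the electronic supplementary material, so the tolerance test used here (no gene changes by more than `tol` between consecutive time-steps in the evaluation window) is an assumption, as are the names `step` and `classify_terminal`.

```python
import math

def step(R, s, a=100.0):
    """One synchronous update of equation (2.1)."""
    out = []
    for row in R:
        z = -a * sum(r * v for r, v in zip(row, s))
        out.append(-1.0 if z > 700 else 2.0 / (1.0 + math.exp(z)) - 1.0)
    return out

def classify_terminal(R, s_birth, t_max=50, a=100.0, tol=1e-4):
    """Return (state, expression pattern) for a terminal cell born in state s_birth."""
    s = list(s_birth)
    window = []
    for t in range(1, t_max + 1):
        s = step(R, s, a)
        if s[0] > 0:                 # gene 1 switched ON within t_max steps
            return "undifferentiated", s
        if t >= t_max / 2:           # states in the second half of the period
            window.append(s)
    # ASSUMED stability test: consecutive states differ by at most tol
    stable = all(
        max(abs(x - y) for x, y in zip(u, v)) <= tol
        for u, v in zip(window, window[1:])
    )
    if stable:
        return "stable", window[-1]
    mean = [sum(col) / len(window) for col in zip(*window)]
    return "oscillatory", mean       # average pattern over the window
```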
(d) Cell lineage
Once development is completed, the resulting ‘organisms’ have a cell lineage depth (D, maximum number of consecutive rounds of cell division in the lineage) of 0≤D≤Dmax and a number of terminal cells (M) of 1≤M≤2^D. We classify cell lineages generated by our model into four categories:
Non-developing. The initial cell does not divide within tmax steps (i.e. gene 1 never comes ON) and M=1.
Proliferative. Cells undergo 1≤D≤Dmax successive rounds of cell division resulting in M=2^D cells of equal depth D (figure 1b). If D<Dmax, then all terminal cells are differentiated; if D=Dmax, then any number of terminal cells may be undifferentiated.
Stem cell. A lineage consisting of one or more stem cells that divide Dmax times and remain undifferentiated, and at least Dmax differentiated stem cell descendants (figure 1c).
Irregular. A lineage of depth 2≤D≤Dmax that cannot be reduced to one of the simpler descriptions. Often it consists of a mixture of different proliferative and/or stem cell sublineages (figure 1d; figure 6 in the electronic supplementary material).
(e) Complexity
Our model allows us to consider organismal complexity at three different levels of organization: genomic; developmental; and morphological. Ultimately, the phenotype is determined by the gene network, its epigenetic state and the rules that govern gene action, all of which may be thought of as being encoded by an ‘implicit’ genome and epigenetic machinery. We measure the complexity of the gene network in two ways (Szathmáry et al. 2001): number of genes (N) and average in-degree or number of incoming gene interactions per gene, including self-interactions (K). Increasing either N or K increases the size of the ‘genetic programme’ of the organism, i.e. the complexity of that programme (Szathmáry et al. 2001).
The genetic programme generates the morphology through a pattern of cell divisions and changes in the gene expression patterns of cells, that is, the cell lineage. We consider three measures of developmental complexity. First, we introduce a new measure (RR2) of the complexity of the cell lineage (Simon 1962; Szathmáry et al. 2001) based on the ‘reduced rules’ metric (RR; Braun et al. 2003; Azevedo et al. 2005). Briefly, we begin by coding the cell lineage as a series of unique rules, each corresponding to a cell division. These rules can take one of two forms: X→{Y, Z} (cell X divides into cells Y and Z) or W (cell W is a terminal cell). The letters correspond to ‘binarized’ gene expression patterns for all N genes in each cell at the time of cell division or differentiation. For example, two cells, X and Y, with particular expression patterns at genes 1–8 might have the binarized expression patterns B(X)=10101011 and B(Y)=10011011.
This initial list of rules provides a complete description of the patterns of cell division and cell fate specification in the lineage, ignoring planes of cell division and changes in gene expression between cell divisions. We then compress the initial description by successively collapsing equivalent rules until we obtain a set of reduced rules RR2 encoding a complete non-redundant description of the lineage equivalent to the initial one (Azevedo et al. 2005). RR2, unlike the earlier RR measure, may include recursive rules where a daughter cell has the same expression pattern as the mother: X→{X, Y} (i.e. a stem cell rule). In addition, all terminal cells define RR2 (but not RR). If there are two or more rules involving an expression pattern X (X→{A, B}, X→{C, D}, …), they are counted as different RR2 (X1, X2, …). Although this can occur as a result of the use of binarization and incomplete information to define the rules, we observed it only rarely in our simulations (in a random sample of 100 000 lineages (N=8, K=4), only 0.64% of the total number of RR2 were rules of this kind). An organism with M terminal cells may show a RR2 complexity as low as 1 (e.g. a proliferative rule of the form X→{X, X} iterated log2(M) times) or as high as 2M−1 (each cell in the lineage has a different gene expression pattern).
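A simplified sketch of the RR2 count is given below. It is not the authors' reduction algorithm (Azevedo et al. 2005): it assumes that collapsing equivalent rules amounts to counting distinct division rules, with daughters treated as an unordered pair, plus distinct terminal patterns that are not already the mother of a division rule. The function name `count_rr2` and the input format are hypothetical.

```python
def count_rr2(divisions, terminals):
    """Count reduced rules under a simplifying assumption.

    divisions: iterable of (mother, daughter_a, daughter_b) binarized patterns,
               one entry per cell division in the lineage.
    terminals: iterable of binarized patterns, one per terminal cell.
    """
    # Planes of division are ignored, so the two daughters are unordered.
    division_rules = {(m, tuple(sorted((d1, d2)))) for m, d1, d2 in divisions}
    mothers = {m for m, _ in division_rules}
    # A terminal cell whose pattern already heads a division rule adds no rule.
    terminal_rules = {p for p in terminals if p not in mothers}
    return len(division_rules) + len(terminal_rules)

# Stem cell lineage of figure 1c: 1111 -> {0011, 1111}, repeated Dmax=6 times.
stem_divisions = [("1111", "0011", "1111")] * 6
stem_terminals = ["0011"] * 6 + ["1111"]
rr2_stem = count_rr2(stem_divisions, stem_terminals)  # 2, as in the caption
```

Under the same assumption, a lineage in which every one of the 2M−1 cells has a different pattern yields M−1 division rules and M terminal rules, which recovers the stated upper bound.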
The second measure of the complexity of a cell lineage is the mean cell cycle time (MCT) of all cells in the lineage except the terminal cells, in number of time-steps. MCT is strongly and positively correlated with the amount of change in gene expression over the entire lineage (figure 14 in the electronic supplementary material), in much the same way that time is correlated with the amount of genetic change in a phylogenetic tree (branch lengths). Given that the RR2 metric ignores changes in gene expression between cell divisions, it may underestimate the complexity of a cell lineage with a high MCT. Since MCT does not capture the heterogeneity in patterns of change in gene expression across cells in the lineage, we introduce a third measure of cell lineage complexity: the number of unique cell cycle times (UCT) in the cell lineage, excluding the terminal cells.
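Both cell cycle measures follow directly from the list of cell cycle times of the non-terminal cells. The sketch below uses a hypothetical helper, `mct_uct`, and reproduces the values quoted for lineage (d) in figure 1.

```python
def mct_uct(cycle_times):
    """Return (MCT, UCT) from the cell cycle times of all non-terminal cells.

    cycle_times: cell cycle time, in time-steps, of each internal cell.
    """
    if not cycle_times:
        raise ValueError("a lineage without cell divisions has no cell cycle times")
    mct = sum(cycle_times) / len(cycle_times)   # mean cell cycle time
    uct = len(set(cycle_times))                 # number of unique cell cycle times
    return mct, uct

# Figure 1d: 12 internal cells with cycle time 1 and 6 with cycle time 2.
mct, uct = mct_uct([1] * 12 + [2] * 6)   # mct = 24/18 = 1.33..., uct = 2
```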
The morphological complexity of the organism is measured by the number of terminal cell types (TT; Valentine et al. 1994). The terminal cells are classified into different cell types, according to the final binarized expression patterns at four genes (3–6) and their differentiation state: undifferentiated; differentiated stable; or differentiated oscillatory. Thus, a terminal cell can take one of 16×3=48 different cell types. For example, if the two hypothetical X and Y cells discussed above were terminal cells, then they would be classified into the cell types 1010 and 0110, respectively (both undifferentiated, because s1>0).
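The cell type classification can be sketched as follows, using the binarized patterns of the hypothetical X and Y cells as a check. The function names and the string labels for the differentiation states are assumptions.

```python
def binarize(s):
    """Binarized expression pattern: '1' if the gene is ON (s_i > 0), else '0'."""
    return "".join("1" if v > 0 else "0" for v in s)

def cell_type(pattern, state):
    """Cell type = binarized expression of genes 3-6 plus differentiation state.

    pattern: binarized pattern for all N genes (string, gene 1 first).
    state:   'undifferentiated', 'stable' or 'oscillatory'.
    """
    return pattern[2:6], state   # genes 3-6 are string indices 2..5

def count_terminal_types(terminals):
    """TT: number of distinct cell types among (pattern, state) terminal cells."""
    return len({cell_type(p, st) for p, st in terminals})

x_type = cell_type("10101011", "undifferentiated")   # ('1010', 'undifferentiated')
y_type = cell_type("10011011", "undifferentiated")   # ('0110', 'undifferentiated')
```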
Throughout the paper, we treat genomic complexity as a constant and concentrate on variation in developmental (RR2, MCT and UCT) and morphological complexity (TT). The complexity metrics, RR2, UCT and TT, are all measures of the heterogeneity in the genetic states of cells and, therefore, examples of what McShea (2005) calls internal variance. Although MCT is not, itself, an internal variance measure, it is strongly correlated with an internal variance measure (figure 14 in the electronic supplementary material).
3. Material and methods
(a) Random organisms
To establish the null distribution of different complexity measures, we generated 100 000 random networks for the following combinations of number of genes (N) and average in-degree (K): N=8, 16 or 32 with K=4; and N=16 with K=2 or 8. Networks were generated by randomly filling the R matrix with N×K standard normal random variates and N(N−K) zeros. An initial gene expression pattern, S(0), was created for each network by setting s1(0)=−1 and randomly setting each si(0) for i=2, 3, …, N to either −1 or 1. The initial state of gene 2 had a negligible effect on the results (figures 9 and 10 in the electronic supplementary material).
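Network generation can be sketched with the Python standard library as below. The function names are hypothetical, and two readings of the text are assumed: the N×K non-zero entries are placed uniformly at random over the whole matrix (so K is only the average in-degree), and gene 1 starts at −1 while every other gene is set to −1 or 1 with equal probability.

```python
import random

def random_network(n_genes, k, rng):
    """N x N matrix with N*K standard normal entries and N*(N-K) zeros."""
    positions = [(i, j) for i in range(n_genes) for j in range(n_genes)]
    R = [[0.0] * n_genes for _ in range(n_genes)]
    for i, j in rng.sample(positions, n_genes * k):   # positions without replacement
        R[i][j] = rng.gauss(0.0, 1.0)
    return R

def random_initial_state(n_genes, rng):
    """S(0): gene 1 fixed at -1 (assumed), remaining genes drawn from {-1, 1}."""
    return [-1.0] + [rng.choice((-1.0, 1.0)) for _ in range(n_genes - 1)]

rng = random.Random(1)
R = random_network(8, 4, rng)
s0 = random_initial_state(8, rng)
```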
(b) Default parameters
Except when otherwise stated, the following parameters were used throughout:
Activation constant: a=100 (i.e. activation function is approximately a Heaviside step function).
Critical period of time used to establish whether a cell divides or differentiates: tmax=50 (see electronic supplementary material methods).
Maximum number of rounds of cell division: Dmax=6.
(c) Mutations and epimutations
To investigate the presence of generative biases on phenotypic complexity, we consider mutations and epimutations (i.e. epigenetic mutations; Cubas et al. 1999). A mutation is modelled by the substitution of a random non-zero element in the R matrix by an independent standard normal random variate. In our model, mutations can be viewed as acting on the N×K cis-regulatory elements, not on the coding sequences of the N genes. Mutations can neither alter the number of genes nor create new cis-regulatory elements. An epimutation is modelled as a change in the initial gene expression pattern, S(0). The expression patterns of two randomly selected genes are modified by replacing their default expression levels si(0) with uniform random variates between −1 and 1. Changes to the initial expression level of gene 1 (the cell cycle gene), s1(0), are discarded.
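The two perturbation operators can be sketched as follows. The names `mutate` and `epimutate` are hypothetical, and the handling of gene 1 is an assumption: here gene 1 is simply never selected for an epimutation, which is one way of reading the statement that changes to s1(0) are discarded.

```python
import random

def mutate(R, rng):
    """Replace one randomly chosen non-zero element of R with a new N(0, 1) variate."""
    nonzero = [(i, j) for i, row in enumerate(R) for j, r in enumerate(row) if r != 0.0]
    i, j = rng.choice(nonzero)
    mutant = [row[:] for row in R]        # copy, so the original genotype is kept
    mutant[i][j] = rng.gauss(0.0, 1.0)    # zeros are untouched: no new interactions
    return mutant

def epimutate(s0, rng):
    """Replace the initial levels of two randomly chosen genes with U(-1, 1) variates."""
    new = list(s0)
    for i in rng.sample(range(1, len(s0)), 2):   # index 0 (gene 1) is excluded
        new[i] = rng.uniform(-1.0, 1.0)
    return new
```

Because only existing non-zero entries are resampled, a mutation in this sketch changes the strength or sign of a regulatory interaction but leaves the number of genes and the number of interactions unchanged.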
(d) Generative bias
We generated random cell lineages from gene networks with N=8 and K=4 until 1000 lineages were found for each of the following complexity ranges:
TT: 1, 2, 3–4, 5–9, 10–14, ≥15
RR2: 1–2, 3–6, 7–12, 13–20, 21–30, ≥31
MCT: 1–1.5, 1.5–2.5, 2.5–4, 4–7, 7–11, ≥11
UCT: 1, 2, 3, 4, 5, ≥6
These 24 sets of random lineages were used to investigate the patterns of covariation between different measures of phenotypic complexity. We then took each organism and generated 100 mutated and 100 epimutated copies, and assayed the change in the complexity metric in the perturbed individuals capable of development (approx. 98% of the total), relative to the complexity metric of the original unperturbed individual. We calculated the mean, standard deviation (s.d.) and coefficient of skewness of the distribution of effects, the probability that a mutation or epimutation is ‘complexifying’ (i.e. causes an increase in complexity) and the mean effects of complexifying and simplifying mutations and epimutations.
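The summary statistics for one organism can be sketched as below. The function name and the dictionary keys are hypothetical, and the use of population (rather than sample) moments for the standard deviation and skewness is an assumption, since the text does not specify the estimator.

```python
import math

def effect_statistics(original, perturbed):
    """Summarize the effects of perturbations on one complexity metric.

    original:  complexity of the unperturbed organism.
    perturbed: complexities of the perturbed copies that were able to develop.
    """
    effects = [p - original for p in perturbed]
    n = len(effects)
    mean = sum(effects) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in effects) / n)   # population s.d.
    skew = sum((e - mean) ** 3 for e in effects) / n / sd ** 3 if sd > 0 else 0.0
    pos = [e for e in effects if e > 0]   # complexifying effects
    neg = [e for e in effects if e < 0]   # simplifying effects
    return {
        "mean": mean,
        "sd": sd,
        "skewness": skew,
        "p_complexifying": len(pos) / n,
        "mean_positive": sum(pos) / len(pos) if pos else None,
        "mean_negative": sum(neg) / len(neg) if neg else None,
    }

stats = effect_statistics(5, [3, 5, 5, 9])   # effects: -2, 0, 0, +4
```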
(e) Stabilizing selection
We investigated the effect of superimposing stabilizing selection on the distribution of the effects of mutations and epimutations by repeating the simulations described above, but until 100 perturbed individuals with the same numbers of terminal cells (M) and terminal cell types (TT) as the unperturbed individuals were produced.
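This selection scheme amounts to rejection sampling, which can be sketched generically as below. The function name, the callback arguments `perturb` and `phenotype`, and the `max_tries` safeguard are all hypothetical; in the setting described above, `phenotype` would return the pair (M, TT) of a developed organism.

```python
def sample_survivors(original, perturb, phenotype, n_required=100, max_tries=100_000):
    """Collect perturbed copies whose selected phenotype matches the original.

    perturb:   function returning a mutated or epimutated copy of an organism.
    phenotype: function returning the selected traits, e.g. (M, TT).
    """
    target = phenotype(original)
    survivors = []
    tries = 0
    while len(survivors) < n_required and tries < max_tries:
        tries += 1
        candidate = perturb(original)
        if phenotype(candidate) == target:   # strong stabilizing selection
            survivors.append(candidate)
    return survivors
```

The effects on the unselected metrics (RR2, MCT and UCT) would then be computed from the survivors only.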
(f) Sensitivity to network parameters
To test the influence of the number of genes and average in-degree on the experiments outlined in § 3d,e, we repeated all simulations using the following network parameters: N=16/K=2 and N=16/K=8.
4. Results
(a) Complexity for free
All types of gene networks tested were capable of generating a majority of developing cell lineages (80–98%; figure 2), most of which included three or more rounds of cell division (87–96%). Network parameters influenced the kinds of phenotypes produced. Increasing the number of genes (N=8, 16, 32) while keeping the average in-degree constant (K=4) increased the probability that the organisms developed (i.e. had at least one cell division), and increased the probability that the resulting lineage was proliferative (figure 2a). Increasing the average in-degree (K=2, 4, 8) while keeping the number of genes constant (N=16) also increased the probability that the organisms developed, but it decreased the probability that the resulting lineage was proliferative; instead, the probability of obtaining an irregular lineage increased (figure 2b).
Figure 2.
Increasing either the number of genes or average in-degree increases the probability of obtaining developing cell lineages. Classification of 100 000 random lineages generated from gene networks with varying (a) number of genes, N, or (b) average in-degree, K. Cell lineage categories: no dev, non-developing; prolif, proliferative; stem, stem cell. In all cases, proliferative and irregular cell lineages are common and stem cell lineages are rare.
Increasing K led to a proportional increase in the number of unique cell lineage topologies and terminal cell morphologies generated (figure 7b in the electronic supplementary material). In contrast, changing N did not consistently influence either the number of different lineage topologies or the number of terminal cell type morphologies produced (figure 7a in the electronic supplementary material, but see figure 8b in the electronic supplementary material).
All types of networks were capable of generating highly complex cell lineages and morphologies, albeit with low probability (figures 3 and 4). Increasing either N or K tended to increase the complexity of the ‘average’ organism produced, except for the TT metric in response to N (figures 3 and 4; figures 11–13 in the electronic supplementary material). However, increasing network complexity over two orders of magnitude only increased average phenotypic complexity by threefold on average (s.d.=2.4; figure 12 in the electronic supplementary material).
Figure 3.
Larger networks tend to generate more complex cell lineages. (a) Morphological and (b–d) developmental complexities were measured in lineages with at least three rounds of cell division, specified by random gene networks with the same average in-degree (K=4) but different number of genes, N (subset of data shown in figure 2a). Plots show the empirical complementary cumulative distribution functions of each complexity measure X: 1−F(x)=Prob(X>x). For example, approximately 1% of lineages show TT>7 terminal cell types (a). The distributions of TT, RR2 and UCT (a,b,d) are approximately exponential, such that 1−F(x)=exp(−x/β), where β is a scale parameter. Moments: expectation, E(X)=β; variance, Var(X)=β^2. Dotted lines show rough exponential fits: (a) β=1.35, (b) β=9.00 and (d) β=0.75 (fit displaced horizontally by +1). (c) MCT is more closely approximated by a power-law distribution: 1−F(x)≈x^(−α), as x→∞, with exponent α=3.00 (dotted line). The RR2 and MCT data were binned in regular intervals (b,c). The 1−F(x) axes were log transformed in all plots. The MCT axis was also log transformed (c). The increase in RR2 with the number of genes is predictable, because the number of possible gene expression states increases with N (b). However, the increases in MCT and UCT with N are not trivial (c,d).
Figure 4.
More highly connected networks tend to generate more complex cell lineages. (a) Morphological and (b–d) developmental complexities were measured in lineages with at least three rounds of cell division, specified by random gene networks with the same number of genes (N=16) but different average in-degree, K (subset of data shown in figure 2b). Plots were constructed exactly as described in figure 3.
We repeated the experiments summarized in figures 2–4 with a shallower activation function (a=5). The overall patterns obtained were generally similar to those described above (not shown).
(b) Complexity trade-offs
To investigate the patterns of covariation among phenotypic complexity metrics, we generated random networks with N=8 and K=4 that specified cell lineages falling within narrow ranges of each complexity metric. For example, organisms differing strongly in TT also differed strongly in RR2 in the same direction, and vice versa (top left of figure 15 in electronic supplementary material); however, organisms with high TT and RR2 tended to show low MCT and UCT (bottom left of figure 15 in electronic supplementary material). Figure 6 in the electronic supplementary material shows an example of a lineage with high UCT but intermediate TT, RR2 and MCT. Another indication that the different measures of complexity are not all identical is that they showed different patterns of covariation with number of terminal cells (M), another morphological trait: increases in TT, RR2 and MCT were broadly associated with increases in M, whereas the reverse was true for UCT (figure 18 in the electronic supplementary material).
(c) A generative bias towards average complexity
Organisms specified by networks with N=8 and K=4 displayed a clear pattern of generative bias for all phenotypic complexity metrics, for both mutations and epimutations: when perturbed, on average, relatively simple organisms generated more complex ones, whereas relatively complex organisms generated simpler ones (figure 5). Organisms of average phenotypic complexity (data obtained from figure 3) showed no overall generative bias. The change in the magnitude of generative bias with phenotypic complexity was approximately linear for all metrics (figure 5). Although terminal cell number (M) was correlated with different measures of complexity, the generative biases in complexity were not explained by a generative bias in M (figure 18 in the electronic supplementary material).
Figure 5.
Both mutations and epimutations cause organisms to generate cell lineages that are closer in complexity to the average complexity expected for gene networks of the same number of genes and average in-degree (figure 3, open symbols). The distribution of mutational effects on a given complexity metric was obtained by generating 100 mutated copies of each organism produced within a narrow range of that metric (columns of figure 15 in electronic supplementary material), and calculating the difference between the complexity of a perturbed individual and that of the original unperturbed one. The distribution of the effects of epimutations was calculated in a similar way from 100 epimutated copies of the same organisms. We then calculated the following statistics based on the distribution of mutational or epimutational effects for each individual: (a) mean effect (dotted lines show the expectation of no bias), (b) standard deviation, (c) coefficient of skewness, (d) proportion of positive effects, and (e) mean positive and negative effects. Values are means and 95% CIs for each statistic based on 1000 random developing lineages. For each measure, the shaded area spans from the minimum possible complexity to its mean value (based on the data shown in figure 3, open symbols). Closed and open symbols represent mutations and epimutations, respectively. Solid and dashed lines represent the absence or presence of strong stabilizing selection, respectively.
Highly complex phenotypes (i.e. those corresponding to the three rightmost points in each panel of figure 5) showed a relatively constant generative bias expressed as a proportion of the complexity of the unperturbed phenotype (not shown). For the internal variance measures (TT, RR2 and UCT), the mean effect of a mutation was to decrease complexity by 18–20%, and that of an epimutation was to decrease complexity by 12–14%; the mean effects of mutations and epimutations on MCT were to decrease it by 26 and 8%, respectively.
The standard deviation of the distribution of effects also increased with complexity for all metrics (figure 5). However, this pattern does not imply that more complex organisms had a greater evolvability towards higher complexity because the increase in variability was accompanied by an increasingly negative skew, a declining probability that a mutation or epimutation is complexifying and an increased magnitude of the effects of simplifying (but not complexifying) mutations and epimutations (figure 5).
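The summary statistics reported in figure 5 can be illustrated with a minimal sketch. It assumes that an effect is the complexity of a perturbed copy minus that of the unperturbed organism, as stated in the figure legend. The function name `effect_summary`, the use of population (rather than sample) moments and the toy numbers in the usage note are assumptions made for illustration; they are not taken from the original analysis.

```python
import math


def effect_summary(original, perturbed):
    """Summarise the effects of perturbations on one complexity metric.

    original  -- complexity of the unperturbed organism
    perturbed -- complexities of its mutated (or epimutated) copies
    """
    effects = [p - original for p in perturbed]
    n = len(effects)
    mean = sum(effects) / n
    # Population variance (assumption: the paper does not state the estimator).
    var = sum((e - mean) ** 2 for e in effects) / n
    sd = math.sqrt(var)
    # Moment coefficient of skewness; undefined when all effects are equal.
    if sd > 0:
        skew = (sum((e - mean) ** 3 for e in effects) / n) / sd ** 3
    else:
        skew = float("nan")
    pos = [e for e in effects if e > 0]  # complexifying perturbations
    neg = [e for e in effects if e < 0]  # simplifying perturbations
    return {
        "mean": mean,
        "sd": sd,
        "skewness": skew,
        "prop_positive": len(pos) / n,
        "mean_positive": sum(pos) / len(pos) if pos else float("nan"),
        "mean_negative": sum(neg) / len(neg) if neg else float("nan"),
    }
```

For a hypothetical organism of complexity 10 whose four perturbed copies have complexities 12, 10, 8 and 4, `effect_summary(10, [12, 10, 8, 4])` gives a negative mean effect, a negative skew and a proportion of positive effects of 0.25, which is the qualitative pattern described above for highly complex phenotypes.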
(d) Stabilizing selection
In the simulations described in §4c, the distributions of the effects of mutations and epimutations were calculated based on all perturbed organisms capable of development (figure 5, solid lines). This makes the extreme assumption that all developing morphologies are equally viable and no selection is acting during development. Can unbiased stabilizing selection on the final morphology eliminate the generative bias towards average developmental complexity? To answer this question, we calculated the distributions of the effects of mutations and epimutations when only perturbed organisms with the same numbers of terminal cells (M) and cell types (TT) as the unperturbed ancestors survived. We extended stabilizing selection to M because it was correlated with different measures of complexity (figure 18 in the electronic supplementary material). Our results show that, in the presence of strong stabilizing selection, the biases in the unselected developmental complexity measures (RR2, MCT and UCT) became weaker but did not disappear (figure 5, dashed lines). The patterns in skewness and the mean effects of simplifying and complexifying mutations were largely unaffected by stabilizing selection, suggesting that the reduction in magnitude of the bias was caused by an increase in the proportion of neutral mutations and epimutations.
(e) Sensitivity to network parameters
We repeated the experiments reported in §§4b–d with the following network parameters: N=16/K=2 and N=16/K=8. The overall patterns obtained were identical to those shown in figure 15 in the electronic supplementary material and figure 5 (figures 16, 17, 19 and 20 in the electronic supplementary material).
5. Discussion
(a) Generative biases
The generative bias hypothesis for the evolution of phenotypic complexity posits that mutations alone cause complexity to increase over time. The minimum requirement for such a mechanism to operate is for high phenotypic complexity to be relatively easy to generate by developmental systems. Our results are consistent with this condition. We constructed one of the simplest models imaginable of a cell-autonomous developmental mechanism, involving a handful of transcriptional regulators, interacting in a straightforward way, and found that it was able to generate high levels of phenotypic complexity (figures 3 and 4). For example, 20% of 76 602 random cell lineages with six rounds of cell division (mean number of terminal cells M=50, s.d.=17) based on networks with N=8 and K=4 had a higher relative complexity than the cell lineage of the ascidian Halocynthia roretzi up to the tissue-restricted stage (RR metric, data not shown; Azevedo et al. 2005).
While the observation that simple models of developmental mechanisms based on gene networks can generate high phenotypic complexity is not novel (Kauffman 1993; Hogeweg 2000; Salazar-Ciudad et al. 2000; Furusawa & Kaneko 2001; Keränen 2004), we believe that all these studies, taken together, have profound implications for the evolution of complexity. They suggest that a simple unicellular eukaryote at the ‘dawn of multicellularity’ (Newman 2002) would have been able to evolve a complex multicellular phenotype relatively easily, once it had acquired some basic mechanisms for asymmetric cell division, cell adhesion and cell–cell induction (Wolpert 1990; Maynard Smith & Szathmáry 1995; Hogeweg 2000; King 2004). This prediction is consistent with the observation that multicellularity has evolved several times independently within the eukaryotes (King 2004). However, even if it is easy to evolve high phenotypic complexity, it does not follow that generative biases towards greater complexity actually exist.
One prediction of the generative bias hypothesis, proposed by McShea (2005), is that there should be no trade-offs between different measures of phenotypic complexity. Our results do not support this prediction. Rather, increases in RR2 are accompanied by decreases in two other measures, MCT and UCT (figure 15 in the electronic supplementary material). These results indicate that generative biases are unlikely to achieve simultaneous increases in all measures of complexity. Although this result does not entirely rule out the possibility that generative biases can drive increases in some measures of complexity, it does limit the scope of the generative bias hypothesis.
The central prediction of the generative bias hypothesis is that mutations and epimutations should, on average, cause complexity to increase. Our data support this prediction for organisms of approximately minimal phenotypic complexity (‘nowhere to go but up’; Valentine et al. 1994), but not for those of average complexity or higher. We found evidence for a generative bias, but it is one towards average, not high, complexity. To use a metaphor, phenotypic complexity is distributed like wealth (Drăgulescu & Yakovenko 2001), with many complexity ‘poor’ and few complexity ‘rich’ individuals. Mutations and epimutations of organisms in different complexity ‘brackets’ do not act as a ‘get rich scheme’. Instead of an ‘everyone gets richer’ pattern, our experiments reveal a more egalitarian principle whereby ‘the poor get richer and the rich get poorer’.
Our results have failed to support the internal variance principle, despite its seemingly generic nature (McShea 2005). Is this discrepancy caused by an artefact of our model? We doubt it. To explain the internal variance principle, McShea (2005) introduces an abstract model. He imagines a structure composed of several parts (e.g. segments) and defines the complexity of the structure as the variance in the sizes of its parts. He then considers the effect of adding ‘random heritable variation’ to each part independently over time and concludes that complexity will tend to increase through time. However, this ‘paradigmatic case’ makes one central assumption: ‘each part is treated independently’ (McShea 2005), i.e. the structure is perfectly modular. This assumption is unrealistic for biological systems, which are neither perfectly integrated nor perfectly modular (Simon 1962; Schlosser & Wagner 2004). Although McShea (2005) claims that the internal variance principle will apply to ‘any organism with some degree of modular construction’ and that an analogous principle ‘could be developed for a hypothetical organism without any clear compartmentalization’, he does not offer any supporting evidence for these assertions. The organisms in our model are clearly compartmentalized into cells and the model makes no explicit assumption about the degree of modularity; therefore, our simulations provide an adequate test of the generality of the internal variance principle.
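The abstract model attributed to McShea (2005) above can be sketched in a few lines. This is a toy illustration of the perfectly modular case only, not of the developmental model used in this paper. The function names, the Gaussian form of the 'random heritable variation' and all parameter values are assumptions.

```python
import random
from statistics import pvariance


def internal_variance_walk(n_parts=10, n_steps=200, step_sd=1.0, seed=0):
    """Complexity (variance among part sizes) when every part changes independently."""
    rng = random.Random(seed)
    sizes = [0.0] * n_parts  # identical parts, so initial complexity is zero
    trajectory = [pvariance(sizes)]
    for _ in range(n_steps):
        # Each part receives its own independent perturbation (perfect modularity).
        sizes = [s + rng.gauss(0.0, step_sd) for s in sizes]
        trajectory.append(pvariance(sizes))
    return trajectory


def expected_final_variance(n_parts=10, n_steps=200, step_sd=1.0):
    """Expected among-part variance after n_steps of independent Gaussian steps."""
    return n_steps * step_sd ** 2 * (n_parts - 1) / n_parts
```

Under these assumptions, the expected complexity grows linearly with the number of steps, which is the increase that the internal variance principle predicts. The increase depends on each part being perturbed independently. The sketch makes no claim about what happens when perturbations are shared among parts, as they are in a gene network that controls every cell of a lineage.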
One potential criticism of our result is that the genomic complexity of even the simplest multicellular prokaryotes and eukaryotes is vastly greater than that of the organisms in our model. This consideration suggests that the ‘average’ phenotypic complexity supported by real genetic systems will be correspondingly greater, thus increasing the range over which a generative bias towards increased complexity might apply. However, this argument is implausible for two reasons. First, a relationship between genomic and phenotypic complexities in real organisms has been notoriously elusive. For example, the amount of DNA in a haploid genome varies over five orders of magnitude among eukaryotes, but this variation is not correlated with morphological complexity (Gregory 2001). Second, increasing genomic complexity in our model over two orders of magnitude achieves only modest increases in average phenotypic complexity (figure 12 in the electronic supplementary material). Therefore, we believe that our results can be extrapolated over a wide range of genomic and morphological complexities.
(b) Selective biases
If generative biases are not sufficient to drive long-term increases in phenotypic complexity, what are the likely evolutionary mechanisms involved? The obvious alternatives are selective biases. Selection for increased complexity would be an example of a selective bias. Possible sources for this kind of selective bias include selection for mechanical efficiency (Bonner 1988), adaptation to complex environments (Lenski et al. 2003) and evolutionary arms races (Dawkins & Krebs 1979). For example, selection for improved visual acuity can explain the evolution of a complex camera-type eye from a patch of light-sensitive epithelium (Nilsson & Pelger 1994). A more subtle selective bias can occur if simplifying mutations tend to be more deleterious than complexifying ones (Saunders & Ho 1976). In a recent computational study, Soyer & Bonhoeffer (2006) found evidence for this type of selective bias in the evolution of complexity in signalling pathways. Selective biases might also take other forms, such as an increased extinction rate of simpler organisms or an increased speciation rate of more complex ones (McShea 1994). Since adaptation, speciation and diversification can cause environmental complexity to increase (Odling-Smee et al. 1996), the process of evolution could itself cause generalized increasing trends in phenotypic complexity.
Our results neither substantiate nor refute any of the selective biases mentioned above. However, they do suggest that, while high complexity may be easy to generate, it is also easy to degrade through mutations and epimutations. Furthermore, strong stabilizing selection is not sufficient to counteract the generative bias towards average complexity (figure 5; Waxman & Peck 2003). In fact, even strong selective biases may not be enough in some cases. For example, mutations of 10 000 randomly generated organisms with high morphological complexity (TT=10; network parameters, N=8 and K=4) are 7.8 times more likely to result in simplification than in complexification. When strong stabilizing selection is applied, such that only mutants that gain or lose at most one cell type survive, simplifying mutations are still 2.3 times more likely to occur than complexifying ones. And even when a strong selective bias of the type proposed by Saunders & Ho (1976) is added, such that mutants that lose more than one cell type die but all mutants that gain cell types survive, simplifying mutations are still 1.3 times more likely to occur than complexifying ones. Note that although generative biases and selection acting during development are not easy to distinguish in practice in real organisms (Richardson & Chipman 2003), they are clearly separable in our model.
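The three comparisons in the preceding paragraph share one calculation: the number of surviving simplifying mutations divided by the number of surviving complexifying ones. A minimal sketch follows, assuming that each mutation is summarized by its change in the number of cell types. The function name, the survival rules written as predicates and the toy counts are hypothetical and are not the data behind the ratios quoted above.

```python
def simplification_ratio(effects, survives=lambda d: True):
    """Ratio of simplifying to complexifying mutations among surviving mutants.

    effects  -- change in number of cell types caused by each mutation
    survives -- predicate deciding whether a mutant with a given change survives
    """
    survivors = [d for d in effects if survives(d)]
    down = sum(1 for d in survivors if d < 0)
    up = sum(1 for d in survivors if d > 0)
    return down / up if up else float("inf")


# Invented toy distribution of effects (NOT the paper's data).
toy_effects = [-3] * 20 + [-2] * 30 + [-1] * 26 + [0] * 100 + [1] * 8 + [2] * 2

no_selection = simplification_ratio(toy_effects)
# Stabilizing selection: only gains or losses of at most one cell type survive.
stabilizing = simplification_ratio(toy_effects, lambda d: abs(d) <= 1)
# Selective bias: losses of more than one cell type die, all gains survive.
biased = simplification_ratio(toy_effects, lambda d: d >= -1)
```

With these invented counts, the ratio falls from 7.6 without selection to 3.25 under stabilizing selection and 2.6 under the selective bias, but stays above 1 in every case. The filter removes large losses without changing how often small losses arise relative to gains.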
In conclusion, we predict that most complex organisms experience stabilizing selection or selective biases towards increased phenotypic complexity. This hypothesis would explain the striking association between complexity and robustness in a range of biological systems (Edelman & Gally 2001; Csete & Doyle 2002; Flatt 2005). Another prediction of our hypothesis is that relaxing selection on a structure should lead to its simplification. The spectacular secondary body plan simplification of parasites like Pentastomida, Sacculina, Xenoturbella and Mesozoa is consistent with this prediction (Dewel 2000). For complex organisms, it may indeed take ‘all the running you can do, to keep in the same place’ (Van Valen 1973).
Acknowledgments
We thank A. Gloria-Soria, A. M. Leroi, B. Cole, H. Teotónio, M. Travisano and R. Zufall for their helpful comments and discussion. R.L. was supported by the University of Houston Grants to Enhance and Advance Research Program. N. L. G. was supported by a grant from the ARC Centre for Complex Systems.
Supplementary Material
Supplementary methods and supplementary figures 7–16
References
- Arthur W. The effect of development on the direction of evolution: toward a twenty-first century consensus. Evol. Dev. 2004;6:282–288. doi:10.1111/j.1525-142X.2004.04033.x
- Azevedo R.B.R, Keightley P.D, Laurén-Määttä C, Vassilieva L.L, Lynch M, Leroi A.M. Spontaneous mutational variation for body size in Caenorhabditis elegans. Genetics. 2002;162:755–765. doi:10.1093/genetics/162.2.755
- Azevedo R.B.R, et al. The simplicity of metazoan cell lineages. Nature. 2005;433:152–156. doi:10.1038/nature03178
- Azevedo R.B.R, Lohaus R, Srinivasan S, Dang K.K, Burch C.L. Sexual reproduction selects for robustness and negative epistasis in artificial gene networks. Nature. 2006;440:87–90. doi:10.1038/nature04488
- Bonner J.T. The evolution of complexity. Princeton, NJ: Princeton University Press; 1988.
- Braun V, Azevedo R.B.R, Gumbel M, Agapow P.-M, Leroi A.M, Meinzer H.-P. ALES: cell lineage analysis and mapping of developmental events. Bioinformatics. 2003;19:851–858. doi:10.1093/bioinformatics/btg087
- Britten R.J, Davidson E.H. Gene regulation for higher cells: a theory. Science. 1969;165:349–357. doi:10.1126/science.165.3891.349
- Brooks D.R, Collier J, Maurer B.A, Smith J.D.H, Wiley E.O. Entropy and information in evolving biological systems. Biol. Phil. 1989;4:407–432. doi:10.1007/BF00162588
- Cisne J.L. Evolution of the world fauna of aquatic free-living arthropods. Evolution. 1974;28:337–366. doi:10.1111/j.1558-5646.1974.tb00757.x
- Conlon I, Raff M. Size control in animal development. Cell. 1999;96:235–244. doi:10.1016/S0092-8674(00)80563-2
- Csete M.E, Doyle J.C. Reverse engineering of biological complexity. Science. 2002;295:1664–1669. doi:10.1126/science.1069981
- Cubas P, Vincent C, Coen E. An epigenetic mutation responsible for natural variation in floral symmetry. Nature. 1999;401:157–161. doi:10.1038/43657
- Dawkins R, Krebs J.R. Arms races between and within species. Proc. R. Soc. B. 1979;205:489–511. doi:10.1098/rspb.1979.0081
- Dewel R.A. Colonial origin for eumetazoa: major morphological transitions and the origin of bilaterian complexity. J. Morph. 2000;243:35–74. doi:10.1002/(SICI)1097-4687(200001)243:1<35::AID-JMOR3>3.0.CO;2-#
- Drăgulescu A, Yakovenko V.M. Exponential and power-law probability distributions of wealth and income in the United Kingdom and the United States. Physica A. 2001;299:213–221. doi:10.1016/S0378-4371(01)00298-9
- Drake J.W, Charlesworth B, Charlesworth D, Crow J.F. Rates of spontaneous mutation. Genetics. 1998;148:1667–1686. doi:10.1093/genetics/148.4.1667
- Edelman G.M, Gally J.A. Degeneracy and complexity in biological systems. Proc. Natl Acad. Sci. USA. 2001;98:13 763–13 768. doi:10.1073/pnas.231499798
- Fichelson P, Audibert A, Simon F, Gho M. Cell cycle and cell-fate determination in Drosophila neural cell lineages. Trends Genet. 2005;21:413–420. doi:10.1016/j.tig.2005.05.010
- Flatt T. The evolutionary genetics of canalization. Q. Rev. Biol. 2005;80:287–316. doi:10.1086/432265
- Furusawa C, Kaneko K. Theory of robustness of irreversible differentiation in a stem cell system: chaos hypothesis. J. Theor. Biol. 2001;209:395–416. doi:10.1006/jtbi.2001.2264
- Geard N, Wiles J. A gene network model for developing cell lineages. Artif. Life. 2005;11:249–267. doi:10.1162/1064546054407202
- Goodwin B.C. How the leopard changed its spots: the evolution of complexity. New York, NY: Scribner's Sons; 1994.
- Gould S.J. A developmental constraint in Cerion, with comments on the definition and interpretation of constraint in evolution. Evolution. 1989;43:516–539. doi:10.1111/j.1558-5646.1989.tb04249.x
- Gregory T.R. Coincidence, coevolution, or causation? DNA content, cell size, and the C-value enigma. Biol. Rev. 2001;76:65–101. doi:10.1017/S1464793100005595
- Hogeweg P. Evolving mechanisms of morphogenesis: on the interplay between differential adhesion and cell differentiation. J. Theor. Biol. 2000;203:317–333. doi:10.1006/jtbi.2000.1087
- Kauffman S.A. The origins of order. Oxford, UK: Oxford University Press; 1993.
- Keränen S.V. Simulation study on effects of signaling network structure on the developmental increase in complexity. J. Theor. Biol. 2004;231:3–21. doi:10.1016/j.jtbi.2004.03.021
- Kim H, Klein R, Majewski J, Ott J. Estimating rates of alternative splicing in mammals and invertebrates. Nat. Genet. 2004;36:915–916. doi:10.1038/ng0904-915
- King N. The unicellular ancestry of animal development. Dev. Cell. 2004;7:313–325. doi:10.1016/j.devcel.2004.08.010
- Lenski R.E, Ofria C, Pennock R.T, Adami C. The evolutionary origin of complex features. Nature. 2003;423:139–144. doi:10.1038/nature01568
- Maynard Smith J, Szathmáry E. The major transitions in evolution. Oxford, UK: W. H. Freeman Spektrum; 1995.
- Maynard Smith J, Burian R, Kauffman S, Alberch P, Campbell J, Goodwin B, Lande R, Raup D, Wolpert L. Developmental constraints and evolution. Q. Rev. Biol. 1985;60:265–287. doi:10.1086/414425
- McShea D.W. Complexity and evolution: what everybody knows. Biol. Phil. 1991;6:303–324. doi:10.1007/BF00132234
- McShea D.W. Mechanisms of large-scale evolutionary trends. Evolution. 1994;48:1747–1763. doi:10.1111/j.1558-5646.1994.tb02211.x
- McShea D.W. The evolution of complexity without natural selection, a possible large-scale trend of the fourth kind. Paleobiology. 2005;31:146–156. doi:10.1666/0094-8373(2005)031[0146:TEOCWN]2.0.CO;2
- Newman S.A. Developmental mechanisms: putting genes in their place. J. Biosci. 2002;27:97–104. doi:10.1007/BF02703765
- Nilsson D.E, Pelger S. A pessimistic estimate of the time required for an eye to evolve. Proc. R. Soc. B. 1994;256:53–58. doi:10.1098/rspb.1994.0048
- Odling-Smee F.J, Laland K.N, Feldman M.W. Niche construction. Am. Nat. 1996;147:641–648. doi:10.1086/285870
- Platzer U, Meinzer H.-P. Genetic networks in the early development of Caenorhabditis elegans. Int. Rev. Cytol. 2004;234:47–100. doi:10.1016/S0074-7696(04)34002-7
- Reinitz J, Sharp D.H. Mechanism of eve stripe formation. Mech. Dev. 1995;49:133–158. doi:10.1016/0925-4773(94)00310-J
- Richardson M.K, Chipman A.D. Developmental constraints in a comparative framework: a test case using variations in phalanx number during amniote evolution. J. Exp. Zool. B (Mol. Dev. Evol.) 2003;296:8–22. doi:10.1002/jez.b.13
- Salazar-Ciudad I, Garcia-Fernandez J, Solé R.V. Gene networks capable of pattern formation: from induction to reaction-diffusion. J. Theor. Biol. 2000;205:587–603. doi:10.1006/jtbi.2000.2092
- Saunders P.T, Ho M.W. On the increase in complexity in evolution. J. Theor. Biol. 1976;63:375–384. doi:10.1016/0022-5193(76)90040-0
- Saunders W.B, Work D.M, Nikolaeva S.V. Evolution of complexity in paleozoic ammonoid sutures. Science. 1999;286:760–763. doi:10.1126/science.286.5440.760
- Schlosser G, Wagner G.P, editors. Modularity in development and evolution. Chicago, IL: University of Chicago Press; 2004.
- Siegal M.L, Bergman A. Waddington's canalization revisited: developmental stability and evolution. Proc. Natl Acad. Sci. USA. 2002;99:10 528–10 532. doi:10.1073/pnas.102303999
- Simon H.A. The architecture of complexity. Proc. Am. Philos. Soc. 1962;106:467–482.
- Soyer O.S, Bonhoeffer S. Evolution of complexity in signaling pathways. Proc. Natl Acad. Sci. USA. 2006;103:16 337–16 342. doi:10.1073/pnas.0604449103
- Szathmáry E, Jordán F, Pál C. Molecular biology and evolution: can genes explain biological complexity? Science. 2001;292:1315–1316. doi:10.1126/science.1060852
- Valentine J.W, Collins A.G, Meyer C.P. Morphological complexity increase in metazoans. Paleobiology. 1994;20:131–142.
- Van Valen L. A new evolutionary law. Evol. Theory. 1973;1:1–30.
- Waxman D, Peck J.R. The anomalous effects of biased mutation. Genetics. 2003;164:1615–1626. doi:10.1093/genetics/164.4.1615
- Wolpert L. The evolution of development. Biol. J. Linn. Soc. 1990;39:109–124.
- Yampolsky L.Y, Stoltzfus A. Bias in the introduction of variation as an orienting factor in evolution. Evol. Dev. 2001;3:73–83. doi:10.1046/j.1525-142x.2001.003002073.x
Notice of correction
Figure 1a and references to figure 15 in the electronic supplementary material are now correct. 26 February 2007