ABSTRACT
Organismal evolution is a process of discovering better‐fitting phenotypes through trial and error across generations. This iterative process resembles learning, an analogy recognized since the 1950s. Recognizing this parallel suggests that evolutionary biology and machine learning can benefit each other; however, ample opportunities for research into their corresponding concepts remain. In this review, we aim to enhance predictive capabilities and theoretical developments in both fields by exploring their conceptual parallels through specific examples that have emerged from recent advances. Rather than relying on machine learning merely to predict specific cases, we advocate interpretable machine learning approaches for discovering common laws that predict evolutionary outcomes. This approach seeks to establish a theoretical framework that can transform evolutionary science into a field enriched with predictive theory while also inspiring new modeling and algorithmic strategies in machine learning.
Organismal evolution resembles learning processes in many aspects, but the potential of this analogy has not been fully realized. We discuss new opportunities inspired by this analogy to enhance predictive capabilities and theoretical developments in both research fields, especially using white‐box modeling to expand predictive theory in evolutionary biology.

1. Introduction
How does the process of organismal evolution resemble “learning” in a broad sense? Although not driven by intention, organisms engage in a form of trial and error to find adaptive phenotypes through evolution. By creating phenotypic variations (trials) through mutations and/or phenotypic plasticity, organisms not only produce abnormal and lethal phenotypes (errors) but also succeed in finding more adaptive phenotypes. Some of these variants may achieve an enhanced opportunity to survive and reproduce successfully, while others do not fare equally well [1]. As this process repeats across generations, the iterative cycle leads to the evolution of organisms with adaptive, well‐fitted traits. This iterative nature of adaptive evolution has long been noted to resemble learning processes, which similarly optimize through trial and error to find better solutions [2]. Although this conceptual parallel has intrigued scientists since the 1950s and has been discussed repeatedly for several decades [3, 4, 5, 6, 7, 8, 9, 10, 11] (including a recent intensive discussion on “How can evolution learn?” by Watson and Szathmáry [9]), its impact on either field has not reached its full potential. This is mainly due to the challenges of formalizing learning processes, which left the analogy with unclear correspondences between the two. However, this situation is rapidly changing with the astonishing advancements in machine learning technologies, which have started to allow scientists to replace abstract concepts of learning with more tangible and formalizable implementations [8, 12, 13].
In this review, we seek out and highlight possible correspondences between machine learning and evolution, referring to a variety of phenomena to find hints that foster new approaches in both fields. Examples include, to name a few, the similarity between overfitting in machine learning and evolutionary trade‐offs, the parallel between the dynamics of Generative Adversarial Networks (GANs) and competition between predator and prey, and the correspondence between historicity induced by the training dataset and phylogenetic inertia or evolutionary bias. Such analogous relationships not only reinforce the idea that machine learning and evolution may operate under similar principles but could also be leveraged to develop new approaches and understandings in evolutionary studies and new implications for the design of machine learning algorithms.
Recently, a growing number of studies have started to utilize machine learning to identify hidden patterns and rules within data, and even to predict evolutionary outcomes [14, 15]; however, the goal of this review is not simply to reaffirm the benefits of applying machine learning in evolutionary studies. These applications, for instance, have often relied on machine learning models that lack interpretability (often referred to as “black‐box” models). This lack of interpretability hinders scientists from fully understanding the underlying mechanisms and elucidating the common algorithms that drive these predictions. Consequently, each study tends to develop an independent model, making it difficult to extract common principles that can be integrated into the extended modern synthesis. On the machine learning side, just as organismal evolution inspired the development of the genetic algorithm (GA) [16], the potential impact of the analogy would be to bring about new algorithms for machine learning. For example, unlike machine learning, which is based on human concepts, the evolutionary process is a self‐organizing phenomenon driven by the enormous numbers of living organisms that have existed in the past and present. This could inspire novel algorithms, and could also help explain how and why some algorithms achieve high performance even though we lack a principled understanding of why they work (e.g., stochastic gradient descent [SGD]). Additionally, evolution has the potential to provide insights for algorithmic designs that enable machine learning parameters to escape from a locally optimal state and transition to other states for further learning, since organisms diversified by successively shifting from one ecological niche to another.
2. Examples of Analogies Between Machine Learning and Evolution
Here, we will present several specific examples of analogies, ranging from classic to cutting‐edge ones, to help concretely visualize the analogous relationship between machine learning and evolution.
2.1. Genetic Algorithms and Darwinian Evolution
Perhaps the most well‐known example inspired by the analogy is the development of GAs and other evolutionary algorithms (EAs) [17]. GAs learn to find an optimal solution to a problem by iteratively introducing mutations into a set of possible solutions, or population, and selecting the solutions that perform better [18]. These algorithms borrow the concept of fitness from evolutionary biology as an objective function to guide the optimization procedure (Figure 1). They mimic, in principle, how natural selection works during the learning process. In practice, these algorithms are especially effective in problems where the search space is too large for exhaustive searches, such as in exploring the parameters for merging different large language models into a new model with diverse capabilities [19]. Just as natural selection favors better‐fitted individuals, GAs evaluate candidate solutions and seek the best or quasi‐best performers, introducing mutations to explore and exploit the solution space and evolving increasingly refined solutions over generations. Various ideas from evolution, such as crossover (partial mixing of solutions), were later incorporated into refined GA methods to improve the optimization process [17]. GAs and EAs in turn allowed evolutionary biologists to study the evolutionary origin of complex features [20], modularity [21], and even the development of cancerous tumors in the body through modeling approaches [22].
FIGURE 1.

Genetic algorithm. The genetic algorithm was inspired by the idea of Darwinian selection; it solves a problem by iteratively introducing mutations and selecting the better solutions with higher fitness.
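The selection–mutation–crossover loop described above can be sketched in a few lines of code. The following is a minimal, self‐contained toy (the problem, population size, and mutation rate are our own illustrative choices, not taken from any cited implementation) that evolves a bit string toward the all‐ones optimum, the classic “OneMax” exercise:

```python
import random

random.seed(42)

GENOME_LEN = 20

def fitness(genome):
    # toy "OneMax" objective: count of correctly set bits (all ones is optimal)
    return sum(genome)

def mutate(genome, rate=0.05):
    # flip each bit independently with a small probability
    return [1 - g if random.random() < rate else g for g in genome]

def crossover(a, b):
    # single-point crossover: partial mixing of two parent solutions
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

# initialize a random population of candidate solutions
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(30)]

for generation in range(100):
    # selection: keep the better-fitted half of the population (elitism)
    population.sort(key=fitness, reverse=True)
    parents = population[:15]
    # reproduction: crossover plus mutation fills the next generation
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(15)]
    population = parents + children

best = max(population, key=fitness)
```

Elitist truncation selection is only one of many selection schemes; real GA libraries offer alternatives such as tournament or roulette‐wheel selection, but the evolutionary loop itself is the same.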
2.2. Overfitting and Evolutionary Trade‐Offs
Learning is not merely a process of memorization; it is about acquiring the ability to generalize over similar cases and make better estimations based on that ability. Likewise, machine learning models aim to generalize from the training data by estimating hidden patterns, enabling them to make predictions about new, unseen inputs. However, when a model becomes overly specialized to the training data, a situation called “overfitting” occurs, in which the model performs extremely well on the training data but fails to make accurate predictions when encountering new inputs [23]. In overfitting, instead of learning the general patterns, the model starts to pick up specific details such as noise and exceptions unique to the inputs (Figure 2A). As a result, when a model begins to overfit, it gradually loses the ability to accurately predict outcomes on new data because those specific details and noise do not apply to the unseen data. An extreme case of overfitting occurs when a model merely memorizes each input–output pair in the training data. For example, an overfit model might correctly translate every sentence from its training set into a foreign language but struggle to translate new phrases it has not encountered before. This happens because the overfit model has failed to learn the fundamental and generalizable rules of the language, such as grammar and syntax, from the training data. In more modern machine learning, especially deep learning, this understanding of overfitting has been developed further [24, 25].
FIGURE 2.

Analogy between overfitting and evolutionary trade‐offs. (A) Machine learning models are designed to find generalizable solutions when learning from the training data. In this schematic diagram, the model learns to separate two groups of data (green versus purple dots) with the yellow dashed line. The yellow dashed line separates the two groups fairly well when the model finds a good fit to the data (middle). However, the risk of overfitting increases when the input data are insufficient, biased, or flawed. An overfit model (right) loses the ability to generalize; for example, it may not be able to categorize new data properly (i.e., purple new data being misclassified as green). (B) Organisms that become too adapted to specific habitats may struggle to cope with previously unencountered environmental changes. Ground‐nesting birds, for example, are vulnerable to flooding (an unencountered environmental condition) despite the advantages of nesting at ground level.
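The memorization extreme of overfitting can be made concrete with a toy experiment (a hypothetical sketch; the data, the two models, and the noise level are invented for illustration). A model that simply stores every training pair is perfect on the training set, yet generalizes worse than a far simpler rule, because it has also memorized the noise:

```python
import random

random.seed(0)

def make_data(n):
    # one feature x in [-1, 1]; the true rule is "label 1 if x > 0",
    # but 10% of the labels are flipped by noise
    data = []
    for _ in range(n):
        x = random.uniform(-1.0, 1.0)
        label = (x > 0) if random.random() > 0.1 else (x <= 0)
        data.append((x, int(label)))
    return data

train, test = make_data(50), make_data(1000)

def memorizer(x):
    # "overfit" model: returns the stored label of the closest training point
    return min(train, key=lambda p: abs(p[0] - x))[1]

def threshold_model(x):
    # simple model capturing only the general pattern
    return int(x > 0)

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

train_acc_memo = accuracy(memorizer, train)    # perfect: it stored every answer
test_acc_memo = accuracy(memorizer, test)      # degraded: it also memorized the noise
test_acc_simple = accuracy(threshold_model, test)
```

The memorizer scores 100% on its training data but is beaten on unseen data by the one‐line threshold rule, mirroring the translation example above.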
In the biological context, similar phenomena can be observed in organismal evolution, where an organism develops specialized traits that help it survive in a specific environment (or niche) at the cost of a substantially lower ability to thrive in a broader range of environments [26, 27, 28]. One example is the vulnerability of some organisms to rare selective pressures or rare environmental conditions that they are not adapted to handle. For instance, some ground‐nesting birds, such as certain species of sparrows and shorebirds, construct their nests close to the ground. This behavior provides multiple advantages, including camouflage from predators [29]. However, occasional flooding can destroy their nests and pose a serious threat to their reproductive success. Despite the obvious risk, many of these birds continue to nest on the ground, as their nesting behavior has become fitted to ground environments, with limited plasticity to adjust to rare but catastrophic events like flooding [30, 31]. Even though nesting at higher elevations or in trees could prevent such losses, their persistent ground‐nesting behavior suggests that these birds are “overfit” to their ground habitats, making them less capable of adapting to rare environmental conditions. Consequently, their reproductive success is significantly reduced when they confront unexpected environmental changes arising from such trade‐offs.
Another example analogous to overfitting is asexual reproduction. Classic theories predicted that asexual organisms (which lack recombination) are quick to reach an adaptive state in a multipeaked fitness landscape [32, 33]. Although asexual reproduction allows populations to expand rapidly, they often become trapped at suboptimal adaptive peaks, making it difficult to reach the global optimum of the landscape [33, 34, 35]. These examples of evolutionary trade‐offs leave organisms vulnerable to new or rare environmental conditions because they become “overfitted” to the conditions they are adapted to, which resembles how an overfitted machine learning model performs well on training data but fails to generalize. One inspiration from this analogy is that understanding how organisms escape suboptimal states may suggest new approaches to avoiding overfitting in machine learning.
2.3. GANs and Competition
Another good example of the analogy is seen in a class of machine learning models known as GANs. GANs are unsupervised learning models consisting of two major components: a “generator” that creates data and a “discriminator” that evaluates it [36, 37, 38]. During the learning process, the generator strives to produce data that closely mimic the original training data, introducing variations that challenge the discriminator's ability to distinguish between authentic and generated data. This competitive cyclic design allows GANs to generate highly sophisticated output from the original input, such as a photorealistic image from a rough line drawing (Figure 3A) [39]. The adversarial pairing of the two models has been mathematically generalized, and many variants of GANs have been developed [40, 41, 42].
FIGURE 3.

Analogy between Generative Adversarial Networks (GANs) and competition in evolution. The competitive cyclic design in GANs resembles the competitive relationship between predators and prey in evolution. (A) GANs are composed of a generator [G] that creates new images and a discriminator [D] that evaluates whether a newly generated sample [S] differs from the real data [R]. During the training process, the generator and the discriminator strive to outcompete each other: the generator creates new samples that look as similar to the real data as possible, while the discriminator distinguishes the samples from the real data as accurately as possible. This iterative process results in models capable of creating photorealistic images, such as the handbag image (right). Illustrations of handbags are adapted from [39]. (B) In nature, predators strive to develop strategies that help them catch their prey, whereas prey strive to avoid predation by their enemies. As a result, some species evolve morphologies that look sufficiently similar to the environment that predators cannot distinguish them from the surroundings. For example, the leaf butterfly evolved a wing pattern that resembles leaves to evade predation. The illustration of the ground plan of Nymphalid wing patterns and the photo of Kallima inachus are from [44].
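To make the adversarial dynamic concrete, the following caricature (not an actual neural GAN; both “networks” here are single numbers, a simplification of our own) pits a one‐parameter generator against a one‐parameter threshold discriminator. The generator's output drifts toward the real data precisely because the discriminator keeps repositioning its decision boundary:

```python
import random

random.seed(1)

REAL_MEAN = 5.0   # the "real data" distribution the generator must learn to mimic

g = 0.0    # generator parameter: the centre of the samples it produces
t = 0.0    # discriminator parameter: a threshold separating "fake" from "real"
lr = 0.1

for _ in range(1000):
    real = random.gauss(REAL_MEAN, 0.1)
    fake = g + random.gauss(0.0, 0.1)          # the generator's sample
    # discriminator step: move the boundary between the real and fake samples
    t += lr * ((real + fake) / 2.0 - t)
    # generator step: move output toward the region currently judged "real"
    g += lr * (t - fake)
```

In a real GAN, both players are neural networks updated by gradients of a minimax loss, but the circular chase is the same: each player's improvement creates the pressure that improves the other.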
This “competition” between the generator and discriminator mirrors the evolutionary dynamics between antagonistically interacting species, such as predators and prey, where each species evolves new adaptations across generations to outcompete the other. Often, predators and prey coevolve rapidly in response to each other's new strategies, for example by countering each other's movement or foraging strategies [43]. The evolution of mimetic wing patterns in butterflies may exemplify this conceptual parallel. Butterflies, as prey, strive to avoid predation by their enemies. Although ancestral Nymphalid butterfly wings might have had a simpler ground plan, leaf butterflies (Kallima inachus) evolved wing patterns that closely resemble leaves [44]. The evolutionary pressure on these butterflies to look as similar to their surroundings as possible, so as to avoid being caught, mirrors how GANs gradually learn to create photorealistic images from sketches (Figure 3B). In turn, predators such as birds may gradually learn and evolve a better ability to distinguish mimetic butterflies, although we note that the exact mechanism remains poorly understood, given the relatively conservative evolution of the visual system in birds [45, 46, 47, 48]. Similarly, the evolutionary arms race between a parasite or pathogen and its host species involves comparable competitive or antagonistic coevolutionary dynamics. Hosts may rapidly evolve adaptations in their immune system to enhance pathogen detection and responses to parasitic infections. Conversely, parasites might evolve sophisticated immune evasion strategies, such as using surface and secretory proteins that mimic host molecules to avoid detection [49, 50]. Genes involved in predator‐prey competition are often among the fastest‐evolving genes in the genome, suggesting the effectiveness of this process during evolution [51, 52].
Such parallel dynamics highlight how the generator and discriminator in GANs, like antagonistic species, are engaged in a continuous struggle to outcompete each other; this competition drives the evolution of complex strategies and behaviors, and enhances GANs' ability to generate convincing data. Simultaneously, this showcases the potential for the analogy between learning and evolutionary processes to inspire new approaches, such as simulating possible evolutionary pathways from the Nymphalid ground plan to the leafy pattern. However, it has to be noted that the evolution of mimicry depends on a complex interaction between the environment and predator cognition, and it can lead to either increased or reduced polymorphism [53, 54, 55, 56, 57, 58]. This indicates that the analogy must be applied with caution, implementing machine learning models appropriate to the phenomenon being studied.
2.4. Historicity in Machine Learning and Evolution
Historicity is another phenomenon that can be observed in both learning and evolution. Just as we humans can be biased by what we have learned in the past, machine learning models can be heavily influenced by their training processes and datasets, leading to biases of many different kinds. For example, if a machine learning model is trained to recognize tumor images only from certain demographics, it may perform poorly when presented with images from demographics that are underrepresented in the training data (Figure 4A) [59]. Similarly, if a model is trained only on the faces of cats, it will be biased toward cats, and it is not surprising that it cannot recognize dogs. Such data‐derived historicity can thus act as a bias on the machine learning model and have a significant impact on its predictions [60].
FIGURE 4.

Both learning and evolution show historicity. (A) Machine learning models can be biased by training data. For example, models trained on datasets of tumor images from certain demographics may have low accuracy in predicting tumor malignancy in people from underrepresented demographics. That is, in early models, people with dark skin were found to have a higher probability of being diagnosed with malignant tumors because the training data may have been biased toward other demographics. Images are used to visualize the concept only: tumor images are from the “Melanoma Skin Cancer Dataset” on Kaggle; the image of a human face is from the “Human Faces” dataset on Kaggle as an example of a demographic underrepresented in the training dataset. (B) Vertebrate eye structure as an example of bias from evolutionary history, or historicity. Although the camera‐type eyes of both vertebrates and cephalopods are evolutionarily elaborate structures, a blind spot has persisted through evolution on the back of the vertebrate eye that is absent in cephalopod eyes. This difference is likely due to the different evolutionary histories experienced by the cephalopod and vertebrate lineages. Animal silhouettes are from PhyloPic.
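The demographic‐bias scenario in Figure 4A can be sketched with a deliberately simple one‐feature classifier (all distributions and shifts here are invented for illustration). A decision threshold fitted on one “demographic” fails on a second group whose feature values are shifted relative to the training data:

```python
import random

random.seed(2)

def sample(cls, shift):
    # one feature per individual; class 1 individuals score higher on average,
    # and "shift" models a demographic whose feature values are displaced
    return random.gauss(2.0 if cls else 0.0, 0.3) + shift

# training data covers only one demographic (shift = 0)
train = [(sample(c, shift=0.0), c) for c in (0, 1) for _ in range(50)]

# learned rule: a threshold halfway between the two class means seen in training
mean0 = sum(x for x, c in train if c == 0) / 50
mean1 = sum(x for x, c in train if c == 1) / 50
cut = (mean0 + mean1) / 2.0

def predict(x):
    return int(x > cut)

def accuracy(data):
    return sum(predict(x) == c for x, c in data) / len(data)

seen = [(sample(c, shift=0.0), c) for c in (0, 1) for _ in range(500)]
unseen = [(sample(c, shift=1.5), c) for c in (0, 1) for _ in range(500)]
acc_seen, acc_unseen = accuracy(seen), accuracy(unseen)
```

The model is near‐perfect on the demographic it was trained on, yet systematically mislabels class 0 individuals from the shifted group, because its learned threshold encodes the history of its training data.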
The phenomenon of historicity can also be observed in organismal evolution. For example, some traits persist despite the existence, at least hypothetically, of alternative and possibly more adaptive traits. A well‐known example is the presence of a blind spot in the vertebrate eye. Although both vertebrates and cephalopods evolved complex, camera‐type eyes independently, the different evolutionary paths of each group have led to vertebrates having a blind spot, which is absent in cephalopod eyes (Figure 4B) [61]. In vertebrates, the retina is inverted: the nerve fibers leaving the photoreceptor cells run along the light‐facing surface of the retina and must pass back through it to exit the eye, resulting in a blind spot where the optic nerve exits, as no photoreceptor cells are present in that region. By contrast, cephalopods develop eyes without a blind spot, because the axons of their photoreceptor cells project directly toward the brain from behind the retina [61, 62, 63, 64, 65]. The persistence of the blind spot in vertebrates likely reflects the influence of historicity in evolution, whereby the evolutionary paths taken by common ancestors shaped traits in their descendants, despite the possibility of a better solution. This is clearly similar to how learning can be biased by historical influences. Given the analogy between evolution and learning, it is inevitable that organisms show historicity to some extent, biasing which traits they retain and which traits they can evolve. Furthermore, the analogy between learning and evolution may inspire insights into how such biases arising from historicity occur.
2.5. Continual Learning and Exaptation
Another fundamental characteristic of learning is the ability to continuously learn and adapt to new information without forgetting previously acquired knowledge (Figure 5A). Although this was long considered a difficult task in machine learning, because models often rapidly forget previously learned tasks when trained on new ones (so‐called catastrophic forgetting) [66], emerging strategies have started to provide promising solutions to overcome this difficulty [67, 68]. These new approaches have started to better align machine learning models with real‐world learning behaviors and may provide insights into approaches that facilitate the retention of old skills alongside the learning of new tasks.
FIGURE 5.

Analogy between continual learning and exaptation. Continual learning and exaptation in evolution are analogous in that both build new learning upon what was learned previously. (A) Continual learning without forgetting previously learned knowledge is a general feature of human learning. Recently, it has been demonstrated that machine learning models can also achieve this through newly devised approaches. In continual learning, machine learning models gain the ability to predict outcomes in new tasks (highlighted by the ! marks) without losing predictive ability in previously learned tasks. (B) Evolution achieves the continuous acquisition of new adaptations. In some cases, the functions of previously evolved traits are retained. For example, it is now generally considered that feathers first evolved for thermal regulation. Although subsequent changes in Shh‐BMP2 signaling allowed the development and emergence of diverse feather types, including those that enable flight, feathers for thermal regulation (i.e., down feathers) can still be seen in extant birds. Illustrations are modified based on [70, 71, 73].
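Catastrophic forgetting, and the flavor of remedy used by replay‐based continual‐learning methods, can be caricatured with a one‐parameter model (an extreme toy of our own construction: the two “tasks” conflict over a single shared weight, so even replay can only reach a compromise, whereas real methods exploit larger models with spare capacity):

```python
def train(w, tasks, steps=200, lr=0.05):
    # gradient descent on squared error for a one-parameter model y = w * x,
    # cycling through the given (x, y) examples
    for step in range(steps):
        x, y = tasks[step % len(tasks)]
        w -= lr * 2.0 * (w * x - y) * x
    return w

def loss(w, x, y):
    return (w * x - y) ** 2

task_a = (1.0, 2.0)    # task A is solved by w ≈ 2
task_b = (1.0, -1.0)   # task B is solved by w ≈ -1 (it conflicts with A)

# sequential training: learn A, then train on B alone -> A is overwritten
w_after_a = train(0.0, [task_a])
w_seq = train(w_after_a, [task_b])

# naive "rehearsal": keep replaying A's example while learning B
w_replay = train(w_after_a, [task_a, task_b])
```

After sequential training the model solves task B perfectly but has entirely forgotten task A; interleaving A's data during training on B retains substantially more of A's performance.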
This kind of continual learning is clearly also a characteristic of biological evolution, where organisms continuously evolve new adaptations, often based on modifications of their existing traits. This process, known as exaptation, reutilizes old traits that were fitted to a different environment or function, with the emerging traits often acquiring additional or different functions from the original structures. For instance, the feathers of birds are thought to have originated from epidermal structures unrelated to flight in their reptilian ancestors [69, 70]. Their common ancestor likely had keratin‐based scales covering the skin for protection, like armor. Through evolution along the dinosaur and avian lineage, scales gave rise to structures that initially enhanced thermal regulation, eventually evolving into the down feathers seen in extant birds. Over time, additional modifications led to the diversification of feathers, including the asymmetrical flight feathers that enabled powered flight. Importantly, while feathers adapted for flight lost their original thermoregulatory function, many birds still retain down feathers for insulation alongside flight feathers. This example of birds evolving a variety of specialized feather types while preserving more ancient structures [70, 71, 72, 73, 74, 75] parallels the underlying characteristic of learning processes. Just as learning systems acquire new functions without necessarily erasing previous knowledge, avian evolution involved the addition of new feather types (such as flight feathers) while still maintaining thermoregulatory feathers. This exemplifies how evolutionary innovations can lead to the coexistence of novel and ancestral traits over time, similar to the underlying characteristic of continual learning processes.
Interestingly, as biological exaptations are considered to build new structures from existing traits, recent studies have found molecular evidence supporting this idea. For example, the evolution of feathers likely involved the repurposing of existing genes, or gene co‐option, such as the Shh‐Bmp2 signaling pathway, which enabled scales to evolve into highly branched structures, that is, feathers [71, 72]. Similarly, in the evolution of beetles, while the hindwings have been maintained for flight, beetles also evolved elytra, hardened external wing covers that protect their bodies and aid in thermal regulation. As in feather development and evolution, the acquisition of elytra was found to involve the co‐option of genes originally involved in exoskeleton formation [76, 77]. In both cases, these adaptations demonstrate how organisms can evolve novel traits from existing ones, sometimes by utilizing existing modules. Collectively, these processes enable organisms to adapt and evolve new functions by leveraging preexisting genetic and phenotypic structures.
2.6. Reinforcement Learning and Evolution to Maximize Fitness
Learning can also proceed in a way that maximizes reward for certain tasks. For example, when learning to play a video game, a person strives to achieve scores as high as possible by trying different strategies through trial and error. Setting a goal of maximizing reward has proven practical in the branch of machine learning known as reinforcement learning [23, 78], and these studies led to the powerful AlphaGo [79], which beat the world's top human players of the game of Go. Specifically, in reinforcement learning, an agent is trained to make better decisions by maximizing rewards through trial‐and‐error interactions with its environment [78]. For example, when training an agent to play a game like Mario World, at the beginning of the learning process the agent usually loses very quickly because it does not know how to avoid traps and enemies. However, after repeated rounds of trial and error, the agent starts to learn how to walk and run, and "realizes" that jumping to hit the bricks will earn coins and higher scores. The agent also gradually learns where to start jumping in order to kick off the turtle enemies, avoid traps, and finally win the game. This ability is achieved by learning which decision (walk, jump, stay still, etc.) at each time point of the game maximizes the chance of winning the game and obtaining the final reward.
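The trial‐and‐error game learning described above is captured in miniature by tabular Q‐learning, a standard reinforcement learning algorithm (the five‐state "corridor" environment and all parameter values below are our own toy choices, not from the cited systems):

```python
import random

random.seed(3)

# a 5-state corridor: start at state 0; reaching state 4 yields a reward of +1
N_STATES, GOAL = 5, 4
ACTIONS = (1, -1)   # step right or left; "right" listed first so ties break toward it

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration rate

for episode in range(200):
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: bootstrap from the best value of the next state
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# the learned policy: the preferred action in each state before the goal
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
```

After repeated episodes of initially aimless wandering, the agent's value table comes to encode "walk right" at every state, just as the game‐playing agent gradually discovers which decision at each moment maximizes its final reward.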
The analogy with biological evolution is evident: reinforcement learning not only mimics the behavior of organisms but is also driven by maximizing reward, much as evolutionary processes tend to drive organismal populations toward better‐fitting niches where higher fitness is attained. For example, the complex and beautiful patterns of the train feathers of peacocks evolved because the patterns were associated with higher fitness, with selection by females acting as feedback that drove the evolution of ever more beautiful patterns.
Interestingly, the aimless noise added in machine learning may bear a resemblance to mutations in biological evolution. During learning guided by SGD (a widely used method for optimizing loss functions), random noise is introduced into the process. Despite its apparent aimlessness, this noise often improves learning by helping models escape local minima or avoid overfitting to uninformative features [80, 81]. This is conceptually similar to mutations in evolution, which occur without specific direction but introduce variations that allow populations to explore new adaptive peaks. Despite their randomness, both noise and mutations help to navigate complex search spaces, whether in parameter optimization or on evolutionary fitness landscapes.
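The role of gradient noise in escaping local minima can be illustrated on a one‐dimensional loss with two valleys (the loss function, learning rate, and noise scale are invented for illustration, and the noise is injected Gaussian noise rather than the minibatch noise of true SGD):

```python
import random

random.seed(4)

def loss(x):
    # one-dimensional loss with a shallow local minimum near x = +1
    # and a deeper global minimum near x = -1
    return (x * x - 1.0) ** 2 + 0.3 * x

def grad(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(x, noise_std, steps=5000, lr=0.02):
    # gradient descent with Gaussian noise added to each gradient evaluation;
    # returns the lowest-loss point visited along the trajectory
    best_x = x
    for _ in range(steps):
        g = grad(x) + random.gauss(0.0, noise_std)
        x = min(max(x - lr * g, -2.0), 2.0)   # clamp to keep the toy stable
        if loss(x) < loss(best_x):
            best_x = x
    return best_x

trapped = descend(1.0, noise_std=0.0)    # noiseless: stays in the local minimum
escaped = descend(1.0, noise_std=10.0)   # noisy: can hop the barrier to the deeper basin
```

The noiseless run settles into the shallow valley it started in, while the same descent with strong noise can cross the barrier and find the deeper basin, paralleling how undirected mutations let populations move between adaptive peaks.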
In addition, when deciding the next action to achieve a higher reward, reinforcement learning agents often face the conflict between exploring new strategies and exploiting current knowledge to achieve the highest possible reward based on what they have already learned. The benefit of exploration in learning processes is similar to how an increased mutation rate may generate more variable phenotypes for selection, even though mutations may also produce harmful or lethal traits. These similarities suggest that learning and biological evolution may share many underlying principles, and insights from each side may inspire the other field toward a better understanding of how learning and evolutionary processes actually work.
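The exploration–exploitation conflict is classically illustrated by a two‐armed bandit with an epsilon‐greedy agent (a standard textbook setup; the payoff probabilities and epsilon value are our own illustrative choices). A purely exploitative agent can lock onto the first arm it tries and never discover the better one:

```python
import random

random.seed(5)

ARM_MEANS = (0.3, 0.7)   # arm 1 pays off better on average

def pull(arm):
    return 1.0 if random.random() < ARM_MEANS[arm] else 0.0

def run(epsilon, pulls=2000):
    counts = [0, 0]
    values = [0.0, 0.0]   # running average reward observed per arm
    total = 0.0
    for _ in range(pulls):
        if random.random() < epsilon:
            arm = random.randrange(2)                   # explore: try a random arm
        else:
            arm = 0 if values[0] >= values[1] else 1    # exploit the best-looking arm
        reward = pull(arm)
        total += reward
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return total

greedy_total = run(0.0)    # pure exploitation: never even tries the better arm
explore_total = run(0.1)   # 10% exploration: discovers that arm 1 pays more
```

The exploring agent earns far more in total despite "wasting" a fraction of its pulls, just as a lineage producing some variation can discover fitter phenotypes at the cost of occasional deleterious trials.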
2.7. Differences Between Learning and Evolutionary Processes
We have explored six concrete examples above to illustrate the analogous relationship between (machine) learning and evolution. Meanwhile, it is noteworthy that differences do exist between the two, especially because a wide variety of new techniques and artificial optimizations have been introduced to improve algorithmic efficiency [82, 83]; essentially, these have no direct counterparts in nature. For instance, in modern GAs, mutations and recombinations are designed to search the solution space efficiently and optimize performance as rapidly as possible. These artificially engineered mechanisms enhance computational efficiency but do not necessarily reflect the stochastic and potentially constrained nature of evolutionary processes in biological systems. Thus, caution is required when drawing correspondences between GAs and biological evolution. Similarly, in GANs, although the competitive dynamic between the generator and discriminator is conceptually similar to coevolutionary interactions between predators and prey, GANs differ from organismal relationships because techniques have been introduced specifically to enable accurate predictions and fast, efficient generation of photorealistic images [84, 85]. Another noteworthy point is that while machine learning actively seeks ways to lower the loss function or to achieve higher rewards (as in reinforcement learning), biological organisms have no specific aim in evolution. These considerations suggest that while machine learning and biological evolution share many conceptual similarities, caution is needed when conducting studies based on the analogy.
3. White‐Box and Explainable Machine Learning as Potential Tools to Expand Modern Evolutionary Theories
3.1. Applying the Analogy to Predict Evolution
Given the analogous relationship between learning and evolutionary processes discussed in Section 2, one exciting implication is that models trained on the evolution of biological organisms or their phenotypes may be able to predict evolutionary outcomes. Although several pioneering studies have devised new approaches inspired by the analogy to model evolution (such as utilizing formulations of the machine learning process to model biological evolution in Vanchurin et al. [12] and to model cancer evolution in Lahoz‐Beltra et al. [22]), how the analogy can be operationalized to predict various evolutionary phenomena remains largely unexplored. This idea echoes a long‐standing challenge in evolutionary biology: evolution has often been viewed as the study of biological histories, whereas predictive theories of contemporary and future evolution, especially those predicting how phenotypes evolve, remain scarce. Further research is needed to investigate which evolutionary phenomena are most effectively analyzed using such analogy‐based approaches.
In particular, it remains largely unexplored what kinds of common mechanisms can be found for predicting evolutionary outcomes. In this regard, even though discussions about the predictability of evolutionary outcomes have surfaced sporadically [86, 87, 88], only recently have studies begun to propose and support potential mechanisms governing phenotypic evolution. For example, phenotypes expressing many pleiotropic genes (those used in multiple developmental processes) [89] or exhibiting high developmental stability (embryos with fewer fluctuations in gene expression levels) [90, 91] correlate with evolutionary conservation. We anticipate that building on these clues will aid the development of predictive theories of evolution; in addition, the analogy itself may be utilized to uncover biological features useful for understanding the common mechanisms behind phenotypic evolution. In other words, given the analogy between learning and evolution, understanding how machine learning models predict phenotypic evolution may reveal common mechanisms for predicting biological evolution.
3.2. Limitations of Current “Black‐Box” Machine Learning Approaches
Importantly, although pioneering attempts at predicting phenotypic changes have started to emerge in recent years, progress is still largely confined to specific cases, such as artificial selection in agricultural crops [92, 93] and the evolution of antibiotic resistance in bacteria [94, 95]. A major limitation is that such models may only be applicable to specific cases, making it difficult to derive general laws or biological principles that could contribute to a broader predictive framework in evolutionary biology (e.g., to expand the extended modern synthesis). This is due to the difficulty of understanding how black‐box machine learning approaches make their predictions. For instance, machine learning was used to predict which variants of human influenza viruses are more likely to persist and dominate in the near future [14]; however, even though the prediction did not involve a highly complex model, elucidating why the predictions were accurate or trustworthy remained difficult (Figure 6A). Although high prediction accuracy has excited many scientific disciplines, the limited explainability of machine learning models in biological terms remains a major barrier to expanding such findings into a common mechanism or formulating a general theory of evolutionary prediction. Therefore, in leveraging the analogy between learning and evolution to predict biological evolution, an important consideration is not only whether a model can predict evolution but also whether we can understand its algorithmic logic and extract the features behind its predictions in a biologically interpretable way.
FIGURE 6.

White‐box modeling enables the interpretability of predictions. (A) Schematic illustrations of black‐box machine learning models. Even though high accuracy of predictions can often be achieved, it remains difficult to understand why the model makes certain predictions. Nodes that are not directly interpretable in the intermediate layers are colored in gray. (B) By contrast, white‐box modeling allows for higher interpretability of predictions because the architecture of a white‐box model is formed by nodes and edges that are biologically interpretable (e.g., representing known biological processes). By utilizing explainable machine learning techniques, it becomes possible to trace which nodes and edges (and therefore, which biological processes) are important for making predictions. As an example, important nodes are highlighted in darker green colors. Edges (relationships between biological processes) colored in magenta are more important for making predictions than the ones in blue. Nodes and edges marked by grayish dashes contribute less to the prediction output.
3.3. White‐Box Models Have the Potential to Extract Common Biological Features Driving Predictions
To overcome the limitations of black‐box models, recent studies have sought approaches that support interpretability in biological terms while simultaneously achieving high prediction accuracy. This has brought growing attention to a new, contrasting approach called “white‐box” modeling [96]. Unlike black‐box models, whose internal decision‐making processes are often obscured (Figure 6A), white‐box models are transparent and interpretable. They are often composed of interpretable nodes, and the connections between nodes can be imposed by explicit rules (Figure 6B) (such as the constraints imposed by physical laws in physics‐informed machine learning [97]). In other words, such a design of the model architecture allows its inner workings to be understood. Given the analogy between learning and evolution, we speculate that such a biologically interpretable machine learning approach may not only allow for predictions of evolutionary outcomes but also enable the identification of the key biological features that drive those predictions.
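A minimal sketch of this rule‐imposed architecture, assuming an invented gene‐to‐pathway membership mask (the genes, pathways, and weights are purely illustrative), shows how explicit biological rules can constrain which connections exist in a model:

```python
import numpy as np

# Hypothetical membership mask (rows: 4 genes, columns: 2 pathways).
# A connection is allowed only where the mask is 1, i.e., where a gene
# is a known member of that pathway -- the "explicit rule" of the model.
mask = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])

rng = np.random.default_rng(0)
weights = rng.normal(size=mask.shape)  # in practice, learned during training

def pathway_layer(gene_expression):
    """Forward pass: each pathway node aggregates only its member genes."""
    return gene_expression @ (weights * mask)

x = np.array([1.0, 2.0, 3.0, 4.0])    # a toy expression profile
activations = pathway_layer(x)        # one interpretable value per pathway
```

Because the mask forbids any edge from gene 1 to pathway 2, perturbing gene 1 leaves the activation of pathway 2 unchanged; each node therefore retains a direct biological reading.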
In practice, while still an emerging approach, such biologically interpretable or biologically informed neural networks have started to yield new discoveries in other fields of the biological sciences. These pioneering studies utilized biologically informed models composed of nodes representing known biological pathways or processes, such as gene functions, Gene Ontology terms, or KEGG pathways. After training, explainable machine learning techniques can identify the nodes and connections in the neural network (which correspond to specific biological functions) that are important for making the prediction (Figure 6B). Notably, in the biomedical sciences, the demand for such transparency is particularly high in clinical applications because doctors are often unable to tell why a machine learning system makes a certain prediction, making it impossible to evaluate whether the prediction is trustworthy (such as when judging whether a radiological image shows signs of cancer) [98, 99]. These urgent needs have driven the rapid development of white‐box and explainable machine learning approaches, and pioneering studies have shown that such approaches can identify previously unknown biological factors (such as genes) important for making biomedical predictions [100, 101, 102, 103, 104, 105].
For instance, Elmarakeby et al. developed a white‐box model for predicting prostate cancer from patients’ omics profiles and successfully identified new predictor genes and biological pathways for the disease [101]. In this study, a neural network comprising known biological pathways was trained to learn the characteristics of omics data from healthy individuals and cancer patients. By analyzing which components of the network drive the predictions, the authors discovered a previously unknown marker, MDM4, and its related biological pathway as novel clinical predictors of prostate cancer [101]. Specifically, the biological network information in the Reactome database was first transformed into a neural network architecture. During training, only the edges (connections in the neural network) representing known pathway relationships could be trained, which made it possible to reveal which nodes are important for the prediction [101]. Although this constraint seems to limit which network components can be trained, the model interestingly achieved higher prediction accuracy than dense, black‐box comparators, although whether such high accuracy is a general trend requires further investigation. Likewise, this strategy has been deployed in proteomics and epigenetics to identify key proteins associated with disease states [103] and candidate pathways that explain the epigenetic clocks [105]. Moreover, to account for incomplete biological knowledge in databases and network structures, Lotfollahi et al. devised an improved strategy using a flexible encoder‐decoder architecture that allows pathways to be trimmed and modified while the model learns which biological pathways may underlie the identity of specific cell types from their single‐cell expression profiles [102].
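To illustrate how such importance analysis can work, the following toy sketch (the network, weights, and expression values are invented and are not the actual P‐NET implementation) ablates each interpretable pathway node in turn and records the change in the output score:

```python
import numpy as np

# Toy white-box network: 4 genes -> 2 pathway nodes (masked) -> disease score.
# Weights are illustrative stand-ins for trained values; zero entries mark
# edges that the pathway database forbids, so they never exist in the model.
w1 = np.array([[0.5, 0.0],
               [0.3, 0.0],
               [0.0, 0.4],
               [0.2, 0.6]])
w2 = np.array([0.1, 2.0])   # the second pathway is weighted far more heavily

def predict(x):
    """ReLU pathway layer followed by a linear disease score."""
    return float(np.maximum(x @ w1, 0.0) @ w2)

def node_importance(x):
    """Ablation-style explanation: drop each pathway node, record the change."""
    base = predict(x)
    h = np.maximum(x @ w1, 0.0)
    return [abs(base - float(np.delete(h, i) @ np.delete(w2, i)))
            for i in range(len(w2))]

x = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical expression profile
scores = node_importance(x)         # the second pathway dominates the score
```

Because every node maps to a named pathway, the ranking produced by `node_importance` reads directly as a biological statement, which is precisely what a dense black‐box layer cannot offer.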
Therefore, it becomes increasingly tempting to utilize such explainable white‐box modeling and to devise innovative approaches inspired by the machine learning–evolution analogy to expand current approaches to evolutionary prediction. Importantly, this goes beyond mere prediction, as the inner workings of a prediction model may help us understand the mechanisms underlying major evolutionary trends. Without interpretability, even models trained on abundant and highly diversified data from millions of species would not allow us to expand current evolutionary theories, because the common tendencies or rules underlying evolution cannot be extracted from case‐specific black‐box models. That is, each black‐box model may predict evolutionary outcomes very well, yet apply only to its specific case, because no rules common to different cases can be extracted. By contrast, white‐box architectures may allow us to extract the common algorithmic logic driving the predictions of different models. Such common features or algorithmic logic may provide testable hypotheses and may be used to extend evolutionary theories. As this is an emerging approach, it is important for future studies to accumulate evidence on whether predictions inspired by the analogy can extract common biological features that correlate with, or potentially contribute to, predicting evolutionary outcomes. Although it is impossible to directly test predictions about events millions or billions of years from now, cases of rapid evolution or shorter timescales could be the starting points for such studies. By repeatedly attempting such predictions, biologists could begin to identify which common biological features enable reliable predictions.
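One way to operationalize this comparison, assuming hypothetical feature‐importance scores exported from two white‐box models trained on different evolutionary cases (all feature names and values below are invented), is to intersect each model's top‐ranked features to nominate candidate common rules:

```python
# Hypothetical per-model feature importances from two white-box models
# trained on different evolutionary cases (names and values are illustrative).
imp_case_a = {"pleiotropy": 0.9, "dev_stability": 0.7, "gc_content": 0.1}
imp_case_b = {"dev_stability": 0.8, "pleiotropy": 0.6, "genome_size": 0.2}

def shared_top_features(importances, k=2):
    """Features ranked in the top k of every model: candidate common rules."""
    top_sets = [set(sorted(imp, key=imp.get, reverse=True)[:k])
                for imp in importances]
    return set.intersection(*top_sets)

common = shared_top_features([imp_case_a, imp_case_b])
```

Features that survive this intersection across many independently trained models would be the testable hypotheses the text envisions; features important in only one case would remain case‐specific.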
3.4. Limitations of the Analogy for Predicting Evolution
Although the analogy between learning and evolution offers a promising framework for introducing predictive theories in evolutionary biology, it is important to note that additional efforts are required to effectively apply this analogy to evolutionary studies.
A primary challenge lies in operationalizing this analogy to extract common biological mechanisms for predicting evolutionary outcomes. Although the conceptual analogy has been discussed extensively for decades, and we have explored additional concrete examples comparing evolution with machine learning, how to practically implement white‐box approaches that can learn evolution from empirical data requires further study. In particular, technical challenges are often encountered when bridging the gap between machine learning's engineering demands and biological data. These include, for example, the limited quantity of biological experimental data for training, the scarcity of appropriate datasets for validating machine learning‐derived long‐term evolutionary predictions, and the difficulty of constructing hierarchical parameters with biological reliability. Pioneering studies in other biological fields that made the first attempts at practical white‐box machine learning approaches may provide hints for overcoming these challenges (e.g., Yang et al. Cell 2019 [100], Elmarakeby et al. Nature 2021 [101], and Lotfollahi et al. Nature Cell Biology 2023 [102], which uncovered previously unknown biological networks underlying antibiotic action, cancer, or cell‐type identity using white‐box machine learning).
4. Conclusions
Although the analogous relationship between learning and evolutionary processes has been discussed for decades, we have expanded it here with specific examples from recent advances in machine learning and evolutionary science. These include overfitting and evolutionary trade‐offs, the iterative competition shared by coevolution and GANs, historicity in both processes, continual learning and exaptation, and higher rewards in reinforcement learning versus fitness maximization in evolution. Importantly, while these specific examples are expected to inspire both fields, such as by providing new research strategies, the primary importance of the analogy does not lie in the mere application of machine learning. Especially in evolutionary biology, which has primarily focused on past events, the analogy is anticipated to introduce a theoretical framework for predicting evolutionary outcomes across various phenomena. This could be achieved particularly through interpretable machine learning approaches, such as white‐box machine learning, because interpretable machine learning allows scientists to identify common factors and mechanisms behind the predictions of different evolutionary phenomena, whereas the mere application of black‐box machine learning often leads to case‐specific predictions.
Author Contributions
Conceptualization: Jason Cheok Kuan Leong and Naoki Irie. Writing: Jason Cheok Kuan Leong and Naoki Irie wrote the initial draft with substantial input from Masaaki Imaizumi and Hideki Innan. All authors have agreed to the submitted and revised versions. Figures: Jason Cheok Kuan Leong and Naoki Irie with substantial input from Masaaki Imaizumi and Hideki Innan. Supervision: Naoki Irie. Funding acquisition: all authors.
Notes on the Use of AI in the Manuscript
ChatGPT (GPT‐4o) was utilized to improve the phrasing and word choices of the manuscript. Perplexity Pro and SciSpace were used to collect research articles related to the topics, in addition to a more manual literature search using Google Scholar, PubMed, and Web of Science. Literature collected by the AI tools was thoroughly read and investigated by all authors, and only those that properly support the arguments were cited. Despite the involvement of various AI tools in improving the manuscript, the entire conception of the manuscript and the arguments are originally from the authors. Parts of the illustrations are from online resources including PhyloPic, Wikimedia Commons, Kaggle, as well as from published research articles based on CC BY 4.0. To illustrate the abstract concepts presented, the sketch drawings in Figures 2, 3, and 5A were created with assistance from Adobe Photoshop and Firefly (generative AI) and were guided by images of relevant species from the Internet. The images were then further modified manually by the authors to ensure biological accuracy.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
The authors thank RCIES, Division of Evolutionary Studies of Complex Systems for supporting this study (grant Irie_2024‐2025). The authors also thank the Japan Society for the Promotion of Science (JSPS) KAKENHI, (Grant‐in‐Aid for Early‐Career Scientists) 24K18170, (Grant‐in‐Aid for Scientific Research) 23K27227, and 24K02904 for the support. The authors also thank Reviewer #2 for the insightful comments.
Funding: This study was funded by the RCIES, Division of Evolutionary Studies of Complex Systems (grant Irie_2024‐2025) and Japan Society for the Promotion of Science (JSPS) KAKENHI, (Grant‐in‐Aid for Early‐Career Scientists) 24K18170, (Grant‐in‐Aid for Scientific Research) 23K27227, and 24K02904.
Contributor Information
Jason Cheok Kuan Leong, Email: jason_leong@soken.ac.jp.
Naoki Irie, Email: irie_naoki@soken.ac.jp.
Data Availability Statement
Data sharing is not applicable to this article, as no new data were created or analyzed in this study.
References
- 1. Futuyma D. J. and Kirkpatrick M., Evolution (Sinauer Associates, 2023). [Google Scholar]
- 2. Pringle J. W. S., “On the Parallel Between Learning and Evolution,” Behaviour 3, no. 1 (1951): 174–214, 10.1163/156853951x00269. [DOI] [Google Scholar]
- 3. Blute M., “Learning Theory and the Evolutionary Analogy,” Cogprints (1979), http://cogprints.org/858/. [Google Scholar]
- 4. Goldberg D. E. and Holland J. H., “Genetic Algorithms and Machine Learning,” Machine Learning 3, no. 2–3 (1988): 95–99, 10.1023/a:1022602019183. [DOI] [Google Scholar]
- 5. Dennett D. C., Darwin's Dangerous Idea (Simon & Schuster, 1995). [Google Scholar]
- 6. Börgers T. and Sarin R., “Learning Through Reinforcement and Replicator Dynamics,” Journal of Economic Theory 77, no. 1 (1997): 1–14, 10.1006/jeth.1997.2319. [DOI] [Google Scholar]
- 7. Valiant L., Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World (Basic Books, Inc, 2013). [Google Scholar]
- 8. Watson R. A., Wagner G. P., Pavlicev M., Weinreich D. M., and Mills R., “The Evolution of Phenotypic Correlations and “Developmental Memory”,” Evolution 68, no. 4 (2014): 1124–1138, 10.1111/evo.12337. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Watson R. A. and Szathmáry E., “How Can Evolution Learn?” Trends in Ecology & Evolution 31, no. 2 (2016): 147–157, 10.1016/j.tree.2015.11.009. [DOI] [PubMed] [Google Scholar]
- 10. Kouvaris K., Clune J., Kounios L., Brede M., and Watson R. A., “How Evolution Learns to Generalise: Using the Principles of Learning Theory to Understand the Evolution of Developmental Organisation,” PLoS Computational Biology 13, no. 4 (2017): 1005358, 10.1371/journal.pcbi.1005358. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Szilágyi A., Szabó P., Santos M., and Szathmáry E., “Phenotypes to Remember: Evolutionary Developmental Memory Capacity and Robustness,” PLoS Computational Biology 16, no. 11 (2020): 1008425, 10.1371/journal.pcbi.1008425. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Vanchurin V., Wolf Y. I., Katsnelson M. I., and Koonin E. V., “Toward a Theory of Evolution as Multilevel Learning,” Proceedings of the National Academy of Sciences of the United States of America 119, no. 6 (2022): 2120037119, 10.1073/pnas.2120037119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Watson R. A., Mills R., Buckley C. L., et al., “Evolutionary Connectionism: Algorithmic Principles Underlying the Evolution of Biological Organisation in Evo‐Devo, Evo‐Eco and Evolutionary Transitions,” Evolutionary Biology 43, no. 4 (2016): 553–581, 10.1007/s11692-015-9358-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14. Hayati M., Biller P., and Colijn C., “Predicting the Short‐Term Success of Human Influenza Virus Variants With Machine Learning,” Proceedings of the Royal Society B: Biological Sciences 287, no. 1924 (2020): 20200319, 10.1098/rspb.2020.0319. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Konno N. and Iwasaki W., “Machine Learning Enables Prediction of Metabolic System Evolution in Bacteria,” Science Advances 9, no. 2 (2023): adc9130, 10.1126/sciadv.adc9130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Holland J. H., Adaptation in Natural and Artificial Systems (The University of Michigan Press, 1975). [Google Scholar]
- 17. Jong K. D., Fogel D. B., and Schwefel H.‐P., Handbook of Evolutionary Computation (CRC Press, 1997), 10.1201/9780367802486. [DOI] [Google Scholar]
- 18. Foster J. A., “Evolutionary Computation,” Nature Reviews Genetics 2, no. 6 (2001): 428–436, 10.1038/35076523. [DOI] [PubMed] [Google Scholar]
- 19. Akiba T., Shing M., Tang Y., Sun Q., and Ha D., “Evolutionary Optimization of Model Merging Recipes” Nature Machine Intelligence, 7 (2025): 195–204, 10.1038/s42256-024-00975-8. [DOI] [Google Scholar]
- 20. Lenski R. E., Ofria C., Pennock R. T., and Adami C., “The Evolutionary Origin of Complex Features,” Nature 423, no. 6936 (2003): 139–144, 10.1038/nature01568. [DOI] [PubMed] [Google Scholar]
- 21. Clune J., Mouret J.‐B., and Lipson H., “The Evolutionary Origins of Modularity,” Proceedings of the Royal Society B: Biological Sciences 280, no. 1755 (2013): 20122863, 10.1098/rspb.2012.2863. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Lahoz‐Beltra R. and Rodriguez R. J., “Modeling a Cancerous Tumor Development in a Virtual Patient Suffering From a Depressed State of Mind: Simulation of Somatic Evolution With a Customized Genetic Algorithm,” BioSystems 198 (2020): 104261, 10.1016/j.biosystems.2020.104261. [DOI] [PubMed] [Google Scholar]
- 23. Murphy K. P., Probabilistic Machine Learning: An Introduction (The MIT Press, 2022). [Google Scholar]
- 24. Belkin M., Hsu D., Ma S., and Mandal S., “Reconciling Modern Machine‐Learning Practice and the Classical Bias–Variance Trade‐Off,” Proceedings of the National Academy of Sciences of the United States of America 116, no. 32 (2019): 15849–15854, 10.1073/pnas.1903070116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Zhang C., Bengio S., Hardt M., Recht B., and Vinyals O., “Understanding Deep Learning (Still) Requires Rethinking Generalization,” Communications of the ACM 64, no. 3 (2021): 107–115, 10.1145/3446776. [DOI] [Google Scholar]
- 26. Futuyma D. J. and Moreno G., “The Evolution of Ecological Specialization,” Annual Review of Ecology and Systematics 19, no. 1 (1988): 207–233, 10.1146/annurev.es.19.110188.001231. [DOI] [Google Scholar]
- 27. Kassen R., “The Experimental Evolution of Specialists, Generalists, and the Maintenance of Diversity,” Journal of Evolutionary Biology 15, no. 2 (2002): 173–190, 10.1046/j.1420-9101.2002.00377.x. [DOI] [Google Scholar]
- 28. Vamosi J. C., Armbruster W. S., and Renner S. S., “Evolutionary Ecology of Specialization: Insights From Phylogenetic Analysis,” Proceedings of the Royal Society B: Biological Sciences 281, no. 1795 (2014): 20142004, 10.1098/rspb.2014.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Greenberg R. G., Elphick C., Nordby J. C., et al., “Flooding and Predation: Trade‐Offs in the Nesting Ecology of Tidal‐Marsh Sparrows,” Studies in Avian Biology 32 (2006): 96–109. [Google Scholar]
- 30. Bailey L. D., Ens B. J., Both C., Heg D., Oosterbeek K., and Pol M. v. d., “No Phenotypic Plasticity in Nest‐Site Selection in Response to Extreme Flooding Events,” Philosophical Transactions of the Royal Society B: Biological Sciences 372, no. 1723 (2017): 20160139, 10.1098/rstb.2016.0139. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Maslo B., Schlacher T. A., Weston M. A., et al., “Regional Drivers of Clutch Loss Reveal Important Trade‐Offs for Beach‐Nesting Birds,” PeerJ 4 (2016): 2460, 10.7717/peerj.2460. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. Crow J. F. and Kimura M., “Evolution in Sexual and Asexual Populations,” The American Naturalist (1965): 439–450, https://www.jstor.org/stable/2459132. [Google Scholar]
- 33. Eshel I. and Feldman M. W., “On the Evolutionary Effect of Recombination,” Theoretical Population Biology 1, no. 1 (1970): 88–100, 10.1016/0040-5809(70)90043-2. [DOI] [PubMed] [Google Scholar]
- 34. Watson R. A., Weinreich D. M., and Wakeley J., “Genome Structure and the Benefit of Sex,” Evolution 65, no. 2 (2011): 523–536, 10.1111/j.1558-5646.2010.01144.x. [DOI] [PubMed] [Google Scholar]
- 35. Crona K., “Recombination and Peak Jumping,” PLoS ONE 13, no. 3 (2018): 0193123, 10.1371/journal.pone.0193123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Goodfellow I. J., Pouget‐Abadie J., Mirza M., et al., “Generative Adversarial Networks,” in Advances in Neural Information Processing Systems 27 (2014): 2672–2680, 10.1145/3422622. [DOI] [Google Scholar]
- 37. Goodfellow I., Pouget‐Abadie J., Mirza M., et al., “Generative Adversarial Networks,” Communications of the ACM 63, no. 11 (2020): 139–144, 10.1145/3422622. [DOI] [Google Scholar]
- 38. Gui J., Sun Z., Wen Y., Tao D., Ye J., and Gui J., “A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications,” IEEE Transactions on Knowledge and Data Engineering 35, no. 4 (2023): 3313–3332, 10.1109/tkde.2021.3130191. [DOI] [Google Scholar]
- 39. Isola P., Zhu J.‐Y., Zhou T., and Efros A. A., “Image‐to‐Image Translation With Conditional Adversarial Networks,” in IEEE Conference on Computer Vision and Pattern Recognition , (2017), 5967–5976, 10.1109/CVPR.2017.632. [DOI]
- 40. Arjovsky M., Chintala S., and Bottou L., “Wasserstein Generative Adversarial Networks,” in Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research , Eds. 1 Precup D. & Teh Y. W., Vol. 70 (2017), 214–223, https://proceedings.mlr.press/v70/arjovsky17a.html.
- 41. Nowozin S., Cseke B., and Tomioka R., “f‐GAN: Training Generative Neural Samplers Using Variational Divergence Minimization,” Advances in Neural Information Processing Systems 29 (2016): 271–279. [Google Scholar]
- 42. Li C.‐L., Chang W.‐C., Cheng Y., Yang Y., and Póczos B., “MMD GAN: Towards Deeper Understanding of Moment Matching Network,” Advances in Neural Information Processing Systems 30, (2017): 2203–2213. [Google Scholar]
- 43. Netz C., Hildenbrandt H., and Weissing F. J., “Complex Eco‐Evolutionary Dynamics Induced by the Coevolution of Predator–Prey Movement Strategies,” Evolutionary Ecology 36, no. 1 (2022): 1–17, 10.1007/s10682-021-10140-x. [DOI] [Google Scholar]
- 44. Suzuki T. K., Tomita S., and Sezutsu H., “Gradual and Contingent Evolutionary Emergence of Leaf Mimicry in Butterfly Wing Patterns,” BMC Evolutionary Biology 14, no. 1 (2014): 229, 10.1186/s12862-014-0229-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. Lind O., Henze M. J., Kelber A., and Osorio D., “Coevolution of Coloration and Colour Vision?” Philosophical Transactions of the Royal Society B: Biological Sciences 372 (2017): 20160338, 10.1098/rstb.2016.0338. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Dell'Aglio D. D., Troscianko J., McMillan W. O., Stevens M., and Jiggins C. D., “The Appearance of Mimetic Heliconius Butterflies to Predators and Conspecifics,” Evolution; International Journal of Organic Evolution 72, no. 10 (2018): 2156–2166, 10.1111/evo.13583. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47. Krishnan A., Singh A., and Tamma K., “Visual Signal Evolution Along Complementary Color Axes in Four Bird Lineages,” Biology Open 9, no. 9 (2020): bio052316, 10.1242/bio.052316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Huang G., Zhang Y., Zhang W., and Wei F., “Genetic Mechanisms of Animal Camouflage: An Interdisciplinary Perspective,” Trends in Genetics 40, no. 7 (2024): 613–620, 10.1016/j.tig.2024.03.009. [DOI] [PubMed] [Google Scholar]
- 49. Elde N. C. and Malik H. S., “The Evolutionary Conundrum of Pathogen Mimicry,” Nature Reviews Microbiology 7, no. 11 (2009): 787–797, 10.1038/nrmicro2222. [DOI] [PubMed] [Google Scholar]
- 50. Schmid‐Hempel P., Evolutionary Parasitology: The Integrated Study of Infections, Immunology, Ecology, and Genetics (Oxford University Press, 2021). [Google Scholar]
- 51. Paterson S., Vogwill T., Buckling A., et al., “Antagonistic Coevolution Accelerates Molecular Evolution,” Nature 464, no. 7286 (2010): 275–278, 10.1038/nature08798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Nair R. R., Vasse M., Wielgoss S., Sun L., Yu Y.‐T. N., and Velicer G. J., “Bacterial Predator‐Prey Coevolution Accelerates Genome Evolution and Selects on Virulence‐Associated Prey Defences,” Nature Communications 10, no. 1 (2019): 4301, 10.1038/s41467-019-12140-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53. Goldschmidt R. B., “Mimetic Polymorphism, a Controversial Chapter of Darwinism,” The Quarterly Review of Biology 20, no. 2 (1945): 147–164, 10.1086/394785. [DOI] [Google Scholar]
- 54. Joron M. and Mallet J. L. B., “Diversity in Mimicry: Paradox or Paradigm?” Trends in Ecology & Evolution 13, no. 11 (1998): 461–466, 10.1016/s0169-5347(98)01483-9. [DOI] [PubMed] [Google Scholar]
- 55. Bond A. B. and Kamil A. C., “Visual Predators Select for Crypticity and Polymorphism in Virtual Prey,” Nature 415, no. 6872 (2002): 609–613, 10.1038/415609a. [DOI] [PubMed] [Google Scholar]
- 56. Bond A. B. and Kamil A. C., “Spatial Heterogeneity, Predator Cognition, and the Evolution of Color Polymorphism in Virtual Prey,” in Proceedings of the National Academy of Sciences of the United States of America , Vol. 103, no. 9 (2006), 3214–3219, 10.1073/pnas.0509963103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57. Wang S., Teng D., Li X., et al., “The Evolution and Diversification of Oakleaf Butterflies,” Cell 185, no. 17 (2022): 3138–3152, 10.1016/j.cell.2022.06.042. [DOI] [PubMed] [Google Scholar]
- 58. Llaurens V., Poul Y. L., Puissant A., Blandin P., and Debat V., “Convergence in Sympatry: Evolution of Blue‐Banded Wing Pattern in Morpho Butterflies,” Journal of Evolutionary Biology 34, no. 2 (2021): 284–295, 10.1111/jeb.13726. [DOI] [PubMed] [Google Scholar]
- 59. Adamson A. S. and Smith A., “Machine Learning and Health Care Disparities in Dermatology,” JAMA Dermatology 154, no. 11 (2018): 1247, 10.1001/jamadermatol.2018.2348. [DOI] [PubMed] [Google Scholar]
- 60. Sap M., Card D., Gabriel S., Choi Y., and Smith N. A., “The Risk of Racial Bias in Hate Speech Detection,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , (2019), 1668–1678, 10.18653/v1/p19-1163. [DOI]
- 61. Nilsson D.‐E., Johnsen S., and Warrant E., “Cephalopod Versus Vertebrate Eyes,” Current Biology 33, no. 20 (2023): R1100–R1105, 10.1016/j.cub.2023.07.049. [DOI] [PubMed] [Google Scholar]
- 62. Katz P. S. and Lyons D. C., “Cephalopod Vision: How to Build a Better Eye,” Current Biology 33, no. 1 (2023): R27–R30, 10.1016/j.cub.2022.11.054. [DOI] [PubMed] [Google Scholar]
- 63. Harris W. A., “Pax‐6: Where to Be Conserved Is Not Conservative,” Proceedings of the National Academy of Sciences of the United States of America, 94, no. 6 (1997): 2098–2100, 10.1073/pnas.94.6.2098. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Gehring W. J. and Ikeo K., “Pax 6: Mastering Eye Morphogenesis and Eye Evolution,” Trends in Genetics 15, no. 9 (1999): 371–377, 10.1016/s0168-9525(99)01776-x. [DOI] [PubMed] [Google Scholar]
- 65. Lamb T. D., Collin S. P., and Pugh E. N., “Evolution of the Vertebrate Eye: Opsins, Photoreceptors, Retina and Eye Cup,” Nature Reviews Neuroscience 8, no. 12 (2007): 960–976, 10.1038/nrn2283. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66. Goodfellow I. J., Mirza M., Xiao D., Courville A., and Bengio Y., “An Empirical Investigation of Catastrophic Forgetting in Gradient‐Based Neural Networks,” in International Conference on Learning Representations , (2014), 10.48550/arxiv.1312.6211. [DOI]
- 67. Kirkpatrick J., Pascanu R., Rabinowitz N., et al., “Overcoming Catastrophic Forgetting in Neural Networks,” Proceedings of the National Academy of Sciences of the United States of America 114, no. 13 (2017): 3521–3526, 10.1073/pnas.1611835114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68. Ven G. M. v. d., Tuytelaars T., and Tolias A. S., “Three Types of Incremental Learning,” Nature Machine Intelligence 4, no. 12 (2022): 1185–1197, 10.1038/s42256-022-00568-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69. Regal P. J., “The Evolutionary Origin of Feathers,” The Quarterly Review of Biology 50, no. 1 (1975): 35–66, 10.1086/408299. [DOI] [PubMed] [Google Scholar]
- 70. Prum R. O., “Development and Evolutionary Origin of Feathers,” Journal of Experimental Zoology 285, no. 4 (1999): 291–306. [DOI] [PubMed] [Google Scholar]
- 71. Yu M., Wu P., Widelitz R. B., and Chuong C.‐M., “The Morphogenesis of Feathers,” Nature 420, no. 6913 (2002): 308–312, 10.1038/nature01196. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72. Harris M. P., Fallon J. F., and Prum R. O., “Shh‐Bmp2 Signaling Module and the Evolutionary Origin and Diversification of Feathers,” Journal of Experimental Zoology 294, no. 2 (2002): 160–176, 10.1002/jez.10157. [DOI] [PubMed] [Google Scholar]
- 73. Ksepka D. T., “Feathered Dinosaurs,” Current Biology 30, no. 22 (2020): R1347–R1353, 10.1016/j.cub.2020.10.007. [DOI] [PubMed] [Google Scholar]
- 74. Xing L., Cockx P., McKellar R. C., and O'Connor J., “Ornamental Feathers in Cretaceous Burmese Amber: Resolving the Enigma of Rachis‐Dominated Feather Structure,” Journal of Palaeogeography 7, no. 1 (2018): 13, 10.1186/s42501-018-0014-2. [DOI] [Google Scholar]
- 75. Perrichot V., Marion L., Nraudeau D., Vullo R., and Tafforeau P., “The Early Evolution of Feathers: Fossil Evidence From Cretaceous Amber of France,” Proceedings of the Royal Society B: Biological Sciences 275, no. 1639 (2008): 1197–1202, 10.1098/rspb.2008.0003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76. Tomoyasu Y., Arakane Y., Kramer K. J., and Denell R. E., “Repeated Co‐options of Exoskeleton Formation During Wing‐to‐Elytron Evolution in Beetles,” Current Biology 19, no. 24 (2009): 2057–2065, 10.1016/j.cub.2009.11.014. [DOI] [PubMed] [Google Scholar]
- 77. Goczał J. and Beutel R. G., “Beetle Elytra: Evolution, Modifications and Biological Functions,” Biology Letters 19, no. 3 (2023): 20220559, 10.1098/rsbl.2022.0559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78. Sutton R. S. and Barto A. G., Reinforcement Learning: An Introduction (The MIT Press, 2018). [Google Scholar]
- 79. Silver D., Huang A., Maddison C. J., et al., “Mastering the Game of Go With Deep Neural Networks and Tree Search,” Nature 529, no. 7587 (2016): 484–489, 10.1038/nature16961. [DOI] [PubMed] [Google Scholar]
- 80. Xie Z., Sato I., and Sugiyama M., “A Diffusion Theory for Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima,” in The International Conference on Learning Representations (2021).
- 81. Zhu Z., Wu J., Yu B., Wu L., and Ma J., “The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping From Sharp Minima and Regularization Effects,” in Proceedings of the 36th International Conference on Machine Learning , Vol. 97, (2019): 7654–7663, https://proceedings.mlr.press/v97/zhu19e.html. [Google Scholar]
- 82. Goldberg D. E., Genetic Algorithms in Search, Optimization, and Machine Learning (Addison‐Wesley Longman Publishing Co., Inc, 1989). [Google Scholar]
- 83. Schoenauer M., “Evolutionary algorithms,” in Handbook of Evolutionary Thinking in the Sciences Eds. Heams T., Huneman P., Lecointre G., and Silberstein M., (Springer, 2014): 621–635, 10.1007/978-94-017-9014-7_28. [DOI] [Google Scholar]
- 84. Karras T., Laine S., Aittala M., Hellsten J., Lehtinen J., and Aila T., “Analyzing and Improving the Image Quality of StyleGAN,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020), 8107–8116, 10.1109/cvpr42600.2020.00813. [DOI]
- 85. Zhong J., Liu X., and Hsieh C.‐J., “Improving the Speed and Quality of GAN by Adversarial Training,” arXiv (2020), 10.48550/arxiv.2008.03364. [DOI] [Google Scholar]
- 86. Lässig M., Mustonen V., and Walczak A. M., “Predicting Evolution,” Nature Ecology & Evolution 1, no. 3 (2017): 77–79, 10.1038/s41559-017-0077. [DOI] [PubMed] [Google Scholar]
- 87. Nosil P., Flaxman S. M., Feder J. L., and Gompert Z., “Increasing Our Ability to Predict Contemporary Evolution,” Nature Communications 11, no. 1 (2020): 5592, 10.1038/s41467-020-19437-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88. Wortel M. T., Agashe D., Bailey S. F., et al., “Towards Evolutionary Predictions: Current Promises and Challenges,” Evolutionary Applications 16, no. 1 (2023): 3–21, 10.1111/eva.13513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89. Hu H., Uesaka M., and Guo S., et al., “Constrained Vertebrate Evolution by Pleiotropic Genes,” Nature Ecology & Evolution 1, no. 11 (2017): 1722–1730, 10.1038/s41559-017-0318-0. [DOI] [PubMed] [Google Scholar]
- 90. Uchida Y., Shigenobu S., Takeda H., Furusawa C., and Irie N., “Potential Contribution of Intrinsic Developmental Stability Toward Body Plan Conservation,” BMC Biology 20, no. 1 (2022): 82, 10.1186/s12915-022-01276-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91. Uchida Y., Takeda H., Furusawa C., and Irie N., “Stability in Gene Expression and Body‐Plan Development Leads to Evolutionary Conservation,” EvoDevo 14, no. 1 (2023): 4, 10.1186/s13227-023-00208-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92. Tong H. and Nikoloski Z., “Machine Learning Approaches for Crop Improvement: Leveraging Phenotypic and Genotypic Big Data,” Journal of Plant Physiology 257 (2021): 153354, 10.1016/j.jplph.2020.153354. [DOI] [PubMed] [Google Scholar]
- 93. Xu Y., Zhang X., Li H., et al., “Smart Breeding Driven by Big Data, Artificial Intelligence, and Integrated Genomic‐Enviromic Prediction,” Molecular Plant 15, no. 11 (2022): 1664–1695, 10.1016/j.molp.2022.09.001. [DOI] [PubMed] [Google Scholar]
- 94. Palmer A. C. and Kishony R., “Understanding, Predicting and Manipulating the Genotypic Evolution of Antibiotic Resistance,” Nature Reviews Genetics 14, no. 4 (2013): 243–248, 10.1038/nrg3351. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95. Sommer M. O. A., Munck C., Toft‐Kehler R. V., and Andersson D. I., “Prediction of Antibiotic Resistance: Time for a New Preclinical Paradigm?” Nature Reviews Microbiology 15, no. 11 (2017): 689–696, 10.1038/nrmicro.2017.75. [DOI] [PubMed] [Google Scholar]
- 96. Novakovsky G., Dexter N., Libbrecht M. W., Wasserman W. W., and Mostafavi S., “Obtaining Genetics Insights From Deep Learning via Explainable Artificial Intelligence,” Nature Reviews Genetics 24, no. 2 (2023): 125–137, 10.1038/s41576-022-00532-2. [DOI] [PubMed] [Google Scholar]
- 97. Karniadakis G. E., Kevrekidis I. G., Lu L., Perdikaris P., Wang S., and Yang L., “Physics‐Informed Machine Learning,” Nature Reviews Physics 3, no. 6 (2021): 422–440, 10.1038/s42254-021-00314-5. [DOI] [Google Scholar]
- 98. Ghassemi M., Oakden‐Rayner L., and Beam A. L., “The False Hope of Current Approaches to Explainable Artificial Intelligence in Health Care,” The Lancet Digital Health 3, no. 11 (2021): e745–e750, 10.1016/s2589-7500(21)00208-9. [DOI] [PubMed] [Google Scholar]
- 99. Vokinger K. N., Feuerriegel S., and Kesselheim A. S., “Mitigating Bias in Machine Learning for Medicine,” Communications Medicine 1, no. 1 (2021): 25, 10.1038/s43856-021-00028-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100. Yang J. H., Wright S. N., Hamblin M., et al., “A White‐Box Machine Learning Approach for Revealing Antibiotic Mechanisms of Action,” Cell 177, no. 6 (2019): 1649–1661, 10.1016/j.cell.2019.04.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101. Elmarakeby H. A., Hwang J., Arafeh R., et al., “Biologically Informed Deep Neural Network for Prostate Cancer Discovery,” Nature 598, no. 7880 (2021): 348–352, 10.1038/s41586-021-03922-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102. Lotfollahi M., Rybakov S., and Hrovatin K., et al., “Biologically Informed Deep Learning to Query Gene Programs in Single‐Cell Atlases,” Nature Cell Biology 25 (2023): 337–350, 10.1038/s41556-022-01072-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103. Hartman E., Scott A. M., Karlsson C., et al., “Interpreting Biologically Informed Neural Networks for Enhanced Proteomic Biomarker Discovery and Pathway Analysis,” Nature Communications 14, no. 1 (2023): 5359, 10.1038/s41467-023-41146-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104. Doncevic D. and Herrmann C., “Biologically Informed Variational Autoencoders Allow Predictive Modeling of Genetic and Drug‐Induced Perturbations,” Bioinformatics 39, no. 6 (2023): btad387, 10.1093/bioinformatics/btad387. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105. Prosz A., Pipek O., Börcsök J., et al., “Biologically Informed Deep Learning for Explainable Epigenetic Clocks,” Scientific Reports 14, no. 1 (2024): 1306, 10.1038/s41598-023-50495-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
Data sharing is not applicable to this article, as no new data were created or analyzed in this study.
