Abstract
Organismal phenotypes result largely from inherited developmental programs, usually executed during embryonic and juvenile life stages. These programs are not blank slates onto which natural selection can draw arbitrary forms. Rather, the mechanisms of development play an integral role in shaping phenotypic diversity and help determine the evolutionary trajectories of species. Modern evolutionary biology must, therefore, account for these mechanisms both in theory and in practice. The gene regulatory network (GRN) concept represents a potent tool for achieving this goal, and its utility has grown in tandem with advances in “omic” technologies and experimental techniques. However, while the GRN concept is widely utilized, it is often less clear what practical implications it has for conducting research in evolutionary developmental biology. In this Perspective, we attempt to provide clarity by discussing how experiments and projects can be designed in light of the GRN concept. We first map familiar biological notions onto the more abstract components of GRN models. We then review how diverse functional genomic approaches can be directed toward the goal of constructing such models and discuss current methods for functionally testing evolutionary hypotheses that arise from them. Finally, we show how the major steps of GRN model construction and experimental validation suggest generalizable workflows that can serve as a scaffold for project design. Taken together, the practical implications that we draw from the GRN concept provide a set of guideposts for studies aiming at unraveling the molecular basis of phenotypic diversity.
Keywords: cis-regulatory element, CRISPR, evolutionary developmental biology, gene expression, gene regulatory networks
1 |. INTRODUCTION
The modern synthesis revolutionized our understanding of the ultimate causes of phenotypic evolution. Its pioneers accomplished this by producing a rigorous framework, population genetics, that facilitated the modeling of evolutionary forces (e.g., natural selection and genetic drift), and their effects on the genetic composition of populations over time (Pigliucci & Müller, 2010). However, while population genetics represents one of the crowning achievements of evolutionary biology and one of its most active areas of research, it does not provide a complete explanation of evolution, as it does not address more proximate causes. Chief among these proximate causes are the inherited developmental programs that transform single-celled embryos into adult organisms (Futuyma, 2017). By relating genotypic changes with their phenotypic consequences, developmental programs play a central role in defining the boundaries within which selection can drive phenotypic change, thereby playing a profound role in shaping the evolutionary trajectories of species. Moreover, evolutionary phenomena, such as epistasis, canalization, plasticity, and polyphenism arise directly from developmental processes (Fusco & Minelli, 2010; Phillips, 2008). Therefore, accounting for the molecular mechanisms that control development is essential to the construction of a more comprehensive and explanatory evolutionary theory. Accomplishing this task is a fundamental goal of evolutionary developmental biology (or EvoDevo).
Despite the explosion of EvoDevo research over recent decades, this rapidly-maturing field lacks a single framework equivalent to population genetics in terms of its capacity to model relevant evolutionary processes or in its universal acceptance among researchers. To an extent, this absence may be a structural feature of EvoDevo, which considers phenomena at multiple levels of biological organization. However, where practical, establishing shared concepts, language, and analytical tools can promote progress within a field, thus making their pursuit worthwhile.
One concept which has shown considerable value in EvoDevo is that of gene regulatory networks (GRNs). The central contention of the GRN concept is that the molecular structure of developmental programs is, as the name suggests, fundamentally network-like (Levine & Davidson, 2005). More specifically, biological processes are built from genetically-encoded components, linked by a reticulated and often recursive web of regulatory interactions. The GRN concept has gained widespread application in EvoDevo, both as an informal guiding principle for interpreting biological data and more formally through attempts to produce explicit network models of developmental programs (Cao et al., 2019; Hughes et al., 2021; Liu, Ramos-Womack, et al., 2019; Martik et al., 2019; McMillan et al., 2020; Paolino et al., 2020; Sadier et al., 2020; Verd et al., 2019). Its utility has grown in parallel with the explosion of “omic” techniques, many of which are applicable to phenotypically diverse, nontraditional model organisms. Thus, as EvoDevo marches into the future, there is considerable value in taking stock of the experimental toolkit available to researchers studying the evolution of development, and how it can be applied in light of the GRN concept.
Rather than focusing on the broad array of particular questions in EvoDevo that can be clarified through the lens of GRNs, as has been eloquently done elsewhere (Erwin, 2020; Hatleberg & Hinman, 2021; McQueen & Rebeiz, 2020), our goal in this Perspective piece is to demonstrate how biologists setting out with a fundamental question and suitable model system in mind can structure their research projects and constituent experiments around the process of GRN model construction. To this end, we will begin by laying out familiar ideas about development and evolution, and how they can be related to GRNs. Next, we will give a broad overview of how both established and emerging “omic” techniques can be used to construct GRN models for developmental processes of interest, with the ultimate goal of generating hypotheses about their function in vivo and their evolution through comparative studies. We will then discuss advances in functional experiments that empower the testing of hypotheses arising from such studies. Finally, using a hypothetical example, we show how the process of constructing GRN models and testing the hypotheses that arise from them inherently suggests generalizable workflows that can serve as a guiding principle for EvoDevo research projects.
2 |. UNDERSTANDING DEVELOPMENT AND EVOLUTION IN TERMS OF GRNs
Developmental programs comprise the sets of stepwise changes in cells, tissues, and organs that ultimately produce phenotypes. The developmental program of a given phenotype is generally controlled by one or more GRNs. GRNs are composed of genes and their expressed products (proteins and noncoding RNAs), whose molecular blueprints and regulatory circuitry are encoded in the genome. The activities of these gene products form signaling pathways that govern cellular differentiation, tissue growth and organogenesis. As implied by the word “pathway”, signaling pathways have a degree of directionality; a flow of regulatory information (Azeloglu & Iyengar, 2015). Phenotypic differences between species in turn arise from fixed changes in the genome that alter the flow of regulatory information through signaling pathways or by the redeployment of such pathways in new developmental contexts (Glassford et al., 2015; McQueen & Rebeiz, 2020). Given these attributes, two key ways that developmental programs can evolve are (i) through changes in gene expression and (ii) changes in gene interactions. When attempting to model GRNs, their constituent genes can be represented by “nodes” in a network graph and the molecular interactions between genes (often mediated by noncoding regulatory regions) can be represented by network connections, or “edges” (Figure 1) (Barabási & Oltvai, 2004). We can, thus, readily map familiar notions about the evolution of GRNs to the structure of a GRN model, with evolution being represented by changes in (i) their node composition and (ii) their connectivity through edges.
FIGURE 1.

Gene regulatory network (GRN) models. In a GRN model, network nodes represent genes or gene products, and edges represent regulatory interactions between them. These interactions can drive upregulation (pointed arrow heads) or downregulation (flat arrow heads). In a biological network, a single edge typically encodes two types of regulatory information: inputs to and outputs from an underlying cis-regulatory element. Inputs come in the form of transcription factor binding from an upstream “source” node, while outputs are represented by cis-regulatory activity on a downstream “target” node (i.e., enhancer looping and its impact on transcription)
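The mapping from biological notions to network components can be made concrete with a small data structure. The following is a minimal sketch, not a published method: it represents a GRN as a set of signed, directed edges and summarizes how two networks differ in node composition and connectivity. The gene names (`tfA`, `geneB`, and so on) and the helper `compare_grns` are hypothetical placeholders introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str  # upstream regulator (e.g., a transcription factor)
    target: str  # downstream regulated gene
    sign: int    # +1 = upregulation, -1 = downregulation


def nodes(edges):
    """Return the set of genes that appear in any edge."""
    return {e.source for e in edges} | {e.target for e in edges}


def compare_grns(ancestral, derived):
    """Summarize differences in node composition and in connectivity."""
    anc, der = set(ancestral), set(derived)
    return {
        "nodes_gained": nodes(der) - nodes(anc),
        "nodes_lost": nodes(anc) - nodes(der),
        "edges_gained": der - anc,
        "edges_lost": anc - der,
    }


# Hypothetical toy networks; these are placeholders, not real data.
ancestral = {Edge("tfA", "geneB", +1), Edge("geneB", "geneC", -1)}
derived = {Edge("tfA", "geneB", +1), Edge("tfA", "geneD", +1)}

diff = compare_grns(ancestral, derived)
print(diff)
```

In this toy comparison, the derived network loses one node and one edge and gains one of each, which corresponds to the two modes of GRN evolution described above.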
EvoDevo researchers typically start their projects with one or a few general questions in mind and a model system well-suited to address these questions (usually a species possessing a phenotype of interest). A logical starting point for such research is to dissect the underlying developmental program for a phenotype of interest and attempt to infer the biological interactions of its constituent genes and regulatory elements. This information provides a starting point for building specific hypotheses about gene function that can be tested through experiments. Purposive adoption of the GRN framework as a guiding principle in such work can have implications for experimental design. Therefore, we will next consider how “omic” techniques can be applied to the construction of GRN models for developmental programs of interest.
3 |. PREDICTING GRNs FROM TRANSCRIPTOMES
Transcriptomics is perhaps the most fundamental way to gain insights into the structure of a developmental GRN and to produce initial models. As the cost of high-throughput sequencing has come down, RNA sequencing (RNA-Seq) has become the workhorse approach to study gene expression across whole transcriptomes, supplanting lower-throughput and often less consistent methods like quantitative polymerase chain reaction. In their most common iteration, RNA-Seq experiments are designed to facilitate differential gene expression (DGE) analyses: pairwise comparisons of normalized transcript abundance between two sample replicate groups. Working on the assumption that significant differences in gene expression between groups correspond to biologically-relevant differences in functional output, DGE can be an effective way to flag genes involved in the distinct developmental program of a phenotype of interest. For example, differential and spatially-patterned expression of Alx3 has been linked with the development of periodic dark and light dorsal stripes in the African striped mouse (Rhabdomys pumilio) (Mallarino et al., 2016). Because Alx3 is a homeobox transcription factor, encoded by a member of a gene family with extensive roles in developmental patterning processes, its identification through DGE was able to point researchers toward differentially expressed pigmentation genes that are candidates to be Alx3’s regulatory targets. Differentially expressed genes like Alx3, thus, can act as a starting point for establishing a dorsal stripe patterning GRN model. Within such a model, Alx3 might represent a source node, as regulatory information flows from a transcription factor to its downstream regulatory targets (which may themselves be viewed as target nodes). Alternatively, differential expression of Alx3 between light and dark stripes is itself the product of some upstream regulatory information, so Alx3 may also be viewed as a target node.
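The logic of a pairwise DGE comparison can be illustrated with a deliberately simplified sketch. The example below normalizes raw counts to counts per million, computes a log2 fold change between two replicate groups, and reports a Welch t statistic on log-transformed values. This is an assumption-laden stand-in: dedicated DGE packages fit count-based statistical models and correct for multiple testing, none of which is reproduced here. The sample names, gene names, and counts are invented for illustration.

```python
import math
from statistics import mean, variance


def cpm(sample):
    """Scale one sample's raw counts to counts per million."""
    total = sum(sample.values())
    return {gene: count * 1e6 / total for gene, count in sample.items()}


def log2_fold_change(group_a, group_b, gene, pseudocount=1.0):
    """log2 ratio of mean CPM in group_b relative to group_a for one gene."""
    a = mean(cpm(s)[gene] for s in group_a)
    b = mean(cpm(s)[gene] for s in group_b)
    return math.log2((b + pseudocount) / (a + pseudocount))


def welch_t(group_a, group_b, gene):
    """Welch t statistic on log2(CPM + 1); no p-value is computed here."""
    xa = [math.log2(cpm(s)[gene] + 1) for s in group_a]
    xb = [math.log2(cpm(s)[gene] + 1) for s in group_b]
    se = math.sqrt(variance(xa) / len(xa) + variance(xb) / len(xb))
    return (mean(xb) - mean(xa)) / se


# Hypothetical replicate groups (three samples each); counts are made up.
light = [{"geneX": 90, "other": 910}, {"geneX": 100, "other": 900},
         {"geneX": 110, "other": 890}]
dark = [{"geneX": 380, "other": 620}, {"geneX": 400, "other": 600},
        {"geneX": 420, "other": 580}]

lfc = log2_fold_change(light, dark, "geneX")
t_stat = welch_t(light, dark, "geneX")
print(round(lfc, 2), round(t_stat, 1))
```

A gene flagged in this way would be a candidate node for an initial GRN model, pending proper statistical analysis.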
Part of the attraction of DGE is its ability to easily accommodate a variety of experimental designs, including comparisons between different tissues within an organism, between two or more groups of a single tissue exposed to different experimental treatments, or within a tissue across developmental time (Figure 2a–c). The ubiquity of DGE analyses in the field has led to the rapid proliferation and maturation of computational tools to conduct these analyses, with popular and well-tested packages such as DESeq2 and edgeR (Love et al., 2014; Robinson et al., 2010). While the utility of DGE analyses is considerable, they possess certain limitations. For example, DGE can be challenging when performing gene expression comparisons across species (Dunn et al., 2018). Differences in genome assembly and annotation quality can influence quantitation of gene expression between species, and the biological impacts of gene expression levels themselves can change on diverging genomic backgrounds, potentially confounding direct comparisons of gene expression levels, even if they have been normalized. Moreover, developmental programs often involve continuous and/or nonlinear changes in gene expression over time. DGE analyses, which require replicate groups at discrete timepoints, may struggle to capture such gene expression patterns. This complication can be especially pronounced for those studying development in nontraditional model organisms, where limited samples and complex or irregular breeding cycles can make collection of true biological replicate groups challenging or unfeasible. In such cases, comparisons of gene expression dynamics (i.e., the pattern of relative gene expression within each species over time) may be a viable approach.
For example, segmented regression models can be used to extract biologically meaningful expression changes from time course data that lack discrete replicate groups and to capture nonlinear changes in gene expression (Figure 2d) (Bacher et al., 2018).
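To illustrate the idea behind segmented regression, the sketch below fits two independent straight lines to a time course and chooses the split that minimizes the total squared error. It is a simplified illustration under stated assumptions and not the cited method: it allows only one breakpoint, does not force the two segments to join, and performs no significance testing. The time points and expression values are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, squared error)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_two_segment_fit(times, expr, min_points=3):
    """Try every admissible split and keep the one with the lowest error."""
    best = None
    for k in range(min_points, len(times) - min_points + 1):
        left = fit_line(times[:k], expr[:k])
        right = fit_line(times[k:], expr[k:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {"break_after": times[k - 1], "left_slope": left[0],
                    "right_slope": right[0], "sse": total}
    return best


# Hypothetical time course: expression rises until t = 4, then declines.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
expr = [0, 2, 4, 6, 8, 9, 8, 7, 6, 5]

best = best_two_segment_fit(times, expr)
print(best)
```

The recovered breakpoint and the sign change in slope summarize a nonlinear expression trend without requiring replicate groups at each time point.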
FIGURE 2.

Transcriptomics and GRN inference. DGE analyses can be used to compare expression (a) between tissues within an organism, such as the light and dark dorsal stripes of the African striped mouse; (b) within a tissue under different treatment regimes; and (c) across timepoints during a developmental process. Alternatives to DGE that better accommodate longitudinal data without discrete replicate groups include (d) segmented regression or (e) correlation network analyses, such as WGCNA, which directly predicts networks from transcriptomes. Here, a heatmap shows module eigengene activity per sample, with samples sorted by developmental stage. Thus, the highlighted network module represents a group of highly correlated genes whose expression is highest early in development. DGE, differential gene expression; GRN, gene regulatory network; vs, versus; WGCNA, weighted gene coexpression network analysis
Weighted gene coexpression network analysis (WGCNA) (Langfelder & Horvath, 2008) provides another alternative to DGE that can accommodate complex gene expression patterns and which can be used to directly construct preliminary GRN models. WGCNA leverages expression correlations between genes across samples to construct network modules: groups of genes with correlated expression levels (Figure 2e). Many of these correlations are driven by molecular interactions or by participation in a common biological process. Thus, WGCNA modules derived from appropriate samples can act as a reasonable starting point for defining both nodes and edges in a GRN model. WGCNA does not require discrete replicate groups, making it flexible and amenable to challenging model systems. Moreover, while module construction is performed agnostic of sample information, such data can subsequently be compared to modules to find associations with their activity across samples. To this end, WGCNA provides a metric, module eigengene expression, that summarizes the relative expression level of genes within a given module across samples. This metric can be correlated with data such as age, weight, morphometrics, or treatment regime to find associations between GRN activity and key events during development. The module eigengene can also be used to study the internal structure of predicted network modules, by comparing it against the expression levels of its constituent genes. Strong correlations between gene expression and the module eigengene level are an indication of high intramodular connectivity, which is often considered to indicate “hub gene” status and to suggest important roles in the underlying signaling pathways captured by the inferred network module (Liu, Gu, et al., 2019).
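The core quantities used in correlation network analysis can be sketched in a few lines. The example below is an illustration written from first principles, not the WGCNA package itself: it computes a soft-thresholded adjacency (absolute Pearson correlation raised to a power), a module eigengene (the first principal component of standardized expression, obtained here by power iteration), and each gene's correlation with that eigengene. Module membership is assumed to be known in advance, whereas the real workflow infers it by clustering. The power `beta=6`, the gene names, and the expression values are all assumptions made for illustration.

```python
import math


def zscore(values):
    """Center and scale a vector using the sample standard deviation."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / sd for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def adjacency(expr, beta=6):
    """Soft-thresholded adjacency |r|^beta for every gene pair."""
    genes = sorted(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g in genes for h in genes if g < h}


def module_eigengene(expr, module, iterations=200):
    """First principal component across samples, found by power iteration."""
    z = [zscore(expr[g]) for g in module]
    n = len(z[0])
    v = z[0][:]  # start from one gene's standardized profile
    for _ in range(iterations):
        scores = [sum(row[i] * v[i] for i in range(n)) for row in z]
        v = [sum(scores[k] * z[k][i] for k in range(len(z))) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # Orient the eigengene so that it tracks average module expression.
    avg = [sum(row[i] for row in z) / len(z) for i in range(n)]
    if sum(a * b for a, b in zip(avg, v)) < 0:
        v = [-x for x in v]
    return v


# Hypothetical expression across six samples sorted by developmental stage.
expr = {
    "g1": [9, 8, 7, 3, 2, 1],
    "g2": [10, 9, 7, 4, 2, 1],
    "g3": [8, 8, 6, 3, 3, 1],
    "g4": [1, 5, 2, 6, 1, 5],
}
adj = adjacency(expr)
module = ["g1", "g2", "g3"]  # assumed module membership
eigengene = module_eigengene(expr, module)
kme = {g: pearson(expr[g], eigengene) for g in module}
print(adj[("g1", "g2")], adj[("g1", "g4")], kme)
```

In this toy data set, the eigengene is highest in the earliest samples, and genes with the strongest eigengene correlation would be the candidates for hub status.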
In traditional bulk RNA-Seq experiments, samples may include complex assemblages of tissues and cell types. As such, correlation modules may not only reflect intracellular interactions (Casasa et al., 2020). Signaling pathways are often distributed across cell types, particularly those governing coordinated tissue growth. Thus, this aspect of correlation network analysis can be a powerful asset in identifying developmentally-relevant pathways. However, it can also be a double-edged sword, by flagging potentially spurious or incidental correlations as significant. As such, correlation modules should be supplemented with experiments that establish the localization of putative GRN components (e.g., via spatial localization of genes comprising network nodes) and validity of predicted interactions (e.g., via functional experiments; discussed below).
The widespread adoption of single-cell RNA-Seq (scRNA-Seq) provides another avenue for network prediction. Standard applications for scRNA-Seq include identification of distinct cell types in complex organs (Abdelaal et al., 2019), DGE analyses between cell clusters or across tissues (T. Wang et al., 2019), and developmental trajectories that plot changes in gene expression and cell identity against pseudotime (Chen et al., 2019). scRNA-Seq is also amenable to correlation network analyses (Cha & Lee, 2020) as well as alternative approaches for GRN inference (Nguyen et al., 2020).
Careful consideration of experimental design and intended analysis approaches can make transcriptomics an effective way to begin constructing GRN models for developmental programs of interest. However, transcriptomic analyses consider only gene expression levels and rarely provide significant information about the nature of predicted interactions. Moreover, the evolution of development is often driven by changes in noncoding, regulatory DNA (Wittkopp & Kalay, 2012). Thus, additional layers of functional and comparative genomic data can be used to further refine GRN models and to home in on the causative nucleotide changes behind phenotypic adaptation.
4 |. REFINING GRN EDGES
While network edges derived from gene expression correlations can reflect gene interactions, they are technically predicted in a manner that is agnostic of underlying molecular mechanisms. Thus, they may capture a direct interaction such as the binding of a transcription factor to the promoter of a target gene, they may reflect several degrees of separation within a signaling pathway via multiple intermediary genes, or they can represent spurious correlations driven by the incidental co-occurrence of unrelated biological processes. Indeed, all expressed genes may be connected by edges in a correlation network, so long as the strength of their expression correlations is above some threshold level. Furthermore, while network hub genes may be strong candidates for key regulatory roles, expression correlations can rarely establish the directionality of interactions on their own. More biologically realistic and informative GRN models can be produced by pruning away or masking indirect or spurious edges and by establishing the directionality of those which remain. This process of edge refinement can help to reveal the flow of regulatory information through the underlying signaling pathways that comprise developmental GRNs and generate specific hypotheses about their biological function that are testable through experiments.
Because gene regulation is controlled to a significant degree by noncoding cis-regulatory elements (CREs), these elements are key targets for functional assays aimed at network edge refinement. CREs are composed of clustered transcription factor binding sites and drive expression of often-distant target genes through three-dimensional looping interactions (Schoenfelder & Fraser, 2019). CREs, thus, serve as the mediators for the gene interactions that constitute network edges, linking source nodes with target nodes. Multiple layers of epigenomic information influence CRE activity and, importantly, mark these elements in ways that assist in their characterization.
To use cis-regulatory information to refine GRN models, the first and most basic goal is CRE discovery. The second and more complex goal is to identify the numerous, specific interactions mediated by identified CREs. These data can then be used to prune or mask indirect network edges (foregrounding direct interactions) and to define edge attributes that represent relevant details of edge interactions such as their direction and magnitude. In practice, CRE discovery and characterization may represent separate steps, or may be outcomes of a single experiment. Therefore, we will next briefly summarize how different families of functional and comparative genomic techniques can be used to identify CREs, infer their regulatory inputs and outputs, and how this information can be used to refine edges in network models.
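The pruning and orientation step can be expressed as a simple filter. In the sketch below, undirected correlation edges are retained only when independent cis-regulatory evidence supports a direct interaction, and each retained edge is oriented from regulator to target while keeping its correlation weight as a stand-in for magnitude. The function name `refine_edges`, the evidence format (a set of regulator and target pairs, assumed to come from assays such as those summarized in the following sections), and all gene names are hypothetical.

```python
def refine_edges(correlation_edges, binding_evidence):
    """Keep correlation edges backed by regulator->target evidence, oriented."""
    refined = []
    for (a, b), weight in correlation_edges.items():
        # Test both possible orientations of the undirected edge.
        for source, target in ((a, b), (b, a)):
            if (source, target) in binding_evidence:
                refined.append(
                    {"source": source, "target": target, "weight": weight}
                )
    return refined


# Hypothetical undirected edges from a correlation network.
correlation_edges = {
    ("tfA", "tfB"): 0.9,
    ("geneC", "tfB"): 0.8,
    ("tfA", "geneC"): 0.7,  # plausibly indirect: tfA -> tfB -> geneC
}
# Hypothetical evidence: a binding site for the first gene's product lies in
# a CRE assigned to the second gene.
binding_evidence = {("tfA", "tfB"), ("tfB", "geneC")}

refined = refine_edges(correlation_edges, binding_evidence)
print(refined)
```

Here the edge between `tfA` and `geneC` is dropped because no direct evidence supports it, leaving a two-step directed path.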
4.1 |. Identifying CREs
CREs are often difficult to identify from genomic sequence alone, owing in part to the absence of a universal CRE grammar comparable to the codon-amino acid grammar of protein-coding genes. Moreover, while many enhancers produce transcripts called eRNAs (Tippens et al., 2018), recent studies have identified eRNA transcripts for only 40,000–65,000 of the more than 400,000 candidate human enhancers identified by ENCODE (Sartorelli & Lauberth, 2020). Thus, while transcription may be functionally important for many CREs, it is not a universal or diagnostic feature of such elements. In practice, CREs are most often discovered in the vast sea of noncoding DNA based on three main characteristics: (i) distinct chemical modification of surrounding histones, (ii) elevated chromatin accessibility compared to the genomic background, and (iii) enhanced evolutionary sequence conservation.
Much as RNA-Seq has become the workhorse technique in transcriptomic studies, chromatin immunoprecipitation sequencing (ChIP-Seq) has been widely used for CRE discovery and characterization (Douglas et al., 2014; Johnson et al., 2007; Roy et al., 2010; Visel et al., 2009). This technique works by cross-linking proteins bound to DNA, fragmenting the genome, and immunoprecipitating proteins of interest using a specific antibody. Bound DNA is then released, sequenced, mapped back to the genome, and quantified. Regions enriched for mapped reads (representing DNA bound by the protein of interest) are then called as “peaks” and filtered using various computational tools (Gaspar, 2018; Q. Li et al., 2011). Most often, characteristic chemical modifications of surrounding histones are used to identify different classes of CREs or define their state. Perhaps the most widely used is acetylation of histone 3 lysine 27 (H3K27ac for short), a mark for both active promoters and enhancers. Other marks include H3K27me3, which indicates repressed promoters, and H3K4me1, which marks enhancers independent of their activity. More recent methods such as cleavage under targets & release using nuclease (CUT&RUN) and cleavage under targets & tagmentation provide improved signal-to-noise ratios and require lower sample inputs than traditional ChIP-Seq, greatly improving CRE discovery (Kaya-Okur et al., 2019; Skene & Henikoff, 2017). While applicable to diverse model species and relatively specific, histone marks can be deposited in broad regions around active elements, and do not always perfectly correspond with a given functional class of noncoding elements. Thus, combination with other approaches can improve CRE inferences from chemical modifications.
In order for CREs to function, they must be accessible to DNA binding proteins such as transcription factors. This can in turn be used to distinguish CREs from the genomic background. In most chromatin accessibility assays, the genome is first digested by a nuclease (DNase-Seq and micrococcal nuclease-Seq) or a transposase (ATAC-Seq) (Buenrostro et al., 2015; Schones et al., 2008; Song & Crawford, 2010). Open chromatin is readily cut by such enzymes. However, closed chromatin sterically hinders digestion, leading to biases in cutting frequency, and, in turn, library composition that can be analyzed in a manner similar to ChIP-Seq. Variants of ATAC-Seq in particular have gained widespread use due to their low sample input requirements, compatibility with snap-frozen samples and their simple library preparation (which combines the addition of adaptor sequences with fragmentation via transposase-mediated “tagmentation”) (Corces et al., 2017). However, open chromatin is also a feature of actively expressed genes and numerous other classes of genetic elements. Thus, care must be taken to filter coding regions appropriately and to further validate cis-regulatory activity of peaks of interest using approaches such as reporter assays.
Comparative genomics can provide an alternative approach to CRE discovery that is often cost-effective and applicable to species for which collecting tissue for epigenomics and RNA-Seq is challenging. DNA elements with critical functions experience elevated purifying selection, and therefore show a higher degree of sequence conservation compared to the genomic background (Siepel et al., 2005). By performing whole-genome alignments and deriving an appropriate model of background evolutionary rates, candidate CREs can be detected based on their multispecies conservation using one of several available software suites (Cooper et al., 2005; Davydov et al., 2010; Pollard et al., 2010; Siepel et al., 2005). CREs emerge from these analyses as conserved noncoding elements (CNEs). Additionally, precomputed conservation tracks for many combinations of species are available through public resources such as UCSC Genome Browser and can be lifted over to a species of interest through whole genome alignment (Raney et al., 2014).
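The general logic of conservation-based discovery can be illustrated with a toy example. The sketch below scores each column of a multiple alignment by the fraction of species matching the reference, merges sliding windows with high mean identity into candidate elements, and discards candidates that overlap annotated exons. This is a crude proxy introduced for illustration: the software suites cited above model background evolutionary rates on a phylogeny, which simple percent identity does not. The window size, threshold, sequences, and exon coordinates are all assumptions.

```python
def column_identity(alignment):
    """Fraction of non-reference rows matching the reference at each column."""
    ref, others = alignment[0], alignment[1:]
    return [
        sum(seq[i] == ref[i] and ref[i] != "-" for seq in others) / len(others)
        for i in range(len(ref))
    ]


def conserved_windows(alignment, window=5, threshold=0.9):
    """Merged half-open (start, end) intervals with high mean identity."""
    ident = column_identity(alignment)
    hits = []
    for start in range(len(ident) - window + 1):
        if sum(ident[start:start + window]) / window >= threshold:
            if hits and start <= hits[-1][1]:
                hits[-1] = (hits[-1][0], start + window)  # extend last hit
            else:
                hits.append((start, start + window))
    return hits


def drop_coding(intervals, exons):
    """Remove candidate intervals that overlap any annotated exon."""
    return [(s, e) for s, e in intervals
            if not any(s < xe and xs < e for xs, xe in exons)]


# Hypothetical alignment: columns 5-11 are identical across all three species.
alignment = [
    "AAAAA" + "CGTACGT" + "TTTTTTTT",  # reference species
    "CCCCC" + "CGTACGT" + "GGGGGGGG",
    "GGGGG" + "CGTACGT" + "CCCCCCCC",
]
candidates = conserved_windows(alignment)
noncoding = drop_coding(candidates, exons=[(0, 3)])
print(candidates, noncoding)
```

The surviving intervals correspond to candidate CNEs, which would still require functional validation.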
A key strength of sequence conservation analyses is that once a reference genome is available, they can often be performed without the expenditure of additional biological samples (a considerable concern for many nontraditional model species) or additional costs beyond computational time. Indeed, such approaches can be applied to historical or ancient genomes, for which histone marks and chromatin accessibility data are likely unfeasible to generate (Feigin et al., 2019). CRE discovery based on sequence conservation does have significant limitations, however. Much like chromatin accessibility, sequence conservation is a common feature of many functional genetic elements and not specific to CREs (Pollard et al., 2010). As in the case of ATAC-Seq, it is critical to filter out as many non-CRE elements as possible and to functionally validate CRE activity wherever possible. Conversely, not all CREs are evolutionarily conserved and there is considerable evolutionary turnover both of binding sites across active enhancers and of enhancers themselves (Domené et al., 2013). Approaches that rely on evolutionary conservation are, thus, of little utility for studying novel or species-specific CREs.
4.2 |. Inferring regulatory inputs
While CRE discovery is a necessary step in the process of network edge refinement, it does not inherently reveal upstream regulatory inputs to a CRE that define source nodes or the regulatory outputs from a CRE onto downstream genes represented by target nodes. However, many techniques are available to accomplish this, some of which even overlap with those used for CRE discovery.
Like histones, DNA-bound transcription factors also produce signatures that can be leveraged in their detection. Techniques such as ChIP-Seq/CUT&RUN, discussed above, can also be used to detect all noncoding regions bound by a given transcription factor (Jiang & Mortazavi, 2018). A good use case for transcription factor ChIP-Seq/CUT&RUN would include situations where a given transcription factor emerges as a strong candidate gene (González-Blas et al., 2020). However, GRNs invariably involve a rich set of molecular interactions that are not reducible to the activity of single master-regulatory transcription factors. While antibody-based analyses for inferring regulatory inputs to CREs are powerful and precise, they can be expensive to perform on many candidate transcription factors. Moreover, in newly established or nontraditional model systems, little prior information may be available to guide a researcher toward candidate genes.
Transcription factor footprinting provides an alternative to transcription factor ChIP-Seq that allows, in principle, the binding sites of all transcription factors in a given tissue to be predicted simultaneously with CRE discovery. Footprinting relies on the fact that, like histones, bound transcription factors are able to sterically hinder DNA cutting enzymes. By reducing the efficiency of cutting in the DNA they shield, regions bound by transcription factors appear as a small dip (the “footprint”) in the open chromatin signal produced in assays like ATAC-Seq. DNA sequences covered by such footprints can then be compared against statistical models of transcription factor binding motifs in the form of position weight matrices (PWMs) using computational scans. Ideally, the library of transcription factor motifs used for footprinting will be species-specific (though such sets are rarely available for nontraditional model species) and will be filtered to reflect only transcription factors shown to be expressed in the tissue of interest by RNA-Seq. Because both a given bound transcription factor and the gene regulated by a bound CRE can be represented by nodes (source and target, respectively) in a GRN model, footprinting and motif scans can be a powerful tool to foreground direct edges in a correlation network constructed from transcriptomic data and to determine their directionality. For example, by combining ATAC-Seq, footprinting, and motif scans, Gehrke et al. (2019) found that the transcription factor egr binds to a variety of CREs and plays a major role in controlling regeneration in the platyhelminth worm Hofstenia. Footprinting can even be used to construct GRN models based on binding site predictions in CREs near other transcription factors or their target genes (Bentsen et al., 2020).
While the primary tradeoff for such an approach is the loss of some confidence in the identity of a bound transcription factor (computational scans can only provide probabilities for one transcription factor vs another bound under a footprint), the key advantage of this method is that it is not limited to a single transcription factor. Indeed, the only limiting factor in the number of candidate transcription factor binding sites that can be predicted is the availability of empirically-defined, lineage-appropriate PWMs for all transcription factors of interest. Public repositories such as JASPAR provide a rapidly-growing database of curated motifs for conserved transcription factors for a variety of lineages (e.g., the 2020 release includes motif models for vertebrates, nematodes, insects, plants, fungi, and urochordates) (Fornes et al., 2019). Moreover, careful curation of a motif library (e.g., filtering to only transcription factors expressed in your tissue of interest based on RNA-Seq) can narrow the range of candidate binding events to a more tractable and realistic set.
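The computational scan that compares footprint sequences against a PWM can be sketched as follows. The example converts per-position base counts into log2-odds scores against a uniform background and slides the resulting matrix along a sequence, reporting windows that exceed a score cutoff. It is a minimal sketch under several assumptions: the motif counts are invented rather than taken from any motif database, the background is uniform, only the forward strand is scanned, and the cutoff of 5 is arbitrary.

```python
import math


def pwm_from_counts(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(
                ((column.get(base, 0) + pseudocount) / total) / background
            )
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, min_score):
    """Return (position, score) for each window scoring at least min_score."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        score = sum(pwm[j][sequence[i + j]] for j in range(width))
        if score >= min_score:
            hits.append((i, score))
    return hits


# Hypothetical motif with consensus TGAC, built from ten invented sites.
counts = [{"T": 10}, {"G": 10}, {"A": 10}, {"C": 10}]
pwm = pwm_from_counts(counts)

# Hypothetical sequence lying under a footprint.
footprint_sequence = "AATGACCA"
hits = scan(footprint_sequence, pwm, min_score=5.0)
print(hits)
```

A hit links the transcription factor represented by the motif (a candidate source node) to the CRE containing the footprint, with the caveat that the score indicates compatibility with the motif rather than confirmed binding.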
4.3 |. Determining regulatory outputs
The next key step is to establish what nodes in a network model represent targets. A common assumption is that proximity between a CRE and a given gene is sufficient to presume a likely regulatory interaction. Thus, it is assumed that a direct edge exists between some source node A and target node B, when a binding site for gene A is found in a CRE near gene B. This assumption is supported by a number of observations. First, promoters regulate the gene whose transcription start site (TSS) they precede. Second, there is an inverse relationship between the linear distance in base pairs separating a CRE-gene pair and the likelihood that they form chromatin contacts (an indication of regulatory interactions). It is, therefore, common to assign peaks to the nearest TSS within a maximum distance cutoff (Yu et al., 2015), or to predefine gene regions (McLean et al., 2010). While not unreasonable, this broad assumption does not rely on any direct empirical support for any given interaction.
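The nearest-TSS heuristic is straightforward to express in code. The sketch below assigns each peak to the closest TSS on the same chromosome, provided the distance from the peak midpoint falls within a cutoff. It is an illustrative implementation of the general heuristic, not the behavior of any particular annotation package: strand is ignored, and the 50 kb cutoff, coordinates, and names are assumptions chosen for the example.

```python
def assign_peaks_to_genes(peaks, tss, max_distance=50_000):
    """Map each peak to its nearest same-chromosome TSS within max_distance."""
    assignments = {}
    for name, (chrom, start, end) in peaks.items():
        midpoint = (start + end) // 2
        candidates = [
            (abs(position - midpoint), gene)
            for gene, (tss_chrom, position) in tss.items()
            if tss_chrom == chrom
        ]
        assignments[name] = None  # default: no gene within reach
        if candidates:
            distance, gene = min(candidates)
            if distance <= max_distance:
                assignments[name] = gene
    return assignments


# Hypothetical TSS annotations and peaks.
tss = {"geneA": ("chr1", 10_000), "geneB": ("chr1", 200_000)}
peaks = {
    "peak1": ("chr1", 12_000, 12_500),    # 2,250 bp from geneA
    "peak2": ("chr1", 100_000, 100_400),  # beyond the cutoff for both genes
    "peak3": ("chr2", 500, 900),          # no TSS annotated on chr2
}

assignments = assign_peaks_to_genes(peaks, tss)
print(assignments)
```

Peaks left unassigned under this heuristic are natural candidates for follow-up with the contact-based approaches described next.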
Chromosome conformation capture assays (3C methods), designed to survey the spatial organization of chromatin in a cell, have emerged as a powerful approach to infer the target nodes of CREs and can, therefore, be used to confirm highly-supported network edges. 3C methods identify long-range chromatin interactions by leveraging looping interactions between distant enhancers and the genes they regulate. Loops are captured by crosslinking native chromatin, fixing them in place through protein–protein contacts. Chromatin is then digested with a cutting enzyme, with protein-bound DNA being sterically protected, leaving short DNA ends exposed from both the enhancer and promoter regions. These ends are then ligated, and chimeric fragments are released and sequenced. Mapped chimeric reads can then be used to find distant regions of enriched contact.
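To illustrate how mapped chimeric reads translate into contact frequencies, the sketch below bins the two mapped ends of each read pair and counts pairs per bin combination. It is a deliberately simplified example that assumes all pairs lie on one chromosome and that uses a hypothetical 10 kb bin size and invented coordinates; dedicated Hi-C pipelines additionally filter, normalize, and statistically test such counts.

```python
from collections import Counter


def bin_contacts(read_pairs, bin_size=10_000):
    """Count chimeric read pairs per pair of genomic bins.

    read_pairs: iterable of (pos1, pos2) mapping positions on one chromosome.
    Returns a Counter keyed by (bin_i, bin_j) with bin_i <= bin_j.
    """
    counts = Counter()
    for pos1, pos2 in read_pairs:
        b1, b2 = pos1 // bin_size, pos2 // bin_size
        counts[(min(b1, b2), max(b1, b2))] += 1
    return counts


# Hypothetical mapped positions for the two ends of each chimeric read.
read_pairs = [(1_500, 52_000), (3_000, 55_500), (51_000, 2_000), (12_000, 13_000)]

contacts = bin_contacts(read_pairs)
print(contacts)
```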
The main difference between the various 3C methods available is their scope. For example, 3C quantifies interactions between a single pair of genomic loci and can be used to test a candidate promoter-enhancer interaction (i.e., one-vs.-one) (Dekker et al., 2002). Hi-C, in contrast, allows probing all interactions at a genome-wide level (i.e., all-vs.-all) (Belton et al., 2012). Low-resolution Hi-C maps can be used to identify topologically associating domains, within which enhancer–gene interactions are expected to occur, and between which they are not. Improved techniques such as Micro-C (Hsieh et al., 2015) provide resolution high enough to call individual enhancer–gene loops while having lower input requirements, allowing network edges to be defined with high confidence even in challenging model systems where little tissue is available. Chromatin capture methods can also be paired with chromatin immunoprecipitation (e.g., chromatin interaction analysis with paired-end tag [ChIA-PET] (G. Li et al., 2014) and HiChIP (Mumbach et al., 2016)), reducing noise, improving loop calls, and calling enhancer locations simultaneously with loops.
Having established factors bound to CREs and their probable downstream targets, it is now possible to refine (or indeed to predict) network edges. Correlation network edges predicted from WGCNA can now be given directionality or, where spurious, can be masked from a network model. Patterns of differential expression of a candidate gene may be explained in terms of another upstream factor. These refinements provide the basis for both specific hypotheses about the structure of a GRN and for inferences about the evolution of these programs through functional assays and through comparisons with “omic” data from other related species.
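The refinement logic described above can be sketched as follows. This is one possible formalization, offered under explicit assumptions rather than as an established algorithm: an undirected coexpression edge is converted into a directed edge from A to B only if transcription factor A has a footprint in a CRE assigned to target B, and edges without such support are masked. All gene and CRE names are hypothetical.

```python
def refine_edges(coexpression_edges, tf_footprints, cre_targets):
    """Give direction to coexpression edges using CRE-level evidence.

    coexpression_edges: iterable of undirected (gene1, gene2) pairs.
    tf_footprints: dict mapping TF gene -> set of CRE ids with its footprint.
    cre_targets: dict mapping CRE id -> target gene (e.g., from loop calls).
    Returns a sorted list of supported directed (source, target) edges;
    coexpression edges without support in either direction are dropped.
    """
    directed = set()
    for g1, g2 in coexpression_edges:
        # Test both possible orientations of the undirected edge.
        for source, target in ((g1, g2), (g2, g1)):
            for cre in tf_footprints.get(source, set()):
                if cre_targets.get(cre) == target:
                    directed.add((source, target))
    return sorted(directed)


# Hypothetical inputs (assumed names).
coexpression_edges = [("tfA", "geneB"), ("geneB", "geneC"), ("tfA", "tfD")]
tf_footprints = {"tfA": {"cre1", "cre2"}, "tfD": {"cre3"}}
cre_targets = {"cre1": "geneB", "cre2": "tfD", "cre3": "tfA"}

refined = refine_edges(coexpression_edges, tf_footprints, cre_targets)
print(refined)
```

In this toy example, the edge between geneB and geneC is masked because neither gene has a footprint linked to the other, whereas tfA and tfD are connected in both directions, as would be expected for a feedback loop.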
5 |. USING GRN MODELS TO STUDY ADAPTIVE EVOLUTION
Developmental programs are composed of GRNs, whose components are encoded in the genome and are, thus, inherited. Phenotypic evolution arises through evolutionary modification of ancestral GRNs. Therefore, any project aiming to understand the origins of an adaptive phenotype must ultimately identify causative nucleotide changes that altered its developmental program relative to an ancestral state and define their specific impacts. Because GRN models encapsulate the key interactions that govern the formation of phenotypes, they are powerful tools for studying the evolution of developmental programs. We, thus, will briefly suggest ways in which data from across species or between anatomical structures can be contextualized in light of a refined network model through comparative analyses.
Many adaptive changes in phenotype are driven by altered gene expression. Comparisons of gene expression between species therefore are an intuitive way to relate functional molecular differences between species to an established GRN. WGCNA can be used to great effect in comparative evolutionary studies between species with readily-identifiable orthologous genes. Comparing the module membership of orthologs in homologous tissues across species, or the module eigengene of equivalent modules can highlight changes in GRN structure and function between related species that differ in phenotype. For example, WGCNA network modules have been used to predict GRNs associated with mouth part polyphenisms in Pristionchus pacifica (which possesses both microbivorous and predatory morphs) through comparisons with Caenorhabditis elegans, which possesses the ancestral (microbivorous) mouthparts and lacks the derived polyphenism switch mechanism of its relative (Casasa et al., 2020). Care must be taken in studies that use extant outgroups as proxies for ancestral states. Systems drift can result in preservation of ancestral phenotypes while their underlying developmental-genetic basis continues to evolve and diverge (Ewe et al., 2020). This challenge is minimized when working with closely-related species.
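One simple way to compare the module membership of orthologs is to compute the overlap between modules after mapping genes through an ortholog table. The sketch below uses the Jaccard index for this purpose. It is a Python illustration that operates on module assignments exported from a coexpression analysis and is not part of WGCNA itself; the gene names, module labels, and one-to-one ortholog map are hypothetical.

```python
from collections import defaultdict


def module_overlap(modules_sp1, modules_sp2, orthologs):
    """Jaccard overlap between modules of two species over shared orthologs.

    modules_sp1, modules_sp2: dicts mapping gene -> module label.
    orthologs: dict mapping species-1 gene -> species-2 gene (one-to-one).
    Returns a dict mapping (module_sp1, module_sp2) -> Jaccard index.
    """
    members1, members2 = defaultdict(set), defaultdict(set)
    for g1, g2 in orthologs.items():
        if g1 in modules_sp1 and g2 in modules_sp2:
            # Index both species by the species-1 gene name so sets are comparable.
            members1[modules_sp1[g1]].add(g1)
            members2[modules_sp2[g2]].add(g1)
    overlap = {}
    for m1, set1 in members1.items():
        for m2, set2 in members2.items():
            overlap[(m1, m2)] = len(set1 & set2) / len(set1 | set2)
    return overlap


# Hypothetical module assignments and ortholog map.
modules_sp1 = {"a": "blue", "b": "blue", "c": "blue", "d": "red"}
modules_sp2 = {"a2": "turquoise", "b2": "turquoise", "c2": "brown", "d2": "brown"}
orthologs = {"a": "a2", "b": "b2", "c": "c2", "d": "d2"}

overlap = module_overlap(modules_sp1, modules_sp2, orthologs)
print(overlap)
```

Orthologs that fall outside the best-matching module pair (gene c in this toy example) are natural candidates for lineage-specific rewiring, subject to the caveats about systems drift noted above.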
With a highly-refined GRN model that incorporates information derived from CREs, it is possible to go further and explore how changes in specific transcription factor binding sites may explain expression differences and consequently mediate their impacts on phenotype. Tools such as TFforge implement computational methods for forward genomic inference of transcription factor binding sites involved in phenotypic differences based on patterns of evolutionary sequence divergence (Langer & Hiller, 2018). This approach has been used to demonstrate widespread, convergent degradation of cone-rod homeobox (Crx) binding sites across multiple subterranean mammal species (Langer & Hiller, 2018). In the context of a refined GRN model, candidate motifs may be selected, for instance, from among hub gene transcription factors.
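The effect of a nucleotide change on a predicted binding site can be illustrated by scoring ancestral and derived sequences against a position weight matrix (PWM). The sketch below is a generic log-odds scan and does not reproduce the TFforge method; the count matrix, the sequences, the pseudocount, and the uniform background frequency are all hypothetical assumptions.

```python
import math


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds matrix."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_score(sequence, pwm):
    """Return the highest PWM score over all windows on the given strand."""
    width = len(pwm)
    return max(
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


# Hypothetical count matrix favoring the sequence TAAT.
counts = [
    {"T": 18, "A": 1, "C": 1},
    {"A": 19, "T": 1},
    {"A": 18, "C": 2},
    {"T": 17, "G": 3},
]
pwm = log_odds_pwm(counts)

ancestral = "GGTAATCC"  # assumed ancestral CRE fragment
derived = "GGTACTCC"    # assumed derived fragment with one substitution

delta = best_score(derived, pwm) - best_score(ancestral, pwm)
print(round(delta, 2))
```

A negative difference indicates that the substitution weakens the best predicted site. A fuller analysis would also scan the reverse strand and use lineage-appropriate background frequencies.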
When studying intraspecific variation, or phenotypic diversity in closely related species, quantitative genetic mapping provides a critical and powerful tool to associate specific loci or genomic regions with phenotypes of interest. Such approaches have been used to identify, among others, loci related to pigmentation and tailfin morphology within the fighting fish Betta splendens (L. Wang et al., 2021). Comparative genomics in the context of a GRN model provides additional avenues to study the evolution of development. Positive selection is one potential signature of adaptive evolution, and can be detected as an increased rate of substitutions relative to orthologous elements across species. Shifts in evolutionary rates of CREs mediating key network edges can therefore direct one’s attention to GRN components that may contain causative changes, such as alterations in transcription factor binding sites. Multiple packages, such as PHAST, RERconverge, and PhyloAcc, implement comparative rates tests (Hu et al., 2019; Hubisz et al., 2011; Kowalczyk et al., 2019). As an example, a study aimed at teasing apart the molecular interactions that lead to the development of the bat wing employed comparative genomics coupled with epigenomics, first to identify CNEs and then to detect bat accelerated regions (BARs), representing candidate CREs under positive selection. Through comparative computational scans between bats and nonflying placental mammals, the authors predicted ancestral and novel transcription factor binding sites and, moreover, showed that some BARs exhibited expression patterns different from those of their mouse orthologs (Booker et al., 2016). The effects of such changes in expression can be hypothesized based on an existing GRN model and then tested functionally.
6 |. TESTING HYPOTHESES DERIVED FROM GRNs
By integrating multiple layers of functional and comparative genomic data, it is possible to construct highly informative, and predictive GRNs that model the developmental program for a phenotype of interest. Against these GRN models, testable hypotheses about the structure, function, and evolution of developmental programs can be formulated. To test such hypotheses, distinguish correlation from causation, and gain a mechanistic understanding of the processes by which GRNs operate to establish phenotypes of interest, it is crucial to incorporate functional experiments that perturb GRN topology. The field of EvoDevo has a long and rich tradition of employing a variety of experimental tools to probe gene function, several of which have been applied to dissecting GRNs in different species (Britton et al., 2020; Cary et al., 2020; Gehrke et al., 2019; Glassford et al., 2015; Mallarino et al., 2016; Nocedal et al., 2017; X.-P. Wang et al., 2007).
In recent years, CRISPR/Cas9 genome editing has revolutionized practically every field of biological and medical research. CRISPR/Cas9 technology utilizes a bacterially-derived endonuclease (Cas9) that can be directed to cut specific genomic loci through the use of a single-guide RNA, thereby co-opting highly-conserved eukaryotic DNA repair mechanisms to induce sequence edits. This technology has been used to induce a wide variety of genomic changes, including gene knock-out/knock-in, gene labeling, nucleotide substitution, and gene silencing, and to perform high-throughput screening assays that allow probing multiple genomic loci in a single experiment. In addition, recent studies have led to the discovery or engineering of numerous Cas9 variants with diverse targeting efficiencies and preferences, enabling nearly-unconstrained sequence manipulation across the genome. Because of its great precision and the relative ease with which it can be applied to diverse, nontraditional species, we anticipate that this technology will completely transform the field of EvoDevo. In this section, we discuss how this powerful tool can be used to probe GRNs, by directly modifying genes representing network nodes and/or targeting CREs mediating network edges.
By directly perturbing the nodes in a GRN, researchers can test whether a given gene participates in regulating a specific aspect of a phenotype and/or in controlling the expression of other genes in the network. For example, by performing CRISPR/Cas9-mediated knockouts in multiple butterfly species, Zhang et al. (2017) demonstrated that optix, a gene previously associated with differences in butterfly wing color patterns, functions as a regulator of pigmentary and structural coloration. Knocking out optix led to a marked decrease in ommochrome pathway activity and an increase in melanin pigments. This demonstrated that optix has a dual ability to activate and repress downstream targets and acts as a key network node controlling pigment type in different butterfly species.
Nodes representing hub genes can be highly-connected within the network. Because this may make it difficult to define specific causative interactions driving phenotypic effects, it may be preferable to disrupt specific interactions representing a single network edge. Moreover, increasing lines of evidence have supported changes to edges being a major source of phenotypic evolution (Prud’homme et al., 2007; Wray, 2007). Therefore, deciphering GRN edges is instrumental in understanding (1) the dynamics and diversity of interactions within the GRN, and (2) the evolution of GRNs driven by modulation of these interactions. CRISPR/Cas9 has shifted the paradigm through the high efficiency, accuracy, and flexibility it brings to in vivo CRE editing, offering the possibility of defining clear links between regulatory regions and phenotypic traits in a wide range of species. The versatility of CRISPR/Cas9 has allowed researchers to perform experiments at a scale that would have been unimaginable only a few years ago. For example, targeted deletions of individual enhancers, as well as compound deletions of multiple ones, can now be performed to probe the effect of different network edges converging onto a single gene. In a recent study, Hörnblad and colleagues used this approach to dissect the regulatory regions controlling Fgf8 expression in the developing limbs and in the midbrain–hindbrain boundary (MHB) of mouse embryos. A series of genomic deletions revealed that, while several enhancers have redundant roles in controlling Fgf8 expression in limbs, expression in the MHB is primarily controlled by a single enhancer (Hörnblad et al., 2021). The previous example focused on deleting entire or partial enhancer sequences, but did not reveal detailed regulatory information within a given enhancer. In a recent tour de force study, Fuqua et al. (2020) used guide RNA libraries to introduce tiling mutations along the entirety of the E3N enhancer, known to regulate the shavenbaby gene in Drosophila, and found that the regulatory activity was distributed across the whole region. Taken together, these examples illustrate how CRISPR/Cas9 technology can be applied to different systems to functionally dissect GRN edge-node connections at unprecedented mechanistic detail.
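As a rough illustration of how candidate guide RNA target sites might be enumerated when planning to tile an enhancer, the sketch below lists every 20-nucleotide protospacer that is immediately followed by an NGG protospacer-adjacent motif, the motif recognized by the commonly used Streptococcus pyogenes Cas9, on either strand. It is not the design procedure used in the studies discussed above, and the input sequence is a hypothetical placeholder. Practical guide design also weighs predicted efficiency and off-target matches, which are not modeled here.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(seq, guide_len=20):
    """List candidate protospacers followed by an NGG PAM on both strands.

    Returns a list of (strand, start, protospacer, pam). For the "-" strand,
    start is given relative to the reverse-complemented sequence.
    """
    guides = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # i is the first base of a potential 3-nt PAM.
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                guides.append((strand, i - guide_len, s[i - guide_len:i], pam))
    return guides


# Hypothetical enhancer fragment containing a single NGG site.
enhancer = "A" * 20 + "TGG" + "C" * 5

guides = find_guides(enhancer)
print(guides)
```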
In addition to being a powerful genome editing tool, CRISPR/Cas9 can also be repurposed to alter the epigenetic status of the genome, thereby allowing researchers to activate or inhibit a putative CRE without modifying the underlying DNA sequence. In this approach, nuclease-dead Cas9 (dCas9) is typically fused to a transcriptional repressor (e.g., KRAB), which results in CRISPR interference (CRISPRi), or to an activating domain (e.g., VP64), resulting in CRISPR activation (CRISPRa). In a recent study, K. Li et al. (2020) showed that CRISPRi and CRISPRa can be used to induce modifications at specific enhancers, resulting in locus-specific epigenetic reprogramming and interference with transcription factor binding at different genomic loci. More recently, CRISPRi and CRISPRa have been coupled to inducible promoters to achieve temporal inhibition and activation of targeted CREs (Carullo et al., 2021; Hazelbaker et al., 2020; K. Li et al., 2020). While several of these techniques are still in an early phase of development, we anticipate they will be readily adopted in the field of EvoDevo because they provide precise and flexible spatio-temporal control of gene expression.
7 |. CONCLUDING REMARKS
Recent advancements in functional genomics and gene editing have greatly expanded the experimental repertoire available to researchers studying the evolution of development. The rapidity of this progress, however, can leave familiar modes of thinking and patterns of project design outmoded. Considering biological questions in light of the GRN concept can be an effective way to ground project design in a consistent framework that readily accommodates such methodological advances. Moreover, the stages of GRN construction naturally suggest potential ways of organizing research projects. In Figure 3, we provide a hypothetical example of a project to study the evolution of tail fin diversity in divergent species of killifish. This example uses the major steps of GRN construction outlined in this review as the skeleton for the project workflow, employing at each step a subset of the experimental techniques discussed in the corresponding section.
FIGURE 3.

From GRN construction to project design. A hypothetical example of an EvoDevo research project applying diverse experimental approaches, organized in terms of the major steps of GRN model construction. In this example, a researcher aims to unravel the molecular basis of variable tail fin morphology between related species of killifish. They first perform RNA-Seq on tissue samples spanning tail fin development in their principal model species and use the resulting data to infer correlation networks with WGCNA. A module correlated with developmental progression is identified and flagged for further refinement. The researcher then uses ATAC-Seq to identify candidate cis-regulatory elements near genes in the module and applies transcription factor footprinting together with Micro-C to confirm direct regulatory interactions and to prune away edges unsupported by empirical evidence, disconnecting some genes from the network in the process. Because genomes are available for relevant killifish species, they use comparative genomics to identify enhancers under positive selection in a species with a derived tail fin morphology. Several such enhancers contain transcription factor motifs bearing nucleotide changes predicted to impact edges in the network model. The researcher then uses CRISPR/Cas9 to alter one such binding site in their principal model species and characterizes the resulting phenotypes. GRN, gene regulatory network; TF, transcription factor; WGCNA, weighted gene coexpression network analysis
While the project outlined in Figure 3 represents one plausible workflow, the specific choice of experimental techniques will depend on the model system. For instance, while comparative analysis of evolutionary rates may be appropriate for studying selection between divergent species, approaches such as quantitative genetic mapping are likely to be more powerful when studying intraspecific variation (as in the case of Betta splendens) (L. Wang et al., 2021). As well, the use of CRISPR/Cas9 in vivo is contingent on adequate delivery mechanisms in the chosen species. Here, a choice is made to construct a GRN model for tail fin formation in a single species, against which other species are compared at the genomic and organismal levels. Other researchers may find it beneficial to construct GRN models for multiple species in parallel, performing comparisons at the transcriptomic and epigenomic levels as well.
A further, important consideration is the relationship between GRN model construction and functional validation. The example provided in Figure 3 idealizes the position of functional validation as the culmination of prior GRN analyses (i.e., validation experiments are motivated by the experimentally-constructed model). In practice, however, a continual dialogue between GRN model construction and functional validation should guide the progression of one’s research project. For instance, while one might proceed from constructing correlation network modules with WGCNA directly to edge refinement with epigenomics, it would be prudent to explore the potential phenotypic roles of the candidate network by manipulating the expression of predicted hub genes (particularly those which show differential expression) in a suitable system. The outcomes of such experiments can serve the dual purpose of justifying further refinements of a given network module and furnishing more specific hypotheses for further downstream functional experiments.
A central goal of evolutionary developmental biology is to define the role of proximate causes in evolution: that is, how the molecular mechanisms underlying developmental programs shape the evolutionary trajectories of species. Achieving such a broad aim requires the application of principles that transcend individual model systems. To this end, we have attempted in this Perspective to illustrate how the GRN concept can serve as a generalizable framework for experimental design and for structuring EvoDevo studies. Our discussion of the GRN concept and its practical implications provides a conceptual scaffold for those new to the field and for those who seek to recontextualize their research program in light of GRNs.
ACKNOWLEDGMENTS
We thank Arnaud Martin, members of the Mallarino lab, and the reviewers for helpful comments and suggestions. In addition, we thank Elise Ireland for extensive proofreading. Charles Feigin is supported by an NIH F32 fellowship (1 F32 GM139240-01), Sha Li is supported by a Princeton Presidential Postdoctoral Fellowship, Jorge Moreno is supported by an NSF GRFP fellowship (DGE2039656), and Ricardo Mallarino is supported by an NIH Grant (R35GM133758).
Funding information
National Institute of General Medical Sciences, Grant/Award Numbers: F32GM139240-01, R35GM133758
Footnotes
CONFLICTS OF INTEREST
The authors declare no conflicts of interest.
DATA AVAILABILITY STATEMENT
Data availability statement is not applicable.
REFERENCES
- Abdelaal T, Michielsen L, Cats D, Hoogduin D, Mei H, Reinders MJT, & Mahfouz A (2019). A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biology, 20(1), 194. 10.1186/s13059-019-1795-z
- Azeloglu EU, & Iyengar R (2015). Signaling networks: Information flow, computation, and decision making. Cold Spring Harbor Perspectives in Biology, 7(4), a005934. 10.1101/cshperspect.a005934
- Bacher R, Leng N, Chu LF, Ni Z, Thomson JA, Kendziorski C, & Stewart R (2018). Trendy: segmented regression analysis of expression dynamics in high-throughput ordered profiling experiments. BMC Bioinformatics, 19(1), 380. 10.1186/s12859-018-2405-x
- Barabási A-L, & Oltvai ZN (2004). Network biology: Understanding the cell’s functional organization. Nature Reviews Genetics, 5(2), 101–113. 10.1038/nrg1272
- Belton JM, McCord RP, Gibcus JH, Naumova N, Zhan Y, & Dekker J (2012). Hi-C: A comprehensive technique to capture the conformation of genomes. Methods, 58(3), 268–276. 10.1016/j.ymeth.2012.05.001
- Bentsen M, Goymann P, Schultheis H, Klee K, Petrova A, Wiegandt R, Fust A, Preussner J, Kuenne C, Braun T, Kim J, & Looso M (2020). ATAC-seq footprinting unravels kinetics of transcription factor binding during zygotic genome activation. Nature Communications, 11(1), 4267. 10.1038/s41467-020-18035-1
- Booker BM, Friedrich T, Mason MK, VanderMeer JE, Zhao J, Eckalbar WL, Logan M, Illing N, Pollard KS, & Ahituv N (2016). Bat accelerated regions identify a bat forelimb specific enhancer in the HoxD locus. PLoS Genetics, 12(3), 1–21. 10.1371/journal.pgen.1005738
- González-Blas CB, Quan X-J, Duran-Romaña R, Taskiran II, Koldere D, Davie K, Christiaens V, Makhzami S, Hulselmans G, de Waegeneer M, Mauduit D, Poovathingal S, Aibar S, & Aerts S (2020). Identification of genomic enhancers through spatial integration of single-cell transcriptomics and epigenomics. Molecular Systems Biology, 16(5), e9438. 10.15252/msb.20209438
- Britton CS, Sorrells TR, & Johnson AD (2020). Protein-coding changes preceded cis-regulatory gains in a newly evolved transcription circuit. Science, 367(6473), 96–100. 10.1126/science.aax5217
- Buenrostro J, Wu B, Chang H, & Greenleaf W (2015). ATAC-seq: A method for assaying chromatin accessibility genome-wide. Current Protocols in Molecular Biology, 109, 21.29.1–21.29.9. 10.1002/0471142727.mb2129s109
- Cao C, Lemaire LA, Wang W, Yoon PH, Choi YA, Parsons LR, Matese JC, Wang W, Levine M, & Chen K (2019). Comprehensive single-cell transcriptome lineages of a proto-vertebrate. Nature, 571, 349–354. 10.1038/s41586-019-1385-y
- Carullo NVN, Hinds JE, Revanna JS, Tuscher JJ, Bauman AJ, & Day JJ (2021). A Cre-dependent CRISPR/dCas9 activation system for gene expression regulation in neurons. eNeuro, 8. Advance online publication. 10.1101/2020.11.20.391987
- Cary GA, McCauley BS, Zueva O, Pattinato J, Longabaugh W, & Hinman VF (2020). Systematic comparison of sea urchin and sea star developmental gene regulatory networks explains how novelty is incorporated in early development. Nature Communications, 11(1), 6235. 10.1038/s41467-020-20023-4
- Casasa S, Biddle JF, Koutsovoulos GD, & Ragsdale EJ (2020). Polyphenism of a novel trait integrated rapidly evolving genes into ancestrally plastic networks. Molecular Biology and Evolution, 38(2), 331–343. 10.1093/molbev/msaa235
- Cha J, & Lee I (2020). Single-cell network biology for resolving cellular heterogeneity in human diseases. Experimental & Molecular Medicine, 52(11), 1798–1808. 10.1038/s12276-020-00528-0
- Chen H, Albergante L, Hsu JY, Lareau CA, Lo Bosco G, Guan J, Zhou S, Gorban AN, Bauer DE, Aryee MJ, Langenau DM, Zinovyev A, Buenrostro JD, Yuan GC, & Pinello L (2019). Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM. Nature Communications, 10(1), 1903. 10.1038/s41467-019-09670-4
- Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, & Sidow A (2005). Distribution and intensity of constraint in mammalian genomic sequence. Genome Research, 15(7), 901–913.
- Corces MR, Trevino AE, Hamilton EG, Greenside PG, Sinnott-Armstrong NA, Vesuna S, Satpathy AT, Rubin AJ, Montine KS, Wu B, Kathiria A, Cho SW, Mumbach MR, Carter AC, Kasowski M, Orloff LA, Risca VI, Kundaje A, Khavari PA, … Chang HY (2017). An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nature Methods, 14(10), 959–962. 10.1038/nmeth.4396
- Davydov EV, Goode DL, Sirota M, Cooper GM, Sidow A, & Batzoglou S (2010). Identifying a high fraction of the human genome to be under selective constraint using GERP. PLoS Computational Biology, 6(12), e1001025. 10.1371/journal.pcbi.1001025
- Dekker J, Rippe K, Dekker M, & Kleckner N (2002). Capturing chromosome conformation. Science, 295(5558), 1306–1311. 10.1126/science.1067799
- Domené S, Bumaschny VF, de Souza FSJ, Franchini LF, Nasif S, Low MJ, & Rubinstein M (2013). Enhancer turnover and conserved regulatory function in vertebrate evolution. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 368(1632), 20130027. 10.1098/rstb.2013.0027
- Douglas KC, Wang X, Jasti M, Wolff A, VandeBerg JL, Clark AG, & Samollow PB (2014). Genome-wide histone state profiling of fibroblasts from the opossum, Monodelphis domestica, identifies the first marsupial-specific imprinted gene. BMC Genomics, 15(1), 89. 10.1186/1471-2164-15-89
- Dunn CW, Zapata F, Munro C, Siebert S, & Hejnol A (2018). Pairwise comparisons across species are problematic when analyzing functional genomic data. Proceedings of the National Academy of Sciences of the United States of America, 115(3), E409–E417. 10.1073/pnas.1707515115
- Erwin DH (2020). Evolutionary dynamics of gene regulation. In Peter IS (Ed.), Current topics in developmental biology (Vol. 139, Ch. 13, pp. 407–431). Academic Press.
- Ewe CK, Torres Cleuren YN, & Rothman JH (2020). Evolution and developmental system drift in the endoderm gene regulatory network of Caenorhabditis and other nematodes. Frontiers in Cell and Developmental Biology, 8, 170. 10.3389/fcell.2020.00170
- Feigin CY, Newton AH, & Pask AJ (2019). Widespread cis-regulatory convergence between the extinct Tasmanian tiger and gray wolf. Genome Research, 29(10), 1648–1658.
- Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, Modi BP, Correard S, Gheorghe M, Baranašić D, Santana-Garcia W, Tan G, Chèneby J, Ballester B, Parcy F, Sandelin A, Lenhard B, Wasserman WW, & Mathelier A (2019). JASPAR 2020: Update of the open-access database of transcription factor binding profiles. Nucleic Acids Research, 48(D1), D87–D92. 10.1093/nar/gkz1001
- Fuqua T, Jordan J, van Breugel ME, Halavatyi A, Tischer C, Polidoro P, Abe N, Tsai A, Mann RS, Stern DL, & Crocker J (2020). Dense and pleiotropic regulatory information in a developmental enhancer. Nature, 587(7833), 235–239. 10.1038/s41586-020-2816-5
- Fusco G, & Minelli A (2010). Phenotypic plasticity in development and evolution: Facts and concepts. Introduction. Philosophical Transactions of the Royal Society B Biological Sciences, 365, 547–556. 10.1098/rstb.2009.0267
- Futuyma DJ (2017). Evolutionary biology today and the call for an extended synthesis. Interface Focus, 7(5), 20160145. 10.1098/rsfs.2016.0145
- Gaspar JM (2018). Improved peak-calling with MACS2. bioRxiv. 10.1101/496521
- Gehrke AR, Neverett E, Luo Y-J, Brandt A, Ricci L, Hulett RE, Gompers A, Ruby JG, Rokhsar DS, Reddien PW, & Srivastava M (2019). Acoel genome reveals the regulatory landscape of whole-body regeneration. Science, 363, eaau6173. 10.1126/science.aau6173
- Glassford WJ, Johnson WC, Dall NR, Smith SJ, Liu Y, Boll W, Noll M, & Rebeiz M (2015). Co-option of an ancestral Hox-regulated network underlies a recently evolved morphological novelty. Developmental Cell, 34(5), 520–531. 10.1016/j.devcel.2015.08.005
- Hatleberg WL, & Hinman VF (2021). Modularity and hierarchy in biological systems: Using gene regulatory networks to understand evolutionary change. In Gilbert SF (Ed.), Current topics in developmental biology (Vol. 141, Ch. 2, pp. 39–73). Academic Press.
- Hazelbaker DZ, Beccard A, Angelini G, Mazzucato P, Messana A, Lam D, Eggan K, & Barrett LE (2020). A multiplexed gRNA piggyBac transposon system facilitates efficient induction of CRISPRi and CRISPRa in human pluripotent stem cells. Scientific Reports, 10(1), 635. 10.1038/s41598-020-57500-1
- Hörnblad A, Bastide S, Langenfeld K, Langa F, & Spitz F (2021). Dissection of the Fgf8 regulatory landscape by in vivo CRISPR-editing reveals extensive intra- and inter-enhancer redundancy. Nature Communications, 12(1), 439. 10.1038/s41467-020-20714-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hsieh TH, Weiner A, Lajoie B, Dekker J, Friedman N, & Rando OJ (2015). Mapping nucleosome resolution chromosome folding in yeast by Micro-C. Cell, 162(1), 108–119. 10.1016/j.cell.2015.05.048
- Hu Z, Sackton TB, Edwards SV, & Liu JS (2019). Bayesian detection of convergent rate changes of conserved noncoding elements on phylogenetic trees. Molecular Biology and Evolution, 36(5), 1086–1100. 10.1093/molbev/msz049
- Hubisz MJ, Pollard KS, & Siepel A (2011). PHAST and RPHAST: Phylogenetic analysis with space/time models. Briefings in Bioinformatics, 12(1), 41–51. 10.1093/bib/bbq072
- Hughes JT, Williams ME, Rebeiz M, & Williams TM (2021). Widespread cis- and trans-regulatory evolution underlies the origin, diversification, and loss of a sexually dimorphic fruit fly pigmentation trait. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution, 1. 10.1002/jez.b.23068
- Jiang S, & Mortazavi A (2018). Integrating ChIP-seq with other functional genomics data. Briefings in Functional Genomics, 17(2), 104–115. 10.1093/bfgp/ely002
- Johnson DS, Mortazavi A, Myers RM, & Wold B (2007). Genome-wide mapping of in vivo protein-DNA interactions. Science, 316(5830), 1497–1502.
- Kaya-Okur HS, Wu SJ, Codomo CA, Pledger ES, Bryson TD, Henikoff JG, Ahmad K, & Henikoff S (2019). CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nature Communications, 10(1), 1930. 10.1038/s41467-019-09982-5
- Kowalczyk A, Meyer WK, Partha R, Mao W, Clark NL, & Chikina M (2019). RERconverge: An R package for associating evolutionary rates with convergent traits. Bioinformatics, 35(22), 4815–4817. 10.1093/bioinformatics/btz468
- Langer BE, & Hiller M (2018). TFforge utilizes large-scale binding site divergence to identify transcriptional regulators involved in phenotypic differences. Nucleic Acids Research, 47(4), e19. 10.1093/nar/gky1200
- Langfelder P, & Horvath S (2008). WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics, 9, 559. 10.1186/1471-2105-9-559
- Levine M, & Davidson EH (2005). Gene regulatory networks for development. Proceedings of the National Academy of Sciences of the United States of America, 102(14), 4936–4942.
- Li Q, Brown JB, Huang H, & Bickel PJ (2011). Measuring reproducibility of high-throughput experiments. The Annals of Applied Statistics, 5(3), 1752–1779.
- Li G, Cai L, Chang H, Hong P, Zhou Q, Kulakova EV, Kolchanov NA, & Ruan Y (2014). Chromatin interaction analysis with paired-end tag (ChIA-PET) sequencing technology and application. BMC Genomics, 15(12), S11. 10.1186/1471-2164-15-S12-S11
- Li K, Liu Y, Cao H, Zhang Y, Gu Z, Liu X, Yu A, Kaphle P, Dickerson KE, Ni M, & Xu J (2020). Interrogation of enhancer function by enhancer-targeting CRISPR epigenetic editing. Nature Communications, 11(1), 485. 10.1038/s41467-020-14362-5
- Liu Y, Gu H-Y, Zhu J, Niu Y-M, Zhang C, & Guo G-L (2019). Identification of hub genes and key pathways associated with bipolar disorder based on weighted gene co-expression network analysis. Frontiers in Physiology, 10, 1081. 10.3389/fphys.2019.01081
- Liu Y, Ramos-Womack M, Han C, Reilly P, Brackett KL, Rogers W, Williams TM, Andolfatto P, Stern DL, & Rebeiz M (2019). Changes throughout a genetic network mask the contribution of Hox gene evolution. Current Biology, 29(13), 2157–2166. 10.1016/j.cub.2019.05.074
- Love MI, Huber W, & Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. 10.1186/s13059-014-0550-8
- Mallarino R, Henegar C, Mirasierra M, Manceau M, Schradin C, Vallejo M, Beronja S, Barsh GS, & Hoekstra HE (2016). Developmental mechanisms of stripe patterns in rodents. Nature, 539(7630), 518–523. 10.1038/nature20109
- Martik ML, Gandhi S, Uy BR, Gillis JA, Green SA, Simoes-Costa M, & Bronner ME (2019). Evolution of the new head by gradual acquisition of neural crest regulatory circuits. Nature, 574(7780), 675–678. 10.1038/s41586-019-1691-4
- McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, Wenger AM, & Bejerano G (2010). GREAT improves functional interpretation of cis-regulatory regions. Nature Biotechnology, 28(5), 495–501. 10.1038/nbt.1630
- McMillan WO, Livraghi L, Concha C, & Hanly JJ (2020). From patterning genes to process: Unraveling the gene regulatory networks that pattern Heliconius wings. Frontiers in Ecology and Evolution, 8(221), Advance online publication. 10.3389/fevo.2020.00221
- McQueen E, & Rebeiz M (2020). On the specificity of gene regulatory networks: How does network co-option affect subsequent evolution? Current Topics in Developmental Biology (Vol. 139, Ch. 12, pp. 375–405). Academic Press.
- Mumbach MR, Rubin AJ, Flynn RA, Dai C, Khavari PA, Greenleaf WJ, & Chang HY (2016). HiChIP: Efficient and sensitive analysis of protein-directed genome architecture. Nature Methods, 13(11), 919–922. 10.1038/nmeth.3999
- Nguyen H, Tran D, Tran B, Pehlivan B, & Nguyen T (2020). A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Briefings in Bioinformatics, 22(3), bbaa190. 10.1093/bib/bbaa190
- Nocedal I, Mancera E, & Johnson AD (2017). Gene regulatory network plasticity predates a switch in function of a conserved transcription regulator. eLife, 6, e23250. 10.7554/eLife.23250
- Paolino A, Fenlon LR, Kozulin P, Haines E, Lim JWC, Richards LJ, & Suárez R (2020). Differential timing of a conserved transcriptional network underlies divergent cortical projection routes across mammalian brain evolution. Proceedings of the National Academy of Sciences of the United States of America, 117(19), 10554–10564. 10.1073/pnas.1922422117
- Phillips PC (2008). Epistasis—The essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics, 9(11), 855–867. 10.1038/nrg2452
- Pigliucci M, & Müller G (2010). Evolution, the extended synthesis. MIT Press.
- Pollard KS, Hubisz MJ, Rosenbloom KR, & Siepel A (2010). Detection of nonneutral substitution rates on mammalian phylogenies. Genome Research, 20(1), 110–121. 10.1101/gr.097857.109
- Prud’homme B, Gompel N, & Carroll SB (2007). Emerging principles of regulatory evolution. Proceedings of the National Academy of Sciences of the United States of America, 104(Suppl 1), 8605–8612. 10.1073/pnas.0700488104
- Raney BJ, Dreszer TR, Barber GP, Clawson H, Fujita PA, Wang T, Nguyen N, Paten B, Zweig AS, Karolchik D, & Kent WJ (2014). Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC genome browser. Bioinformatics, 30(7), 1003–1005. 10.1093/bioinformatics/btt637
- Robinson MD, McCarthy DJ, & Smyth GK (2010). edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139–140. 10.1093/bioinformatics/btp616
- Roy S, Ernst J, Kharchenko PV, Kheradpour P, Negre N, Eaton ML, Landolin JM, Bristow CA, Ma L, Lin MF, Washietl S, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, … Kellis M, modENCODE Consortium. (2010). Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science, 330(6012), 1787–1797. 10.1126/science.1198374
- Sadier A, Santana SE, & Sears KE (2020). The role of core and variable gene regulatory network modules in tooth development and evolution. Integrative and Comparative Biology, Advance online publication. 10.1093/icb/icaa116
- Sartorelli V, & Lauberth SM (2020). Enhancer RNAs are an important regulatory layer of the epigenome. Nature Structural & Molecular Biology, 27(6), 521–528. 10.1038/s41594-020-0446-0
- Schoenfelder S, & Fraser P (2019). Long-range enhancer–promoter contacts in gene expression control. Nature Reviews Genetics, 20(8), 437–455. 10.1038/s41576-019-0128-0
- Schones DE, Cui K, Cuddapah S, Roh T-Y, Barski A, Wang Z, Wei G, & Zhao K (2008). Dynamic regulation of nucleosome positioning in the human genome. Cell, 132(5), 887–898. 10.1016/j.cell.2008.02.022
- Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, & Haussler D (2005). Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research, 15(8), 1034–1050.
- Skene PJ, & Henikoff S (2017). An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife, 6, e21856. 10.7554/eLife.21856
- Song L, & Crawford GE (2010). DNase-seq: A high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells. Cold Spring Harbor Protocols, 2010(2), 5384. 10.1101/pdb.prot5384
- Tippens ND, Vihervaara A, & Lis JT (2018). Enhancer transcription: What, where, when, and why? Genes and Development, 32(1), 1–3. 10.1101/gad.311605.118
- Verd B, Monk NAM, & Jaeger J (2019). Modularity, criticality, and evolvability of a developmental gene regulatory network. eLife, 8, e42832. 10.7554/eLife.42832
- Visel A, Blow MJ, Li Z, Zhang T, Akiyama JA, Holt A, Plajzer-Frick I, Shoukry M, Wright C, Chen F, Afzal V, Ren B, Rubin EM, & Pennacchio LA (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature, 457(7231), 854–858. 10.1038/nature07730
- Wang T, Li B, Nelson CE, & Nabavi S (2019). Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data. BMC Bioinformatics, 20(1), 40. 10.1186/s12859-019-2599-6
- Wang L, Sun F, Wan ZY, Ye B, Wen Y, Liu H, Yang Z, Pang H, Meng Z, Fan B, Alfiko Y, Shen Y, Bai B, Lee M, Piferrer F, Schartl M, Meyer A, & Yue GH (2021). Genomic basis of striking fin shapes and colors in the fighting fish. Molecular Biology and Evolution, 38(8), 3383–3396. 10.1093/molbev/msab110
- Wang X-P, Suomalainen M, Felszeghy S, Zelarayan LC, Alonso MT, Plikus MV, Maas RL, Chuong CM, Schimmang T, & Thesleff I (2007). An integrated gene regulatory network controls stem cell proliferation in teeth. PLoS Biology, 5(6), e159. 10.1371/journal.pbio.0050159
- Wittkopp PJ, & Kalay G (2012). Cis-regulatory elements: Molecular mechanisms and evolutionary processes underlying divergence. Nature Reviews Genetics, 13(1), 59–69. 10.1038/nrg3095
- Wray GA (2007). The evolutionary significance of cis-regulatory mutations. Nature Reviews Genetics, 8(3), 206–216. 10.1038/nrg2063
- Yu G, Wang L-G, & He Q-Y (2015). ChIPseeker: An R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics, 31(14), 2382–2383. 10.1093/bioinformatics/btv145
- Zhang L, Mazo-Vargas A, & Reed RD (2017). Single master regulatory gene coordinates the evolution and development of butterfly color and iridescence. Proceedings of the National Academy of Sciences of the United States of America, 114(40), 10707–10712. 10.1073/pnas.1709058114
Data Availability Statement
Data availability statement is not applicable.
