Molecular Biology and Evolution. 2025 Aug 6;42(10):msaf194. doi: 10.1093/molbev/msaf194

Flexibility in Gene Coexpression at Developmental and Evolutionary Timescales

Eva K Fischer 1, Youngseok Song 2, Wen Zhou 3, Kim L Hoke 4
Editor: Aurelien Tellier
PMCID: PMC12486243  PMID: 40794780

Abstract

The explosion of next-generation sequencing technologies has allowed researchers to move from studying single genes to studying thousands of genes, and thereby to also consider the relationships within gene networks. Like others, we are interested in understanding how developmental and evolutionary forces shape the expression of individual genes, as well as the interactions among genes. In pursuing these questions, we confronted the central challenge that standard approaches fail to control the Type I error and/or have low power in the presence of high dimensionality (i.e. large number of genes) and small sample size, as in many gene expression studies. To overcome these challenges, we used random projection tests and correlation network comparisons to characterize differences in network connectivity and density. We detail central challenges, discuss sample size guidelines, and provide rigorous statistical approaches for exploring coexpression differences with small sample sizes. We apply these approaches in a species known for rapid adaptation—the Trinidadian guppy (Poecilia reticulata)—and find evidence for coexpression network differences at developmental and evolutionary timescales. Our findings suggest that flexibility in gene coexpression relationships could promote evolvability.

Keywords: gene expression, Poecilia reticulata, guppy, network analysis, evolvability

Introduction

Genes neither act nor evolve in isolation. Rather, genes are members of physically and functionally interacting networks. There is long-standing interest in how the nature of these interactions influences the extent to which gene sequence and expression changes are constrained at developmental and evolutionary timescales and thereby influence the evolvability of higher-order phenotypes. On the one hand, genes with many interaction partners (i.e. those that occupy central “hub” positions in a network) may be targets of developmental switches and evolutionary selection because they most strongly influence network output and phenotypic change (Chateigner et al. 2020; Friedman et al. 2020). Alternatively, expression changes in highly connected genes may be constrained by pleiotropic effects imposed by their many connections (Jeong et al. 2001; Hahn and Kern 2005), and developmental and evolutionary changes may therefore be more prevalent in peripheral genes that have lower connectivity and lower pleiotropic loads (Kim et al. 2007; Mähler et al. 2017). Despite their contrasting predictions, both hypotheses rely on the assumption that connections among genes—and therefore network topology—are stable. If the relationships among genes are instead flexible at developmental and evolutionary timescales, gene network position may not be a good predictor of evolvability.

The role of interactions among genes in evolutionary processes has historically been difficult to test because physical interaction networks were well-characterized in only very few species (e.g. protein networks in yeast [Jeong et al. 2001; Hahn et al. 2004; Jovelin and Phillips 2009]) and simultaneously surveying expression in large numbers of genes was challenging, if not impossible. The proliferation of next-generation sequencing technologies, and specifically RNA-sequencing (hereafter RNAseq), has removed these constraints from a technical perspective. Although network and coexpression analyses of RNAseq datasets remain less common than gene-wise differential expression analyses, studies in this area provide intriguing, albeit conflicting, results. Some studies support the idea that genes in central network positions are evolutionarily constrained while those in the network periphery are more evolvable: centrality in coexpression networks is negatively correlated with divergence in gene expression (Warnefors and Kaessmann 2013; Mähler et al. 2017; Kuo et al. 2023) and sequence evolution (Josephs et al. 2017; Masalia et al. 2017; Harnqvist 2021), whereas genes in peripheral positions show greater magnitude expression divergence (Mähler et al. 2017) and signatures of positive selection (Kim et al. 2007; Mähler et al. 2017). In contrast, other studies provide evidence for a bias toward changes in the expression of and selection on genes with high centrality in coexpression networks (Koubkova-Yu et al. 2018; Chateigner et al. 2020; Friedman et al. 2020; Rennison and Peichel 2022), supporting the alternative hypothesis that central genes better predict phenotypic variation and are therefore targets of selection. Attempts to resolve these conflicting patterns include hypotheses that preferential divergence in central vs peripheral genes may be associated with different types of traits (Warnefors and Kaessmann 2013; Des Marais et al. 2017) or distinct selective modes and timescales (Luisi et al. 2015), or that intermediate levels of pleiotropy facilitate evolution (Hämälä et al. 2020). These ideas are intriguing yet also rely on the assumption that connections among genes are stable rather than flexible.

If the relationships among genes are instead flexible, then the degree of pleiotropy and its presumed consequences are not fixed, and flexibility in coexpression relationships among genes could reduce pleiotropic load (e.g. Wang et al. 2010; Pavlicev and Wagner 2012; Pavličev and Cheverud 2015). In other words, individual genes may be more able to change in expression and drive phenotypic change if their interactions with other genes can be altered to minimize off-target effects. Conversely, flexibility in coexpression relationships might improve the ability of underlying gene expression networks to buffer higher-level phenotypes through homeostatic change (e.g. Fischer et al. 2016a; Badyaev 2018; Hoke et al. 2019). Importantly, either scenario implies that the relationships among genes may themselves be targets of selection. Alternatively, changes in gene coexpression could represent drift and/or transcriptional noise if these changes do not amount to selectable differences at the network and/or organismal level. While this last scenario is less interesting from an adaptationist perspective, such “neutral” changes may nonetheless have consequences for evolutionary trajectories, for example by giving rise to cryptic variation that is revealed under novel environmental conditions (West-Eberhard 2003; McGuigan and Sgrò 2009; Paaby and Rockman 2014). In brief, all the above alternatives highlight that coexpression relationships could vary at developmental and evolutionary timescales with consequences in the short- and long-term.

Interest in coexpression analyses is being met by a growing collection of tools for (differential) gene coexpression analysis (Wang et al. 2017; Chowdhury et al. 2020; Tommasini and Fogel 2023). While these software packages make advanced network analyses accessible, they do not eliminate the statistical limitations of these approaches. These limitations arise primarily from the combination of small sample sizes and high-dimensional data (tens of thousands of genes) emblematic of transcriptomic studies. Sample sizes have increased as sequencing costs have decreased, yet per-group sample sizes commonly remain less than 10. Pooling samples across experimental groups or multiple studies can bring the total experimental sample size into the range recommended for network analyses (e.g. N = 20; Langfelder and Horvath 2008; Ballouz et al. 2015); however, this solution is not viable when the goal is to test if experimental groups differ in coexpression structure. Moreover, correlations in expression derived from experimental groups that differ in their underlying transcriptional networks are particularly hard to interpret. This leaves researchers trapped between an experimental rock and a hard place.

Like others, we are interested in using transcriptomic analyses to understand the biological basis of complex phenotypes, and specifically in exploring changes in individual genes as well as the interactions among genes. To this end, we characterized the effects of genetic background (high-predation [HP] vs low-predation [LP] populations) and developmental environment (rearing with and without predator chemical cues) on brain gene coexpression in 2 parallel, independent evolutionary lineages of Trinidadian guppies (Poecilia reticulata). In Trinidad, downstream, HP fish have repeatedly and independently colonized upstream, LP environments (Gilliam et al. 1993; Barson et al. 2009; Willing et al. 2010; Fraser et al. 2015), leading to parallel adaptive changes in life history, morphology, and behavior (Reznick et al. 1990, 2001; Endler 1995; Reznick 1997; Magurran 2005). In other words, each river drainage represents a naturally replicated experiment demonstrating parallel phenotypic adaptation.

Prior work has demonstrated that both evolutionary history with predators and developmental experience with predators shape life history (Torres-Dowdall et al. 2012), morphology (Torres-Dowdall et al. 2012; Fischer et al. 2013; Ruell et al. 2013; Handelsman et al. 2014), physiology (Handelsman et al. 2013; Fischer et al. 2014), and behavior (Huizinga et al. 2009; Torres-Dowdall et al. 2012; Fischer et al. 2016b). Here, we extend a prior analysis (Fischer et al. 2021) that characterized developmental plasticity and genetic differences in transcriptome-wide expression levels using animals from different populations reared with and without exposure to predator cues (Fig. 1). To address plasticity and divergence in the coexpression patterns in these same data, we first asked whether connectivity patterns among genes differed based on evolved differences between lineages (i.e. river drainages), adaptive differences between HP and LP populations of the same lineage, and developmental differences arising from rearing with or without predators. In essence, these comparisons allowed us to test whether changes in gene coexpression are a feature of divergence across timescales (genetic and developmental) and parallel, independent evolutionary events. Following these comparisons, we further investigated whether connectivity influenced a gene's propensity for expression divergence.

Fig. 1.

Conceptual overview and interpretation. Top: Natural populations in different river drainages represent distinct evolutionary lineages. Middle: Overview of laboratory breeding design disentangling genetic and environmental effects. Bottom: Interpretation of comparisons of interest between experimental groups resulting from 2 × 2 breeding design shown above. HP = high-predation, LP = low-predation, pred+ = reared with predator chemical cues, pred− = reared without predator chemical cues. Modified from Fischer et al. (2021).

In pursuing these questions, we confronted the statistical challenges that standard approaches may fail to control Type I error and/or have low power when dimensionality is high (i.e. large number of genes) and sample size is small. From a technical perspective, we present a case study for those interested in (differential) coexpression with small per-group sample sizes. We discuss key challenges, set clear sample size guidelines to control Type I error while maintaining power, and provide rigorous statistical approaches for exploring coexpression differences even with small sample sizes. These methods do not replace other, popular approaches but rather complement them to ensure analysis outcomes can be rigorously interpreted. Using these statistical methods, we find evidence for differences in coexpression relationships based on both genetic background and rearing environment, suggesting that changes in the interactions among genes are associated with phenotypic divergence at developmental and evolutionary timescales.

Methods

Fish Collection and Rearing

Samples here are the same as those described in Fischer et al. (2021). Guppies used in this study were second-generation laboratory-born fish from unique family lines established from wild-caught HP and LP populations in the Aripo and Quare river drainages in the Northern Range mountains of Trinidad. At birth, we split second-generation siblings from each unique family line into rearing environments with (pred+) or without (pred−) predator chemical cues, and they remained in these environments until the completion of the experiment (Fig. 1) (as in Fischer et al. 2016b). Guppies were individually housed on a 12:12 h light cycle and fed a measured food diet once daily. All experimental methods were approved by the Colorado State University Animal Care and Use Committee (Approval #12-3818A).

In brief, each drainage consists of a 2 × 2 factorial design that distinguishes genetic from developmental effects of predation (Fig. 1). Pairwise comparisons of biological relevance are: (1) HP pred− vs LP pred−, an experiment comparing populations reared in an environment lacking predator cues to identify genetic differences between populations; (2) HP pred+ vs HP pred−, to identify environmentally induced changes mimicking the situation in which HP fish colonize LP environments, i.e. “ancestral plasticity”; (3) LP pred− vs LP pred+, to identify environmentally induced changes comparable to the situation in which LP fish are washed downstream and a measure of whether ancestral plasticity is maintained in the derived population; (4) HP pred+ vs LP pred+, to identify genetic differences when fish are raised with environmental cues of predation. We also compared the same experimental groups across drainages (e.g. HP pred+ in Aripo drainage vs HP pred+ in Quare drainage) to understand differences associated with parallel, independent evolutionary lineages.

Tissue Collection and Processing

We collected brain tissue from mature males in the groups described above 10 min after lights on in the morning. For each experimental group, all samples represent individuals from unique family lines with siblings paired across rearing treatments. We extracted RNA from whole brains using the Qiagen RNeasy Lipid Tissue Mini Kit (Qiagen, Germany) and constructed a sequencing library for each individual using the NEBNext Ultra RNA Library Prep Kit for Illumina (New England Biolabs, Massachusetts, USA). For the Aripo dataset, 40 samples (N = 10 per group) were pooled with unique barcodes into 8 samples per sequencing library and each library was sequenced on a single lane. For the Quare dataset, 60 samples (N = 12 to 16 per group) were pooled into 3 sequencing libraries with 20 samples per pool and each library was sequenced in 2 separate lanes. Libraries were sequenced as 100 bp paired-end reads on an Illumina HiSeq 2000 at the Florida State University College of Medicine Translational Science Laboratory (Tallahassee, Florida, USA) in May 2014 (Aripo dataset) and January 2016 (Quare dataset).

Differential Expression Analysis

We reported results of differential expression analyses in another study (Fischer et al. 2021). We use normalized values and the resulting differential expression status (differentially expressed [DE] vs nondifferentially expressed [NDE]) as criteria in analyses performed here (see below). Briefly, we normalized read counts using DESeq2 (Love et al. 2014) and performed differential expression analysis using the lme4 package in R (github.com/lme4). We used generalized linear mixed models with a negative binomial link to accommodate our experimental design and data type. We included family and sampling week as random effects to identify DE genes for the fixed effects of population of origin (HP/LP), rearing environment (pred+/pred−), and their interaction. To label DE and NDE genes, we adjusted P-values for multiple hypothesis testing using a direct approach for false discovery rate (FDR) control (Storey 2002) as implemented in the fdrtool package in R (Strimmer 2008). We considered transcripts DE if the adjusted P-value was <0.05, and all other genes NDE.

Our previous work reported brain gene expression differences based on genetic and developmental influences in both drainages (Fischer et al. 2021). We found that genes exhibiting expression changes in response to rearing differences were also more likely to be DE between HP and LP populations. While this pattern was evident in both river drainages, sets of DE genes were largely nonoverlapping between lineages (Fischer et al. 2021). For 2-sample covariance tests and network comparisons (see below), we categorized genes as DE or NDE based on whether they were DE for the effect of population of origin within each drainage.

Preliminary Analyses

Our initial approach was to characterize and explore gene coexpression using the popular weighted gene correlation network analysis (WGCNA) package in R (Langfelder and Horvath 2008). We calculated module preservation scores using the methods implemented in WGCNA, which combine a number of different preservation statistics to calculate a summary preservation score (Langfelder et al. 2011). We found that, in both drainages, ∼50% of gene coexpression modules were not preserved across experimental groups (supplementary information A, Supplementary Material online). In other words, we identified substantial differences in network structure between groups, suggesting that the common practice of reconstructing coexpression networks by pooling samples across groups may not be valid. In brief, our preliminary analyses using WGCNA underscored the need for a statistical method to discern network differences across groups when dealing with small group sizes (N = 10 to 12 in our case). We sought to address these issues through the alternative statistical approaches detailed below, and further in the Methods section and supplementary information B, Supplementary Material online. Our approaches are not intended as a replacement for WGCNA and other similar packages but can be used in conjunction to ensure that WGCNA outputs can be rigorously interpreted.

Challenges From Small Sample, High-Dimensional Data

In wanting to understand differences in covariance structure and changes in network architecture between our experimental groups, we faced 2 fundamental challenges. First, controlling Type I error becomes difficult when comparing large networks or covariance structures with limited sample sizes, often leading to spurious discoveries and unreliable results. In addition, detecting true differences between complex networks or covariance structures in ultra-high-dimensional datasets is difficult due to low statistical power, a common problem in coexpression analysis of RNAseq studies. While there are commonly accepted methods for 2-sample covariance tests, they either fail to control the Type I error rate with extremely small sample sizes (e.g. N < 25 per group) or have substantially low power (see supplementary information B, Supplementary Material online for simulations) (Li and Chen 2012; Cai et al. 2013; Chang et al. 2017; Yu et al. 2022). Second, comparison of multiple networks is nontrivial, particularly when the networks are of different sizes or have unmatched nodes (Tang et al. 2017; Agterberg et al. 2020; Qi et al. 2024). For gene coexpression analysis these issues apply, for example, when sample sizes vary between groups and are small due to experimental constraints, and when comparing subsets of genes of interest that vary in size (e.g. comparing DE vs NDE gene sets, or differently sized coexpression modules) (Agterberg et al. 2020; Alyakin et al. 2024; Jin et al. 2024; Qi et al. 2024).

To highlight these challenges, we first conducted simulation studies using existing high-dimensional methods designed for valid inference on comparing large covariance structures with controlled Type I error rate and reasonable power (Li and Chen 2012; Cai et al. 2013; Chang et al. 2017; Yu et al. 2022), before implementing our method and examining the real data set (see below). We summarize the outcomes of the simulation studies and exploratory comparisons here and refer the interested reader to additional details provided in supplementary information B, Supplementary Material online.

We found that Type I error rates for existing tests were uncontrolled for sample sizes of N < 50 per group, even when the number of genes was relatively small (250 genes, orders of magnitude smaller than what is typical for RNAseq analysis) (supplementary fig. S1, Supplementary Material online). In addition to uncontrolled Type I error, the existing methods were substantially underpowered for small sample sizes. Specifically, the empirical power was overall low (<0.25) for sample sizes N < 30 per group (supplementary fig. S2, Supplementary Material online). These issues plagued our dataset, which is representative of most RNAseq studies exploring connections between gene expression and behavior (per-group sample size N < 10, ∼20,000+ genes). Importantly, these issues are not resolved by subsampling the data to include a smaller number of genes (supplementary fig. S3, Supplementary Material online), an approach commonly deployed by network analysis packages (e.g. filtering for the 5,000 to 8,000 most variable genes in WGCNA or the approach of [Qiu et al. 2021]).

Our New High-Dimensional Covariance Comparison

The task of determining whether a common covariance structure can be assumed between 2 groups for downstream analysis can be addressed using 2-sample covariance tests. If this assumption is supported by the data, pooling the samples across groups can improve statistical efficiency in recovering the underlying network and provide a better understanding of its structure. However, from the above simulations, it was clear that existing approaches to compare large covariance structures fail even when using only a small subset of genes. To overcome these issues, we extended the random projection-based covariance test (Wu and Li 2020) to develop a new 2-sample comparison method suitable for our data (see the Results section and supplementary information B, Supplementary Material online).

We deployed our newly developed random projection-based tests on residuals from the linear mixed model described above and in Fischer et al. (2021). We used residuals to remove the mean effects of population of origin and rearing environment, thereby allowing us to focus on the underlying coexpression patterns among genes, rather than differences in mean expression, when studying dependency structures among genes. We focused on pairwise comparisons of biological interest (Fig. 1 bottom row and described above). Within each drainage, we compared (1) HP pred− vs LP pred−, (2) HP pred+ vs HP pred−, (3) LP pred− vs LP pred+, and (4) HP pred+ vs LP pred+. We also compared the same experimental groups across drainages (e.g. HP pred+ in Aripo drainage vs HP pred+ in Quare drainage) to understand differences associated with parallel, independent evolutionary lineages. For both within and between drainage comparisons, we considered the 4 comparisons jointly to control family-wise error rate.

Correlation Network Comparisons

We used correlation network analyses to characterize differences in network structure that accompany the changes in covariance matrices. Because differential expression might drive differential coexpression patterns, we tested whether DE vs NDE genes differed in their connectivity with other genes and analyzed genetic and developmental influences on network structure separately in DE and NDE gene sets. We used the DE and NDE characterizations determined previously using generalized linear models as summarized above, although we note that the sets of DE genes were largely nonoverlapping between evolutionary lineages (drainages). A challenge with this analysis was the lack of consensus to define networks, in addition to the fact that derived networks will usually have very different sizes and unmatched nodes (i.e. NDE genes far outnumber DE genes, a single gene is inherently in only 1 category, and the DE genes in the Quare drainage outnumbered those in Aripo).

The network comparisons involved 2 steps: (i) reconstruction of the coexpression network using gene-wise correlations and (ii) comparing 2 networks of different sizes. To achieve (i), we first tested whether the correlation between each pair of genes was zero, while controlling the FDR using the method from Cai and Liu (2016), as detailed in supplementary information B, Supplementary Material online. Using this method, an undirected edge is drawn between any 2 genes (nodes) with nonzero correlation, forming the coexpression network. To assess the constructed network's sensitivity to different FDR levels, we compared network summary plots at multiple FDR cutoffs (α = 0.01, 0.05, 0.1). If 2 networks are distinct, their summary plots will differ (Maugis et al. 2017). The network summary plots (supplementary fig. S6, Supplementary Material online) suggest that the correlation-based coexpression network is relatively insensitive to different FDR levels. Therefore, we used a coexpression network with α = 0.05 for all subsequent analyses.
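The two reconstruction steps above (test every gene pair for zero correlation, then keep the pairs that survive FDR control as edges) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Cai and Liu (2016): it assumes a Fisher z-transform P-value for each Pearson correlation and substitutes the Benjamini-Hochberg step-up rule for FDR control. The function names (`build_network`, `benjamini_hochberg`) and the toy data are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_pvalue(r, n):
    """Two-sided P-value for H0: correlation = 0 (Fisher z approximation, an assumption)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(pvals, alpha):
    """Indices of hypotheses rejected by the BH step-up rule at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff = rank
    return set(order[:cutoff])

def build_network(expr, alpha=0.05):
    """expr: one list of per-sample values per gene. Returns undirected edges (i, j), i < j."""
    n_samples = len(expr[0])
    pairs = [(i, j) for i in range(len(expr)) for j in range(i + 1, len(expr))]
    pvals = [correlation_pvalue(pearson(expr[i], expr[j]), n_samples) for i, j in pairs]
    return {pairs[k] for k in benjamini_hochberg(pvals, alpha)}

# Toy data: 6 genes x 12 samples; genes 0 and 1 are tightly coexpressed.
rng = random.Random(1)
base = [rng.gauss(0, 1) for _ in range(12)]
toy = [base, [b + 0.05 * rng.gauss(0, 1) for b in base]]
toy += [[rng.gauss(0, 1) for _ in range(12)] for _ in range(4)]

# Sensitivity check across the FDR cutoffs mentioned in the text.
for alpha in (0.01, 0.05, 0.1):
    print(alpha, sorted(build_network(toy, alpha)))
```

Comparing the edge sets returned at several cutoffs is a simple way to check how sensitive a reconstructed network is to the chosen FDR level.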

We compared network pairs of interest to examine their structural differences, using a network summary plot and 2-sample network tests based on relative frequencies of different subgraphs adjusted for the sparsity (density of edges) of networks (Maugis et al. 2017; Shao et al. 2025). A major concern in comparing DE vs NDE networks, which has been largely overlooked in the literature, is that the different collections of genes in these 2 sets (i.e. DE genes are a small subset of all genes) make the 2 corresponding networks have different numbers of unmatchable nodes (Qi et al. 2024). To address this, we adopted the network comparison test proposed by Shao et al. (2025), which accommodates networks of different sizes by analyzing their network moments for specific motifs using the difference of 2 subgraph densities adjusted for their edge densities (additional details in supplementary information B, Supplementary Material online). Comparing subgraph densities provides a measure of network topology and sparsity that can be visualized using network summary plots. To help readers interpret plots of the real data, we provide a set of example networks and their corresponding network summary plots (Fig. 2 and Glossary).

Fig. 2.

Example network summary plots. These examples are intended to help readers unfamiliar with network summary plots interpret summary plots of the real data. Each row shows a type of synthetic network (visualized using igraph with 100 vertices), its corresponding network summary plot (based on a 1,000-vertex network with subgraph densities computed from 200-node subsamples), and a brief interpretation. Vertex sizes (i.e. colored dots) in the visualizations are proportional to their degrees (i.e. number of connections). Tree-like structures (v-shapes, also known as 2 stars) and cycles (triangles, squares, pentagons, etc.) in the summary plot provide information on network topology. In the network summary plots, three features matter: the y-axis value (higher values indicate higher edge densities), the overall pattern (which subgraph type is most abundant), and the relative values (ratio of abundance between subgraph types). We note that networks constructed under different models look more or less different depending on the specific parameters of the real or simulated data. See also supplementary information, Supplementary Material online, for additional discussion of network models chosen here.

We made various comparisons of the different subgraphs to understand differences based on rearing environment, population of origin, and evolutionary lineage. We conducted 1-sided comparisons for each of the 3 subgraph motifs (v-shape, triangle, and 3-star). Comparisons fell into 4 general categories: (i) comparing DE and NDE gene sets within each experimental group, (ii) comparing developmental differences (i.e. rearing with [pred+] or without [pred−] predators) for DE and NDE gene sets, (iii) comparing population differences (i.e. HP vs LP) for DE and NDE gene sets, and (iv) comparing experimental groups across evolutionary lineages (i.e. Aripo vs Quare drainage) for DE and NDE gene sets.

Results

Changes in Coexpression Networks based on Genetics and Environment

We were interested in comparing gene coexpression patterns based on genetic background and rearing environment. To overcome problems associated with limited sample sizes yet high-dimensional data, we used random projection-based tests to compare covariance structures between experimental groups (Wu and Li 2020). Instead of testing 2 large covariance matrices directly, these tests compute random projections of the data into lower dimensional spaces and test the equality of variances. Previous work (Wu and Li 2020) deployed this approach in 1-dimensional space, and we developed a generalized version for multidimensional space (further details in supplementary information B, Supplementary Material online). Using simulations, we confirmed this approach controls Type I error (supplementary fig. S4, Supplementary Material online) while maintaining power (supplementary fig. S5, Supplementary Material online).
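The core idea described above, projecting high-dimensional expression data onto random low-dimensional directions and comparing variances there, can be illustrated with a toy example. The sketch below is not the test developed in this study (its details are in supplementary information B): it assumes 1-dimensional projections, uses the largest absolute log variance ratio across projections as the statistic, and calibrates that statistic by permuting group labels. All function names, the number of projections `k`, the number of permutations `n_perm`, and the simulated data are hypothetical.

```python
import math
import random

def center(group):
    """Subtract each gene's group mean so only (co)variance structure remains."""
    n, p = len(group), len(group[0])
    means = [sum(row[j] for row in group) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in group]

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def max_log_variance_ratio(proj, idx1, idx2):
    """Largest |log(var1 / var2)| over all projection directions."""
    best = 0.0
    for j in range(len(proj[0])):
        v1 = sample_variance([proj[i][j] for i in idx1])
        v2 = sample_variance([proj[i][j] for i in idx2])
        best = max(best, abs(math.log(v1 / v2)))
    return best

def projection_test(x, y, k=20, n_perm=500, seed=0):
    """x, y: samples-by-genes lists for two groups. Returns a permutation P-value."""
    rng = random.Random(seed)
    pooled = center(x) + center(y)
    p = len(pooled[0])
    dirs = []
    for _ in range(k):  # k random unit vectors in gene space
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(a * a for a in v))
        dirs.append([a / norm for a in v])
    # Project every sample once; permutations only reshuffle group labels.
    proj = [[sum(a * b for a, b in zip(row, d)) for d in dirs] for row in pooled]
    idx = list(range(len(pooled)))
    n1 = len(x)
    observed = max_log_variance_ratio(proj, idx[:n1], idx[n1:])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if max_log_variance_ratio(proj, idx[:n1], idx[n1:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: 10 samples x 50 genes per group.
rng = random.Random(42)
group_a = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(10)]
group_b = []
for _ in range(10):
    shared = rng.gauss(0, 1)  # shared factor induces coexpression in group B only
    group_b.append([3 * shared + rng.gauss(0, 1) for _ in range(50)])
print("P-value:", projection_test(group_a, group_b))
```

Because each sample is projected only once, the cost of the permutation step does not grow with the number of genes, which is the practical appeal of working in the projected space.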

Applying our method to the real data, we considered the set of all genes that passed filtering criteria (Aripo: 13,446; Quare: 14,379). We found significant differences in coexpression structure between HP and LP fish reared with predators (HP pred+ vs LP pred+) in both drainages (Fig. 3, Table 1). Analysis of the Quare dataset found a marginally significant difference between HP fish reared with and without predators (HP pred+ vs HP pred−). We also compared the coexpression structures between the same treatment groups across drainages. Here, we found significant differences in all comparisons (Table 2). In short, when considering all genes in the dataset, we found evidence for changes in gene coexpression based on evolutionary lineage (drainage), genetic background (population), and rearing environment.

Fig. 3.

Visualization of coexpression differences between experimental groups and evolutionary lineages. Heatmaps provide a visualization of coexpression patterns as Pearson's correlations for each experimental group (i.e. combination of genetic background and rearing environment; see Fig. 1) within each evolutionary lineage (Aripo and Quare drainage). Gene order is determined by hierarchical clustering of the HP pred− group, meaning that the same position in each heatmap represents the correlation of identical pairs of genes. For ease of visualization and computation, only the 1,000 most variable genes are shown. Colors indicate correlation strength from 1 (dark), through 0 (white), to −1 (red).

Table 1.

Approximated P-values from random projection tests comparing covariance structure for all genes (DE and NDE) between treatment groups

Comparison              ARIPO drainage   QUARE drainage
HP pred+ vs HP pred−    1.0000           0.0639
HP pred+ vs LP pred+    0.0149           <0.0001
HP pred− vs LP pred−    1.0000           0.9568
LP pred+ vs LP pred−    0.1258           0.9943

Table 2.

Approximated P-values from random projection tests comparing covariance structure across drainages

Group (ARIPO vs QUARE)   P-value
HP pred−                 <0.0001
HP pred+                 0.0001
LP pred−                 0.0001
LP pred+                 0.0107

Coexpression Networks and Differential Expression

To further characterize the changes in coexpression among genes, we compared the prevalence of network motifs across treatment groups. Because we reasoned that divergence in expression levels might itself impact networks, we first addressed whether coexpression networks differed between DE and NDE genes. We performed these comparisons separately for each treatment group, given group-level differences in covariance structures detailed above. Further, because our networks include gene sets of different sizes, we adopted network comparisons of subgraph densities adjusted for overall edge density (Shao et al. 2025).

We compared the v-shape (subgraphs with 3 nodes and 2 edges), triangle (subgraphs with 3 nodes and 3 edges), and 3-star (subgraphs with 4 nodes and 3 edges) prevalence within networks relative to the prevalences expected given edge densities (see visualizations in Figs. 2, 4, and supplementary fig. S9, Supplementary Material online). We chose these subgraphs, which are commonly used in the network literature as metrics of connectivity and clusterability (Tang et al. 2017; Agterberg et al. 2020; Qi et al. 2024), because all more complex motifs can be decomposed into these components. This set of motifs is therefore sufficient to capture network differences, with a greater abundance of complex motifs indicating higher network connectivity and density. Finding different patterns in DE and NDE gene sets within treatment groups (Fig. 4, supplementary table S10, Supplementary Material online), we compared network motifs among treatment groups separately in the DE and NDE gene sets. Overall, we found differences in the prevalence of v-shape, triangle, and 3-star motifs in coexpression networks based on genetic and environmental influences, but without a simple rule of directionality (Fig. 4, Tables 3 and 4).
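As an illustration of the three motifs defined above, the following minimal Python sketch counts v-shapes, triangles, and 3-stars from an adjacency matrix and expresses each count as a density relative to a complete graph, as described in the Glossary. The function names `subgraph_densities` and `relative_to_edge_density` are hypothetical and do not come from the study's code. The second function divides each density by the value expected in a random graph with the same edge density (an assumption made here for illustration); it is a crude stand-in for, not an implementation of, the sparsity-adjusted comparison of Shao et al. (2025).

```python
import numpy as np
from math import comb


def subgraph_densities(adj):
    """Motif densities for a simple undirected network.

    adj: symmetric 0/1 adjacency matrix with a zero diagonal and at least 4 nodes.
    Counts are non-induced, so the v-shapes inside a triangle are also counted.
    """
    A = np.asarray(adj, dtype=np.int64)
    n = A.shape[0]
    deg = A.sum(axis=1)
    n_edges = int(deg.sum()) // 2
    # v-shape: a centre node plus any 2 of its neighbours
    n_vshape = sum(comb(int(d), 2) for d in deg)
    # triangle: trace(A^3) counts every triangle 6 times
    n_triangle = int(np.trace(A @ A @ A)) // 6
    # 3-star: a centre node plus any 3 of its neighbours
    n_star = sum(comb(int(d), 3) for d in deg)
    # Denominators are the motif counts in a complete graph on n nodes
    return {
        "edge": n_edges / comb(n, 2),
        "v-shape": n_vshape / (n * comb(n - 1, 2)),
        "triangle": n_triangle / comb(n, 3),
        "3-star": n_star / (n * comb(n - 1, 3)),
    }


def relative_to_edge_density(dens):
    """Illustrative assumption: compare each density with the value expected
    if edges were placed independently with probability p = edge density."""
    p = dens["edge"]
    return {
        "v-shape": dens["v-shape"] / p**2,
        "triangle": dens["triangle"] / p**3,
        "3-star": dens["3-star"] / p**3,
    }


if __name__ == "__main__":
    # Toy network: node 0 connected to nodes 1, 2, and 3 (a single 3-star)
    star = np.zeros((4, 4), dtype=int)
    star[0, 1:] = 1
    star[1:, 0] = 1
    dens = subgraph_densities(star)
    print(dens)
    print(relative_to_edge_density(dens))
```

Ratios above 1 indicate that a motif is more common than the edge density alone would predict, which is the sense in which networks of different sizes and sparsity can be placed on a common scale.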

Fig. 4.

Network summary plots of correlation networks. Plots provide a visualization of network topology and sparsity for networks of genes that did (DE, yellow) or did not (NDE, teal) differ in mean expression levels between the 2 populations in each river drainage. Subgraph shapes are depicted on the x-axis. See Tables 3 and 4 for outcomes of statistical comparisons between groups.

Table 3.

Network comparisons of treatment groups for DE and NDE genes across datasets

                      ARIPO drainage                    QUARE drainage
Gene set  Group   V-shape   Triangle  3-star    V-shape   Triangle  3-star
HP < LP
DE        Pred−   0.0003    0.5098    0.3972    <0.0001   <0.0001   0.0088
DE        Pred+   0.7982    0.3333    0.2545    <0.0001   <0.0001   0.1664
NDE       Pred−   <0.0001   <0.0001   <0.0001   <0.0001   1         1
NDE       Pred+   0.0327    0.3644    0.0021    <0.0001   1         1
HP > LP
DE        Pred−   0.9997    0.4902    0.6028    1         1         0.9912
DE        Pred+   0.2018    0.6667    0.7455    1         1         0.8336
NDE       Pred−   1         1         1         1         <0.0001   <0.0001
NDE       Pred+   0.9673    0.6356    0.9979    1         <0.0001   <0.0001
Pred+ < Pred−
DE        HP      0.5407    0.0001    0.0005    <0.0001   1.0000    1.0000
DE        LP      <0.0001   0.0013    0.0172    <0.0001   1.0000    1.0000
NDE       HP      <0.0001   <0.0001   <0.0001   <0.0001   1         1
NDE       LP      <0.0001   <0.0001   <0.0001   1         1         1
Pred+ > Pred−
DE        HP      0.4593    0.9999    0.9995    1         <0.0001   <0.0001
DE        LP      1         0.9987    0.9828    1         <0.0001   <0.0001
NDE       HP      1         1         1         1         <0.0001   <0.0001
NDE       LP      1         1         1         <0.0001   <0.0001   <0.0001

Comparisons of sparsity-adjusted subgraph densities tested the alternatives that gene networks had smaller or larger subgraph density than other networks based on differences in population of origin and rearing environment. P-values from 1-sided alternative tests are reported for the v-shape, triangle, and 3-star subgraph types.

Table 4.

Network comparisons of datasets within treatment groups for DE and NDE genes

ARIPO vs QUARE
Gene set  Group      V-shape   Triangle  3-star
Aripo < Quare
DE        HP pred+   <0.0001   <0.0001   <0.0001
DE        HP pred−   <0.0001   0.0053    <0.0001
DE        LP pred−   <0.0001   <0.0001   <0.0001
DE        LP pred+   <0.0001   <0.0001   <0.0001
NDE       HP pred+   <0.0001   <0.0001   <0.0001
NDE       HP pred−   <0.0001   <0.0001   0.9928
NDE       LP pred−   <0.0001   1         1
NDE       LP pred+   <0.0001   <0.0001   0.2933
Aripo > Quare
DE        HP pred+   1         1         1
DE        HP pred−   1         0.9947    1
DE        LP pred−   1         1         1
DE        LP pred+   1         1         1
NDE       HP pred+   1         1         1
NDE       HP pred−   1         1         0.0072
NDE       LP pred−   1         <0.0001   <0.0001
NDE       LP pred+   1         1         0.7067

Comparisons of sparsity-adjusted subgraph densities tested the alternatives that gene networks for each treatment group in 1 drainage had smaller or larger subgraph density than networks for the same treatment group in the other drainage. P-values from 1-sided alternative tests are reported for the v-shape, triangle, and 3-star subgraph types.

For groups that differed based on developmental experience, we found pronounced coexpression differences in both drainages but generally in opposite directions: in the Aripo drainage more complex subnetwork motifs were more abundant in fish reared without predator cues while in the Quare drainage complex motifs were more abundant in fish reared with predators (Fig. 4, Table 3). These patterns were largely concordant for DE and NDE gene sets within both drainages.

Population divergence in coexpression networks was also apparent, again with distinct patterns in the 2 lineages. In the Aripo drainage, HP fish from both rearing environments had generally fewer complex network motifs among NDE genes and no differences among DE genes (Fig. 4, Table 3). In the Quare drainage, this pattern was flipped, with more complex motifs among NDE genes in HP fish, and more complex motifs among DE genes in LP fish (Fig. 4, Table 3).

Differences between the 2 drainages were also pronounced, with the Aripo drainage having overall fewer complex network motifs, apart from a few differences in the opposing direction for triangles and 3-star motifs among NDE genes (Fig. 4, Table 4).

Discussion

Our goal in this study was to understand how genetic background and rearing environment shape relationships among genes. We previously characterized expression changes at the level of individual genes (Fischer et al. 2021), and here we were interested in exploring changes in coexpression patterns among genes. Although exciting from a biological perspective, these questions present statistical challenges. We applied random projection tests to assess whether overall covariance structures differ between groups, and we used network summary plots and network comparison tests to examine and compare the topology of constructed networks. Our findings suggest that coexpression patterns are flexible at evolutionary and developmental timescales. We discuss the implications of our work from both statistical and biological angles.

Overcoming Sample Size Constraints in High-Dimensional Data

Gene expression studies remain plagued by small per-group sample sizes and high dimensionality. Network construction is far from trivial, if not problematic, under these conditions, especially when network structure (and not just network expression level) differs among experimental groups. In our own study, we had an overall sample size of N = 98 individuals, well above the recommendation of N = 30 for network construction. However, this total sample size includes samples from 2 drainages and 4 experimental groups, and, based on our analyses here and preliminary analyses using the WGCNA package, we found evidence that network structure differs between experimental groups and even more strongly between drainages. These differences are of key biological interest as they suggest that expression relationships among genes (i.e. network structure) are subject to developmental plasticity and evolutionary divergence. However, if network structure differs across experimental groups, then networks must be constructed separately for each experimental group to avoid construction of “average” networks that can obscure differences of biological interest and lead to biased conclusions (Zhao et al. 2014; Shojaie 2021; Li et al. 2023). To take an extreme example, if 2 genes have opposing correlations of the same magnitude in 2 groups, the average correlation across groups will be zero. Thus, it is the per-group sample size that is most important for network construction and comparison when gene coexpression patterns are of interest.
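The extreme example above can be made concrete with a small simulation. The sketch below is illustrative only and uses made-up values rather than data from this study: the function name `pooled_vs_group_correlation`, the group size, and the noise level are all assumptions. Group B is constructed as the mirror image of group A so that the two within-group correlations are equal in magnitude and opposite in sign.

```python
import numpy as np


def pooled_vs_group_correlation(n_per_group=15, noise_sd=0.3, seed=0):
    """Return (r_group_a, r_group_b, r_pooled) for two hypothetical genes."""
    rng = np.random.default_rng(seed)
    # Gene 1 takes the same values in both groups (no difference in mean expression)
    gene1 = np.linspace(-1.0, 1.0, n_per_group)
    # Group A: gene 2 tracks gene 1; group B: gene 2 is the mirror image
    gene2_a = gene1 + rng.normal(0.0, noise_sd, n_per_group)
    gene2_b = -gene2_a
    r_a = np.corrcoef(gene1, gene2_a)[0, 1]
    r_b = np.corrcoef(gene1, gene2_b)[0, 1]
    # Pooling the samples before estimating the correlation
    r_pooled = np.corrcoef(
        np.concatenate([gene1, gene1]),
        np.concatenate([gene2_a, gene2_b]),
    )[0, 1]
    return float(r_a), float(r_b), float(r_pooled)


if __name__ == "__main__":
    r_a, r_b, r_pooled = pooled_vs_group_correlation()
    print(f"group A r = {r_a:.3f}, group B r = {r_b:.3f}, pooled r = {r_pooled:.3f}")
```

A network built from the pooled samples would place no edge between these 2 genes, even though they are strongly coexpressed within each group.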

While our per-group sample size of N = 10 to 15 is relatively large for an RNAseq study, it is below the recommended threshold for network construction, such as the minimum sample of N = 20 suggested for RNAseq analyses by Langfelder and Horvath (2008) and Ballouz et al. (2015). As we illustrate, these sample sizes are surprisingly inadequate for recently developed statistical tests thought to be robust against high dimensionality, to control Type I error, and to maintain power. Indeed, from our simulation experiments, most common methods require N > 50 per group to retain the generally accepted nominal significance levels of 0.05 and satisfactory power exceeding 0.8. Importantly, the potential misinterpretations resulting from these shortcomings are not systematic (i.e. directionally biased) and therefore difficult to predict.

To address the challenges in 2-sample covariance testing, we employed a random projection-based test rather than existing methods grounded in large-sample asymptotic theory. As shown in supplementary information B, Supplementary Material online, our method effectively controls the Type I error rate and maintains reasonable power across various choices of projection dimensions and numbers of random projections, even when the sample size is as small as N = 10. The test results reveal statistically significant differences in the covariance matrices for several group pairs, indicating structural differences in their underlying networks. Our statistical approach uses non-data-driven projections, which offer several advantages: (i) unlike data-driven methods such as principal component analysis (PCA) or random-skewers or eigentensor decomposition-based approaches (Wang et al. 2019; Hu et al. 2025), the random projection-based test naturally preserves the null hypothesis; (ii) in the resulting low-dimensional projected space, powerful and robust test statistics, such as the U-statistics considered in this paper, can be easily constructed; and (iii) multiple random projections yield conditionally independent tests, whose aggregation (e.g. via the maximum operator in this work) enhances power. While we demonstrate the utility of this procedure through simulations, a formal theoretical investigation of these advantages is warranted and is left for future statistical research.
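The general logic of testing covariance differences after random projection can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used in this study: the statistic here is the Frobenius norm of the difference between projected sample covariance matrices (not the U-statistics described above), and the P-value is calibrated by permuting group labels. The function names, the projection dimension `k`, and the numbers of projections and permutations are all hypothetical choices. What the sketch shares with the text is that projections are drawn without reference to the data and that the per-projection statistics are aggregated with the maximum.

```python
import numpy as np


def cov_difference(z, in_first):
    """Frobenius norm of the difference between the two groups' covariance matrices."""
    s1 = np.atleast_2d(np.cov(z[in_first], rowvar=False))
    s2 = np.atleast_2d(np.cov(z[~in_first], rowvar=False))
    return np.linalg.norm(s1 - s2, ord="fro")


def random_projection_cov_test(x, y, k=5, n_proj=10, n_perm=199, seed=0):
    """x, y: samples-by-genes arrays. Returns (max statistic, permutation P-value)."""
    rng = np.random.default_rng(seed)
    # Centre each group so that mean differences do not masquerade as covariance differences
    data = np.vstack([x - x.mean(axis=0), y - y.mean(axis=0)])
    in_first = np.zeros(data.shape[0], dtype=bool)
    in_first[: x.shape[0]] = True
    p = data.shape[1]
    # Projections are random, i.e. chosen without looking at the data
    projected = [data @ (rng.normal(size=(p, k)) / np.sqrt(p)) for _ in range(n_proj)]

    def max_stat(labels):
        # Aggregate across projections with the maximum
        return max(cov_difference(z, labels) for z in projected)

    observed = max_stat(in_first)
    exceed = sum(max_stat(rng.permutation(in_first)) >= observed for _ in range(n_perm))
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(40, 200))        # 40 samples, 200 simulated genes
    y = 5.0 * rng.normal(size=(40, 200))  # same means, much larger variances
    stat, p_value = random_projection_cov_test(x, y, n_perm=99)
    print(f"max statistic = {stat:.2f}, permutation P-value = {p_value:.3f}")
```

Because each projected dataset has only `k` columns, the covariance matrices being compared are small and well estimated even when the number of genes far exceeds the number of samples, which is the practical appeal of working in the projected space.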

As a growing number of studies consider how interactions among genes shape phenotypic differences across timescales, we present our work as a case study to increase awareness of these limitations, as complementary statistical approaches to those commonly used, and in hopes that others will consider these issues in experimental design and analysis.

Evidence for Genetic and Developmental Differences in Gene Coexpression

Using the robust estimation methods we derived, we first identified differences based on both genetic background and developmental environment, with the most pronounced differences in both drainages between HP and LP fish reared with predators (HP pred+ vs LP pred+). This comparison represents the ancestral population adapted to life with predators (HP pred+) vs the derived LP population adapted to predator-free environments and suddenly re-exposed to predator cues (e.g. as when fish are washed downstream; LP pred+). Fish adapted to an LP life are poorly equipped to deal with the sudden stressors of predation. Indeed, we previously found HP pred+ fish to be behaviorally least variable and LP pred+ fish to be behaviorally most variable, both in single behaviors and in the correlations among them (Fischer et al. 2016b). In light of findings here, we suggest that disruption of gene coexpression networks could contribute to unpredictable behavioral patterns and correlations.

Beyond genetic and developmental differences within each evolutionary lineage, we found that differences were also ubiquitous when comparing between the 2 lineages, and more evidence for coexpression differences in the Quare as compared to the Aripo drainage. We suggest these patterns arise in part from the extent of genetic divergence between populations: populations within each drainage diverged <10,000 years ago with HP and LP populations in the Quare drainage showing greater genetic (Willing et al. 2010) and gene expression (Fischer et al. 2021) divergence than those in the Aripo drainage. The 2 drainages represent distinct evolutionary lineages that diverged ∼600,000 years ago (Willing et al. 2010). The importance of genetic background in shaping evolutionary trajectories is highlighted by our previous work demonstrating distinct underlying mechanisms associated with parallel phenotypic adaptation in guppies from distinct evolutionary lineages (Fischer et al. 2021). Similar mechanistic flexibility has also been demonstrated in other systems (Cordero et al. 2018; Jacobs et al. 2020), including those known for parallel phenotypic evolution (e.g. Laporte et al. 2015; Hanson et al. 2017; Bolnick et al. 2018). Our findings here extend these observations from the expression of individual genes to coexpression patterns among genes, suggesting that alternative gene expression network configurations can give rise to shared organism-level phenotypes.

The prevalence of coexpression differences in our dataset highlights the need to construct coexpression networks individually for each experimental group. If connectivity diverges at evolutionary timescales and shifts with developmental experience, then constructing a single network from pooled samples may confound differences in expression with differences in connectivity. For example, if a subset of DE genes has higher connectivity in experimental group A than in experimental group B, then connectivity patterns associated with differential gene expression may be obscured when an average network with intermediate connectivity is constructed. Conversely, and not mutually exclusively, genes that are DE between groups A and B could lead to the appearance of strong expression correlation (i.e. strong connectivity) overall, when in fact the within-treatment correlations are weak. In brief, pooling samples for network construction when underlying networks in fact differ across experimental groups may obscure precisely the differences researchers are interested in testing.
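The second scenario, in which differential expression alone creates an apparent correlation in pooled data, can be illustrated with a toy example. The sketch below uses constructed values rather than data from this study, and the function name `mean_shift_correlation` and the size of the shift are assumptions. Two hypothetical genes are exactly uncorrelated within each group, but both are expressed at a higher level in group B.

```python
import numpy as np


def mean_shift_correlation(shift=4.0, repeats=3):
    """Return (r_group_a, r_group_b, r_pooled) for two hypothetical DE genes."""
    # Within-group deviations chosen so the 2 genes are exactly uncorrelated
    dev1 = np.tile([-1.0, 1.0, -1.0, 1.0], repeats)
    dev2 = np.tile([-1.0, -1.0, 1.0, 1.0], repeats)
    group_a = np.column_stack([dev1, dev2])
    # Both genes are shifted upward by the same amount in group B
    group_b = group_a + shift
    r_a = np.corrcoef(group_a, rowvar=False)[0, 1]
    r_b = np.corrcoef(group_b, rowvar=False)[0, 1]
    # Pooling mixes the between-group mean difference into the correlation
    r_pooled = np.corrcoef(np.vstack([group_a, group_b]), rowvar=False)[0, 1]
    return float(r_a), float(r_b), float(r_pooled)


if __name__ == "__main__":
    r_a, r_b, r_pooled = mean_shift_correlation()
    print(f"group A r = {r_a:.3f}, group B r = {r_b:.3f}, pooled r = {r_pooled:.3f}")
```

Here the pooled correlation is driven entirely by the shared difference in group means, so a pooled network would contain an edge that exists in neither group-specific network.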

We note that divergence in gene expression and by extension coexpression reflects adaptive processes as well as nonadaptive and neutral processes (Whitehead and Crawford 2006; Lynch 2007). First, LP populations of guppies are particularly susceptible to drift, founder effects, and inbreeding depression because they are established by a small number of individuals and experience relaxed predator selection (Magurran 2005; Barson et al. 2009; Willing et al. 2010). Second, whether initial gene expression changes are dominated by adaptive or nonadaptive processes, further expression changes may be homeostatic or compensatory, buffering higher-level phenotypes from evolutionary change, and thereby leading to alternative transcriptional configurations associated with shared organismal phenotypes (e.g. Abouheif and Wray 2002; Crawford and Oleksiak 2007; Fischer et al. 2016b). Finally, some changes in coexpression may simply reflect transcriptional noise that is filtered out by downstream regulatory processes (e.g. translational regulation). Though certainly present in our dataset, transcriptional noise is unlikely to be a major driver of our results because random expression changes would need to be concordant across individuals to be detected as changes in coexpression. At present, we cannot distinguish between adaptive, nonadaptive, homeostatic, and neutral processes. Indeed, a combination of all factors most likely shapes gene expression and coexpression, and distinguishing among alternatives remains a challenge in transcriptomic work. Developing high-dimensional methods such as ours to detect coexpression differences in small sample sizes is an important first step in understanding the causes and consequences of coexpression changes across developmental and evolutionary timescales.

Probing Links Between Network Coexpression Differences, Developmental Plasticity, and Evolutionary Divergence

By comparing the prevalence of subgraph motifs across networks, we were able to make comparisons of network connectivity and clusterability among experimental groups. Our results indicate widespread differences in gene coexpression network structure (both connectivity and clusterability) at developmental and evolutionary timescales; however, these differences do not follow a directional rule. Prior work proposed alternative directional hypotheses, (1) genes in central network positions are evolutionarily constrained and hence less likely to diverge in mean expression levels (e.g. Jeong et al. 2001; Hahn and Kern 2005), and (2) expression changes in genes occupying central hub positions are more likely to have phenotypic consequences and hence more likely to be targets of selection (e.g. Chateigner et al. 2020; Friedman et al. 2020). Rather than supporting either consistently increased or decreased connectivity or clusterability, our findings provide a potential explanation for conflicting evidence surrounding the ongoing debate about whether hub genes are more or less likely to evolve: if coexpression relationships themselves can change both on developmental and short-term evolutionary time scales, then the constraints and/or advantages imposed by high vs low connectivity are not fixed. As a result, the association between network position and differential expression may vary across traits, taxa, and environments, and therefore studies.

We detected extensive plasticity and divergence in connectivity and clusterability in both DE and NDE gene sets, highlighting that network changes do not have a simple association with differential expression levels of individual genes. We analyzed DE and NDE networks separately with the idea that divergence and plasticity in mean expression levels of DE genes might be driving the covariance differences we report. As a simple example, covariance between 2 genes in 1 experimental group might be disrupted by the molecular or cellular mechanisms that increase or decrease the expression level of 1 of those genes during population divergence. Our results in Tables 3 and 4 refute the notion that differential gene expression is the primary driver of network differences, as even the genes that lacked detectable differences in mean expression levels (NDE genes) exhibited widespread divergence in coexpression subgraphs. This pattern raises key questions about the sources of coexpression flexibility, prompting future work analyzing the molecular and cellular mechanisms that drive these overall patterns, including changes in transcriptional regulation, mRNA stability, and changing proportions of cell types.

Conclusions

Understanding how underlying genetic architecture shapes the maintenance and evolution of complex traits is a fundamental goal of biological research. Over the past 2 decades, the explosion of next-generation sequencing technologies has allowed us to move beyond the genetic scale—considering 1 or a few genes or loci—to genomic scales—considering thousands to tens of thousands of genes or loci. Among the key advances afforded by these approaches are the ease of conducting broadscale, exploratory studies; the opportunity to characterize underlying mechanisms in nonmodel species; and the ability to consider genes in the context of their interactions. As a growing number of studies consider how interactions among genes shape phenotypic differences across timescales, we provide a case study to increase awareness of limitations and provide suggestions for analysis.

From a practical perspective, random projection tests and correlation network comparisons—whether via network summary plots (Maugis et al. 2017) or network comparison tests (Shao et al. 2025)—are not sequential steps but serve distinct analytical purposes. The random projection test is used to evaluate whether covariance matrices differ between groups, providing a formal statistical test of differences in underlying dependency structures directly from the data, without constructing or relying on any underlying networks. This test is appropriate for assessing the overall question of whether any differences in covariance structure exist between groups and informing decisions on subsequent analysis strategies. In contrast, upon the construction of correlation networks (using the method of Cai and Liu 2016), network summary plots and network comparison tests are employed to explore and characterize the structures of underlying networks. The network summary plots serve as exploratory tools, similar to scatter plots or histograms, for visualizing and intuitively comparing network topologies. While the summary plots can reveal structural patterns and differences, they do not provide formal statistical tests. To formally draw inference on network structure differences, we use the network comparison test, which offers a statistical evaluation of differences in network topology (e.g. in triangle density) and remains valid even for networks of different sizes and without requiring repeated observations. Thus, it is not required for the random projection test to be significant before conducting network comparisons, nor are the 2 steps intended to be applied sequentially as part of a pipeline. Instead, they complement each other by addressing distinct aspects of dependency structure.

Our findings provide intriguing evidence of extensive coexpression differences at multiple timescales in a species known for rapid adaptation and suggest that flexibility in gene coexpression relationships across time scales may contribute to evolutionary potential. This idea, its generality, and its consequences for adaptation will be revealed by more studies with larger sample sizes and new statistical approaches. We further identify divergence and plasticity in coexpression of genes that do not show differential expression, a pattern that raises many questions about the mechanistic basis of coexpression patterns and their divergence. We argue that pooling samples across experimental groups may obscure precisely the differential expression and connectivity differences that we seek to characterize. Determining whether and how relationships among genes change at developmental and evolutionary timescales has consequences for our understanding of how underlying mechanisms shape flexibility and robustness in higher-order phenotypes, how animals adapt to novel and changing environments, and how behavior and physiology are regulated in health and disease.

Acknowledgments

We thank the members of the Colorado State University Guppy Group for fish care and help with tissue collection and processing. We thank 3 anonymous reviewers and the editor for thoughtful comments that greatly improved the manuscript.

Glossary

adjacency matrix
A common means of representing a network using a matrix in which the entries correspond to edges of the network.
graph/network (G)
Made up of nodes (vertices) and the connections between them (edges). Networks can be defined by a vertex set (V) and an edge set (E) as G := (V, E).
network summary plot
A plot that provides a scalable and model-free graphical summary of undirected networks. It characterizes network topology and sparsity (see also Fig. 2).
random projection
A technique used to reduce the dimensionality of a set of points which lie in Euclidean space.
simple network
A network that has no self-loops and at most 1 edge between any pair of vertices.
subgraph
A graph formed from a subset of the vertices and edges of a graph G. The subgraph must include all endpoints of the edges in the subgraph. The number of subgraphs can be counted by matrix operations on the adjacency matrix. The subgraph density is the proportion of subgraph counts relative to the number of subgraphs in a complete graph in which all pairs of vertices are connected.
undirected network
A network in which the edges have no direction, meaning there is no distinction between incoming and outgoing edges.

Contributor Information

Eva K Fischer, Department of Neurobiology, Physiology and Behavior, University of California Davis, Davis, CA 95616, USA.

Youngseok Song, Department of Statistics, West Virginia University, Morgantown, WV 26506, USA.

Wen Zhou, Department of Biostatistics, School of Global Public Health, New York University, New York, NY 10003, USA.

Kim L Hoke, Department of Biology, Colorado State University, Fort Collins, CO 80523, USA.

Supplementary Material

Supplementary material is available at Molecular Biology and Evolution online.

Author Contributions

E.K.F. and K.L.H. conceived of the study; E.K.F. collected samples and performed molecular work, gene expression mapping, transcript abundance estimation, and preliminary differential expression analyses; Y.S. and W.Z. devised and performed statistical analyses with input from E.K.F. and K.L.H.; E.K.F. and Y.S. performed data visualization; E.K.F. wrote the manuscript with contributions from all authors.

Conflict of Interest

The authors declare no conflicts of interest.

Funding

This work was supported by the National Science Foundation (United States) DDIG-1311680 (to E.K.F.), RCN IOS-1256839 (to E.K.F.), IOS-1354755 (to K.L.H.), IOS 1922701 (to W.Z.), U.S. Department of Energy (United States) DE-SC0018344 (to W.Z.), and National Institutes of Health (United States) R01GM144961 (to W.Z.).

Data Availability

Raw sequencing reads are available through the NCBI SRA repository (PRJNA601479). R code for statistical analyses is available on GitHub (https://github.com/EnigmaSong/GeneFlexibilityStudy).

References

  1. Abouheif  E, Wray  G. Evolution of the gene network underlying wing polymorphism in ants. Science. 2002:297:249–252. 10.1126/science.1071468. [DOI] [PubMed] [Google Scholar]
  2. Agterberg  J, Tang  M, Priebe  CE. Nonparametric two-sample hypothesis testing for random graphs with negative and repeated eigenvalues. arXiv, arXiv:2012.09828, preprint: not peer reviewed.
  3. Alyakin  AA, Agterberg  J, Helm  HS, Priebe  CE. Correcting a nonparametric two-sample graph hypothesis test for graphs with different numbers of vertices with applications to connectomics. Appl Netw Sci. 2024:9(1):1. 10.1007/s41109-023-00607-x. [DOI] [Google Scholar]
  4. Badyaev  AV. Evoulutionary transitions in controls reconcile adaptation with continuity of evolution. Semin Cell Dev Biol. 2018:88:36–45. 10.1016/j.semcdb.2018.05.014. [DOI] [PubMed] [Google Scholar]
  5. Ballouz  S, Verleyen  W, Gillis  J. Guidance for RNA-seq co-expression network construction and analysis: safety in numbers. Bioinformatics. 2015:31(13):2123–2130. 10.1093/bioinformatics/btv118. [DOI] [PubMed] [Google Scholar]
  6. Barson  NJ, Cable  J, Van Oosterhout  C. Population genetic analysis of microsatellite variation of guppies (Poecilia reticulata) in Trinidad and Tobago: evidence for a dynamic source-sink metapopulation structure, founder events and population bottlenecks. J Evol Biol. 2009:22(3):485–497. 10.1111/j.1420-9101.2008.01675.x. [DOI] [PubMed] [Google Scholar]
  7. Bolnick  DI, Barrett  RDH, Oke  KB, Rennison  DJ, Stuart  YE. (Non)parallel evolution. Annu Rev Ecol Evol Syst. 2018:49(1):303–330. 10.1146/annurev-ecolsys-110617-062240. [DOI] [Google Scholar]
  8. Cai  T, Liu  W, Xia  Y. Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. J Am Stat Assoc. 2013:108(501):265–277. 10.1080/01621459.2012.758041. [DOI] [Google Scholar]
  9. Cai  TT, Liu  W. Large-scale multiple testing of correlations. J Am Stat Assoc. 2016:111(513):229–240. 10.1080/01621459.2014.999157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Chang  J, Zhou  W, Zhou  WX, Wang  L. Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering. Biometrics. 2017:73(1):31–41. 10.1111/biom.12552. [DOI] [PubMed] [Google Scholar]
  11. Chateigner  A, Lesage-Descauses  MC, Rogier  O, Jorge  V, Leplé  JC, Brunaud  V, Roux  CP, Soubigou-Taconnat  L, Martin-Magniette  ML, Sanchez  L, et al.  Gene expression predictions and networks in natural populations supports the omnigenic theory. BMC Genomics. 2020:21(1):1–16. 10.1186/s12864-020-06809-2. [DOI] [Google Scholar]
  12. Chowdhury  HA, Bhattacharyya  DK, Kalita  JK. (Differential) co-expression analysis of gene expression: a survey of best practices. IEEE/ACM Trans Comput Biol Bioinform. 2020:17(4):1154–1173. 10.1109/TCBB.2019.2893170. [DOI] [PubMed] [Google Scholar]
  13. Cordero  GA, Liu  H, Wimalanathan  K, Weber  R, Quinteros  K, Janzen  FJ. Gene network variation and alternative paths to convergent evolution in turtles. Evol Dev. 2018:20(5):172–185. 10.1111/ede.12264. [DOI] [PubMed] [Google Scholar]
  14. Crawford  DL, Oleksiak  MF. The biological importance of measuring individual variation. J Exp Biol. 2007:210(9):1613–1621. 10.1242/jeb.005454. [DOI] [PubMed] [Google Scholar]
  15. Des Marais  DL, Guerrero  RF, Lasky  JR, Scarpino  SV. Topological features of a gene co-expression network predict patterns of natural diversity in environmental response. Proc R Soc Lond B Biol Sci. 2017:284(1856):20170914. 10.1098/rspb.2017.0914. [DOI] [Google Scholar]
  16. Endler  JA. Multiple-trait coevolution and environmental gradients in guppies. Trends Ecol Evol. 1995:10(1):22–29. 10.1016/S0169-5347(00)88956-9. [DOI] [PubMed] [Google Scholar]
  17. Fischer  EK, Ghalambor  CK, Hoke  KL. Can a network approach resolve how adaptive vs nonadaptive plasticity impacts evolutionary trajectories?  Integr Comp Biol. 2016a:56(5):877–888. 10.1093/icb/icw087. [DOI] [PubMed] [Google Scholar]
  18. Fischer  EK, Ghalambor  CK, Hoke  KL. Plasticity and evolution in correlated suites of traits. J Evol Biol. 2016b:29(5):991–1002. 10.1111/jeb.12839. [DOI] [PubMed] [Google Scholar]
  19. Fischer  EK, Harris  RM, Hofmann  HA, Hoke  KL. Predator exposure alters stress physiology in guppies across timescales. Horm Behav. 2014:65(2):165–172. 10.1016/j.yhbeh.2013.12.010. [DOI] [PubMed] [Google Scholar]
  20. Fischer  EK, Soares  D, Archer  KR, Ghalambor  CK, Hoke  KL. Genetically and environmentally mediated divergence in lateral line morphology in the Trinidadian guppy (Poecilia reticulata). J Exp Biol. 2013:216(Pt 16):3132–3142. 10.1242/jeb.081349. [DOI] [PubMed] [Google Scholar]
  21. Fischer  EK, Song  Y, Hughes  KA, Zhou  W, Hoke  KL. Non-parallel transcriptional divergence during parallel adaptation. Mol Ecol. 2021:30(6):1516–1530. 10.1111/mec.15823. [DOI] [PubMed] [Google Scholar]
  22. Fraser  BA, Künstner  A, Reznick  DN, Dreyer  C, Weigel  D. Population genomics of natural and experimental populations of guppies (Poecilia reticulata). Mol Ecol. 2015:24(2):389–408. 10.1111/mec.13022. [DOI] [PubMed] [Google Scholar]
  23. Friedman  DA, York  RA, Hilliard  AT, Gordon  DM. Gene expression variation in the brains of harvester ant foragers is associated with collective behavior. Commun Biol. 2020:3(1):100. 10.1038/s42003-020-0813-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Gilliam  JF, Fraser  DF, Alkins-Koo  M. Structure of a tropical stream fish community: a role for biotic interactions. Ecology. 1993:74(6):1856–1870. 10.2307/1939943. [DOI] [Google Scholar]
  25. Hahn  MW, Conant  GC, Wagner  A. Molecular evolution in large genetic networks: does connectivity equal constraint?  J Mol Evol. 2004:58(2):203–211. 10.1007/s00239-003-2544-0. [DOI] [PubMed] [Google Scholar]
  26. Hahn  MW, Kern  AD. Comparative genomics of centrality and essentiality in three eukaryotic protein-interaction networks. Mol Biol Evol. 2005:22(4):803–806. 10.1093/molbev/msi072. [DOI] [PubMed] [Google Scholar]
  27. Hämälä  T, Gorton  AJ, Moeller  DA, Tiffin  P. Pleiotropy facilitates local adaptation to distant optima in common ragweed (Ambrosia artemisiifolia). PLoS Genet. 2020:16(3):1–23. 10.1371/journal.pgen.1008707. [DOI] [Google Scholar]
  28. Handelsman  CA, Broder  ED, Dalton  CM, Ruell  EW, Myrick  CA, Reznick  DN, Ghalambor  CK. Predator-induced phenotypic plasticity in metabolism and rate of growth: rapid adaptation to a novel environment. Integr Comp Biol. 2013:53(6):975–988. 10.1093/icb/ict057. [DOI] [PubMed] [Google Scholar]
  29. Handelsman  CA, Ruell  EW, Torres-Dowdall  J, Ghalambor  CK. Phenotypic plasticity changes correlations of traits following experimental introductions of Trinidadian guppies (Poecilia reticulata). Integr Comp Biol. 2014:54(5):794–804. 10.1093/icb/icu112. [DOI] [PubMed] [Google Scholar]
  30. Hanson  D, Hu  J, Hendry  AP, Barrett  RDH. Heritable gene expression differences between lake and stream stickleback include both parallel and antiparallel components. Heredity (Edinb). 2017:119(5):339–348. 10.1038/hdy.2017.50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Harnqvist  SE, Grace  CA, Jeffares  DC. Variables influencing differences in sequence conservation in the fission yeast Schizosaccharomyces pombe. J Mol Evol. 2021:89:601–610. 10.1007/s00239-021-10028-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Hoke  KL, Adkins-Regan  E, Bass  AH, McCune  AR, Wolfner  MF. Co-opting evo-devo concepts for new insights into mechanisms of behavioural diversity. J Exp Biol. 2019:222(8):jeb190058. 10.1242/jeb.190058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Hu  J, Weber  JN, Fuess  LE, Steinel  NC, Bolnick  DI, Wang  M. A spectral framework to map QTLs affecting joint differential networks of gene co-expression. PLoS Comput Biol. 2025:21(4):e1012953. 10.1371/journal.pcbi.1012953. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Huizinga  M, Ghalambor  CK, Reznick  DN. The genetic and environmental basis of adaptive differences in shoaling behaviour among populations of Trinidadian guppies, Poecilia reticulata. J Evol Biol. 2009:22(9):1860–1866. 10.1111/j.1420-9101.2009.01799.x. [DOI] [PubMed] [Google Scholar]
  35. Jacobs  A, Carruthers  M, Yurchenko  A, Gordeeva  NV, Alekseyev  SS, Hooker  O, Leong  JS, Minkley  DR, Rondeau  EB, Koop  BF, et al.  Parallelism in eco-morphology and gene expression despite variable evolutionary and genomic backgrounds in a Holarctic fish. PLoS Genet. 2020:16(4):e1008658. 10.1371/journal.pgen.1008658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Jeong  H, Mason  SP, Barabási  AL, Oltvai  ZN. Lethality and centrality in protein networks. Nature. 2001:411(6833):41–42. 10.1038/35075138. [DOI] [PubMed] [Google Scholar]
  37. Jin  X, Chan  K, Barnett  I, Ghosh  RP. Two-sample hypothesis testing for large random graphs of unequal size. arXiv, arXiv:2402.11133, preprint: not peer reviewed.
  38. Josephs  EB, Wright  SI, Stinchcombe  JR, Schoen  DJ. The relationship between selection, network connectivity, and regulatory variation within a population of Capsella grandiflora. Genome Biol Evol. 2017:9(4):1099–1109. 10.1093/gbe/evx068. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Jovelin  R, Phillips  PC. Evolutionary rates and centrality in the yeast gene regulatory network. Genome Biol. 2009:10(4):R35. 10.1186/gb-2009-10-4-r35. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Kim  PM, Korbel  JO, Gerstein  MB. Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context. Proc Natl Acad Sci U S A. 2007:104(51):20274–20279. 10.1073/pnas.0710183104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Koubkova-Yu  TCT, Chao  JC, Leu  JY. Heterologous Hsp90 promotes phenotypic diversity through network evolution. PLoS Biol. 2018:16(11):e2006450. 10.1371/journal.pbio.2006450. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Kuo  HC, Yao  CT, Liao  BY, Weng  MP, Dong  F, Hsu  YC, Hung  CM. Weak gene–gene interaction facilitates the evolution of gene expression plasticity. BMC Biol. 2023:21(1):1–20. 10.1186/s12915-023-01558-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Langfelder  P, Horvath  S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008:9(1):1–13. 10.1186/1471-2105-9-559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Langfelder  P, Luo  R, Oldham  MC, Horvath  S. Is my network module preserved and reproducible?  PLoS Comput Biol. 2011:7(1):e1001057. 10.1371/journal.pcbi.1001057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Laporte  M, Rogers  SM, Dion-Côté  AM, Normandeau  E, Gagnaire  PA, Dalziel  AC, Chebib  J, Bernatchez  L. RAD-QTL mapping reveals both genome-level parallelism and different genetic architecture underlying the evolution of body shape in lake whitefish (Coregonus clupeaformis) species pairs. G3 (Bethesda). 2015:5(7):1481–1491. 10.1534/g3.115.019067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Li  J, Chen  SX. Two sample tests for high-dimensional covariance matrices. Ann Stat. 2012:40(2):908–940. 10.1214/12-AOS993. [DOI] [Google Scholar]
  47. Li  S, Cai  TT, Li  H. Transfer learning in large-scale Gaussian graphical models with false discovery rate control. J Am Stat Assoc. 2023:118(543):2171–2183. 10.1080/01621459.2022.2044333. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Love  MI, Huber  W, Anders  S. Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biol. 2014:15(12):1–21. 10.1186/s13059-014-0550-8. [DOI] [Google Scholar]
  49. Luisi  P, Alvarez-Ponce  D, Pybus  M, Fares  MA, Bertranpetit  J, Laayouni  H. Recent positive selection has acted on genes encoding proteins with more interactions within the whole human interactome. Genome Biol Evol. 2015:7(4):1141–1154. 10.1093/gbe/evv055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Lynch  M. The evolution of genetic networks by non-adaptive processes. Nat Rev Genet. 2007:8(10):803–813. 10.1038/nrg2192. [DOI] [PubMed] [Google Scholar]
  51. Magurran  AE. Evolutionary ecology: the Trinidadian guppy. Oxford: Oxford University Press; 2005. [Google Scholar]
  52. Mähler  N, Wang  J, Terebieniec  BK, Ingvarsson  PK, Street  NR, Hvidsten  TR. Gene co-expression network connectivity is an important determinant of selective constraint. PLoS Genet. 2017:13(4):1–33. 10.1371/journal.pgen.1006402. [DOI] [Google Scholar]
  53. Masalia  RR, Bewick  AJ, Burke  JM. Connectivity in gene coexpression networks negatively correlates with rates of molecular evolution in flowering plants. PLoS One. 2017:12(7):1–10. 10.1371/journal.pone.0182289. [DOI] [Google Scholar]
  54. Maugis  P-AG, Olhede  SC, Wolfe  PJ. Topology reveals universal features for network comparison. arXiv, arXiv:1705.05677, preprint: not peer reviewed.
  55. McGuigan  K, Sgrò  CM. Evolutionary consequences of cryptic genetic variation. Trends Ecol Evol. 2009:24(6):305–311. 10.1016/j.tree.2009.02.001. [DOI] [PubMed] [Google Scholar]
  56. Paaby  AB, Rockman  MV. Cryptic genetic variation: evolution's hidden substrate. Nat Rev Genet. 2014:15(4):247–258. 10.1038/nrg3688. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Pavličev  M, Cheverud  JM. Constraints evolve: context dependency of gene effects allows evolution of pleiotropy. Annu Rev Ecol Evol Syst. 2015:46(1):413–434. 10.1146/annurev-ecolsys-120213-091721. [DOI] [Google Scholar]
  58. Pavličev  M, Wagner  GP. A model of developmental evolution: selection, pleiotropy and compensation. Trends Ecol Evol. 2012:27(6):316–322. 10.1016/j.tree.2012.01.016. [DOI] [PubMed] [Google Scholar]
  59. Qi  M, Li  T, Zhou  W. Multivariate inference of network moments by subsampling. arXiv, arXiv:2409.01599, preprint: not peer reviewed.
  60. Qiu  T, Xu  W, Zhu  L. Two-sample test in high dimensions through random selection. Comput Stat Data Anal. 2021:160:e107218. 10.1016/j.csda.2021.107218. [DOI] [Google Scholar]
  61. Rennison  DJ, Peichel  CL. Pleiotropy facilitates parallel adaptation in sticklebacks. Mol Ecol. 2022:31(5):1476–1486. 10.1111/mec.16335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Reznick  D, Butler  MJ IV, Rodd  H. Life-history evolution in guppies. VII. The comparative ecology of high- and low-predation environments. Am Nat. 2001:157(2):126–140. 10.1086/318627. [DOI] [PubMed] [Google Scholar]
  63. Reznick  DA, Bryga  H, Endler  JA. Experimentally induced life-history evolution in a natural population. Nature. 1990:346(6282):357–359. 10.1038/346357a0. [DOI] [Google Scholar]
  64. Reznick  DN. Life history evolution in guppies (Poecilia reticulata): guppies as a model for studying the evolutionary biology of aging. Exp Gerontol. 1997:32(3):245–258. 10.1016/S0531-5565(96)00129-5. [DOI] [PubMed] [Google Scholar]
  65. Ruell  EW, Handelsman  CA, Hawkins  CL, Sofaer  HR, Ghalambor  CK, Angeloni  L. Fear, food and sexual ornamentation: plasticity of colour development in Trinidadian guppies. Proc R Soc Lond B Biol Sci. 2013:280(1758):20122019. 10.1098/rspb.2012.2019. [DOI] [Google Scholar]
  66. Shao  M, Xia  D, Zhang  Y, Wu  Q, Chen  S. Higher-order accurate two-sample network inference and network hashing. J Am Stat Assoc. 2025:00(0):1–16. 10.1080/01621459.2025.2520459. [DOI] [Google Scholar]
  67. Shojaie  A. Differential network analysis: a statistical perspective. WIREs Comput Stat. 2021:13(2):e1508. 10.1002/wics.1508. [DOI] [Google Scholar]
  68. Storey  JD. A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol. 2002:64(3):479–498. 10.1111/1467-9868.00346. [DOI] [Google Scholar]
  69. Strimmer  K. fdrtool: a versatile R package for estimating local and tail area-based false discovery rates. Bioinformatics. 2008:24(12):1461–1462. 10.1093/bioinformatics/btn209. [DOI] [PubMed] [Google Scholar]
  70. Tang  M, Athreya  A, Sussman  DL, Lyzinski  V, Park  Y, Priebe  CE. A semiparametric two-sample hypothesis testing problem for random graphs. J Comput Graph Stat. 2017:26(2):344–354. 10.1080/10618600.2016.1193505. [DOI] [Google Scholar]
  71. Tommasini  D, Fogel  BL. multiWGCNA: an R package for deep mining gene co-expression networks in multi-trait expression data. BMC Bioinformatics. 2023:24(1):1–15. 10.1186/s12859-023-05233-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Torres-Dowdall  J, Handelsman  CA, Reznick  DN, Ghalambor  CK. Local adaptation and the evolution of phenotypic plasticity in Trinidadian guppies (Poecilia reticulata). Evolution. 2012:66(11):3432–3443. 10.1111/j.1558-5646.2012.01694.x. [DOI] [PubMed] [Google Scholar]
  73. Torres-Dowdall  J, Handelsman  CA, Ruell  EW, Auer  SK, Reznick  DN, Ghalambor  CK. Fine-scale local adaptation in life histories along a continuous environmental gradient in Trinidadian guppies. Funct Ecol. 2012:26(3):616–627. 10.1111/j.1365-2435.2012.01980.x. [DOI] [Google Scholar]
  74. Wang  D, Wang  J, Jiang  Y, Liang  Y, Xu  D. BFDCA: a comprehensive tool of using Bayes factor for differential co-expression analysis. J Mol Biol. 2017:429(3):446–453. 10.1016/j.jmb.2016.10.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Wang  M, Fischer  J, Song  YS. Three-way clustering of multi-tissue multi-individual gene expression data using semi-nonnegative tensor decomposition. Ann Appl Stat. 2019:13(2):1103–1127. 10.1214/18-AOAS1228. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Wang  Z, Liao  B-Y, Zhang  J. Genomic patterns of pleiotropy and the evolution of complexity. Proc Natl Acad Sci U S A. 2010:107(42):18034–18039. 10.1073/pnas.1004666107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Warnefors  M, Kaessmann  H. Evolution of the correlation between expression divergence and protein divergence in mammals. Genome Biol Evol. 2013:5(7):1324–1335. 10.1093/gbe/evt093. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. West-Eberhard  MJ. Developmental plasticity and evolution. New York: Oxford University Press; 2003. [Google Scholar]
  79. Whitehead  A, Crawford  DL. Neutral and adaptive variation in gene expression. Proc Natl Acad Sci U S A. 2006:103(14):5425–5430. 10.1073/pnas.0507648103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Willing  EM, Bentzen  P, Van Oosterhout  C, Hoffmann  M, Cable  J, Breden  F, Weigel  D, Dreyer  C.  Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies. Mol Ecol. 2010:19(5):968–984. 10.1111/j.1365-294X.2010.04528.x. [DOI] [PubMed] [Google Scholar]
  81. Wu  T-L, Li  P. Projected tests for high-dimensional covariance matrices. J Stat Plan Inference. 2020:207:73–85. 10.1016/j.jspi.2019.11.003. [DOI] [Google Scholar]
  82. Yu  X, Li  D, Xue  L. Fisher’s combined probability test for high-dimensional covariance matrices. J Am Stat Assoc. 2022:119(545):511–524. 10.1080/01621459.2022.2126781. [DOI] [Google Scholar]
  83. Zhao  SD, Cai  TT, Li  H. Direct estimation of differential networks. Biometrika. 2014:101(2):253–268. 10.1093/biomet/asu009. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

msaf194_Supplementary_Data

Data Availability Statement

Raw sequencing reads are available through the NCBI SRA repository (PRJNA601479). R code for statistical analyses is available on GitHub (https://github.com/EnigmaSong/GeneFlexibilityStudy).


Articles from Molecular Biology and Evolution are provided here courtesy of Oxford University Press
