Plant Communications. 2025 May 13;6(7):101376. doi: 10.1016/j.xplc.2025.101376

A map of integrated cis-regulatory elements enhances gene-regulatory analysis in maize

Jasper Staut 1,2,4, Nicolás Manosalva Pérez 1,2,4, Andrés Matres Ferrando 1,2, Indeewari Dissanayake 1,2, Klaas Vandepoele 1,2,3
PMCID: PMC12281299  PMID: 40369872

Abstract

cis-regulatory elements (CREs) are non-coding DNA sequences that modulate gene expression. Their identification is essential to the study of transcriptional regulation of genes that control key traits involved in plant growth and development. CREs are also critical for the delineation of gene-regulatory networks (GRNs), which map regulatory interactions between transcription factors (TFs) and target genes. In maize, CREs have been profiled using various computational and experimental methods, but the extent to which these approaches complement each other when identifying functional CREs remains unclear. Here, we report the data-driven integration of multiple maize CRE-profiling methods to optimize the capture of experimentally confirmed TF-binding sites, resulting in a map of integrated CREs (iCREs) with improved completeness and precision. We combined these iCREs with diverse gene expression datasets generated under drought conditions to perform motif enrichment analyses and infer drought-specific GRNs. Mining these organ-specific GRNs identified both known and novel candidate regulators of maize drought responses and revealed significant overlap with drought-associated eQTL regulatory interactions. Furthermore, analysis of transposable elements (TEs) overlapping with iCREs identified several TE superfamilies with epigenetic features characteristic of regulatory DNA that potentially mediate specific TF–target gene interactions. Overall, our study showcases the utility of multi-omics data integration to generate a high-quality collection of CREs and illustrates their potential to improve the characterization of gene regulation in the complex maize genome.

Key words: maize, cis-regulatory elements, epigenetics, gene-regulatory networks, transposable elements


This study reports a map of maize cis-regulatory elements (CREs) constructed through the integration of multiple CRE profiling methods. By combining these CREs with gene-expression data, the authors inferred drought-specific gene-regulatory networks, identifying key regulators of maize drought response. Additionally, the CRE map highlights the contribution of transposable elements to gene regulation.

Introduction

The characterization of functional genomic elements, particularly cis-regulatory elements (CREs), is vital to understanding the regulation of complex gene expression programs that coordinate development and environmental adaptation in plants (Marand et al., 2023). CREs consist of short noncoding DNA sequences that are bound by transcription factors (TFs) to regulate gene expression. These sequences are also called TF-binding sites (TFBSs) or motifs. The comprehensive identification of CREs is essential to the study of gene regulation and the identification of gene-regulatory networks (GRNs), which map regulatory interactions between TFs and their target genes. GRNs have been widely used to identify molecular regulators of signaling pathways that control plant growth and stress responses (Sullivan et al., 2014; Chen et al., 2018; Gaudinier et al., 2018; Maher et al., 2018; Reynoso et al., 2019; Zander et al., 2020; Zhou et al., 2020; Clark et al., 2021; De Clercq et al., 2021; Kajala et al., 2021; Depuydt et al., 2023). In maize, the second-most-produced crop worldwide, CRE characterization and GRN construction are challenging due to its large, repeat-rich genome. The maize genome contains long intergenic regions, exhibits long-range regulatory interactions, and consists of 80% transposable elements (TEs) (Jiao et al., 2017; Li et al., 2019; Lu et al., 2019; Ricci et al., 2019). During maize evolution, TE proliferation shaped genomic structure and function; although most TEs are silenced and inactive, they can still influence gene expression when they carry CREs and insert near genes (Marand et al., 2023). In addition, TE insertions can disrupt existing CREs or alter the local epigenetic context through the spreading of repressive chromatin marks (Hirsch and Springer, 2017; Schmitz et al., 2022; Marand et al., 2023).

Both experimental and computational methods can be used to identify CREs at the genome-wide scale. Chromatin immunoprecipitation sequencing (ChIP-seq) profiles the in vivo binding of a specific TF in a specific context, making it a low-throughput method. In contrast, characterization of accessible chromatin regions (ACRs), a TF-independent method for the identification of putative CREs, can be achieved using various techniques, such as the assay for transposase-accessible chromatin sequencing (Buenrostro et al., 2015), DNase I hypersensitive-site sequencing (Boyle et al., 2008), micrococcal nuclease sequencing (Yuan et al., 2005), and MNase-defined cistrome-occupancy analysis (Savadel et al., 2021). These methods support CRE identification because regions depleted of nucleosomes allow TFs to bind to their corresponding binding sites, suggesting functional regulatory regions in specific in vivo contexts (Noll, 1974; Oudet et al., 1975; Weintraub and Groudine, 1976; Wang et al., 2011). DNA methylation is also involved in gene regulation: hypomethylated regions tend to overlap with ACRs (Eli et al., 2016; Oka et al., 2017; Ricci et al., 2019), whereas hypermethylation is usually observed in silenced genomic regions (Soppe et al., 2002). Unmethylated regions (UMRs), identified using deep whole-genome bisulfite sequencing, are often located near expressed genes and overlap with ACRs across multiple tissues. Furthermore, DNA methylation patterns in plants exhibit low variability during vegetative development (Eichten et al., 2013; Kawakatsu et al., 2016) and in response to environmental changes (Eichten and Springer, 2015; Crisp et al., 2017; Ganguly et al., 2017). As a result, UMRs offer potential for the identification of functional CREs across multiple organs, tissues, developmental stages, and environments, allowing a more complete set of regulatory elements to be profiled compared to ACR-based methods (Crisp et al., 2020).

Computational methods that characterize putative CREs primarily focus on genomic conservation profiling to identify conserved non-coding sequences (CNSs). CREs often show high sequence conservation due to purifying selection that acts on their regulatory functions (Kimura, 1991; Cooper and Brown, 2008; Haudry et al., 2013). In plants, including maize, approaches used to identify CNSs include FunTFBS (Tian et al., 2020), msa_pipeline (Wu et al., 2022), BLSSpeller (De Witte et al., 2015), and the method developed by Song et al. (2021). FunTFBS and msa_pipeline assess genome-wide conservation using chained pairwise genome alignments and evolutionary models to calculate conservation scores and identify conserved elements. BLSSpeller uses k-mers to assess the conservation of DNA motifs in the promoters of orthologous and paralogous genes, accounting for relative phylogenetic distances. The method of Song et al. (2021) uses pairwise alignments to identify extended seeds, which are locally conserved sequences designated as CNSs. The Conservatory project (Hendelman et al., 2021; https://conservatorycns.com/) uses cis-regulatory sequences and gene-based microsynteny to generate multiple alignments of upstream and downstream regulatory regions in closely related species. It then reconstructs the ancestral sequence of each CNS and searches for it in syntenic regions of more distantly related genomes.

In contrast to TF ChIP-seq profiling, assessments of chromatin accessibility, DNA methylation, and genomic conservation provide a more general, context-independent strategy to delineate a comprehensive map of CREs. Accurate CRE identification improves the characterization of regulatory DNA near individual genes and enables the development of motif-based tools to predict GRNs, thereby supporting gene-regulation analysis (Wilkins et al., 2016; Kulkarni et al., 2018; Ding et al., 2021; Ferrari et al., 2022; Manosalva Pérez et al., 2024).

Although there are diverse methods for the characterization of putative CREs in plants, it is currently unclear which is most effective at identifying TFBSs and regulatory interactions. Here, we present an evaluation of five CNS detection methods in maize and their comparison and integration with UMRs and an ACR compendium. We benchmarked these three CRE identification strategies to assess their complementarity and generated a map of integrated CREs (iCREs). We demonstrated the value of iCREs in identifying functional TFBSs by inferring iCRE-based GRNs that are specific to maize drought responses across multiple organs. These GRNs enabled the identification of diverse drought-related TFs and provided insights into the downstream processes they control as well as their organ specificity. In addition, we identified specific TE superfamilies that are enriched in iCREs, display chromatin signatures of regulatory DNA, exhibit overrepresentation of specific TFBSs, and influence gene expression.

Results

Analysis of complementarity between CREs identified using computational and experimental approaches

Although several CRE-profiling methods exist, their complementarity in predicting functional TFBSs in plants has not been evaluated. We analyzed five CNS detection methods that have been used in plants (BLSSpeller, msa_pipeline, FunTFBS, the method of Song et al. [2021], and the method of the Conservatory project), along with two experimental methods used to profile epigenetic landscapes (ACRs and UMRs), as candidate strategies for the detection of putative CREs. To evaluate and compare their detection of regulatory DNA, particularly TFBSs, we used a ChIP-seq gold standard that comprises binding events for 104 TFs in maize mesophyll cells (Tu et al., 2020). Because chromatin accessibility indicates regions that can be bound by TFs, we treated the ACR and UMR datasets included in our benchmark as alternative silver standards to assess the robustness of our findings. To construct a compendium of ACRs, we merged 13 high-quality datasets across diverse tissues (leaf, root, ear, tassel, husk, inner stem, above-ground seedling, and axillary bud; see Methods; Supplemental Figure 1). As controls, we included repetitive DNA and TEs, which are known to be depleted of CREs (Marand et al., 2023).

We found that the predictions from all tested methods were significantly enriched (p < 0.05) for the binding sites identified in the ChIP-seq dataset, whereas repeats and TEs were depleted (Supplemental Table 1), demonstrating that each method could identify TF-binding sites. We calculated the precision, recall, and F1 score of each method using base-pair intersection with the ChIP-seq dataset, which served as the gold standard. Precision measures the correctness of the predictions, whereas recall measures completeness, defined as the proportion of regions in the gold standard that were identified. F1, the harmonic mean of precision and recall, balances these two metrics. We observed notable differences in performance among the methods, particularly the CNS detection methods (Figure 1A; Supplemental Table 1). The method of Song et al. (2021) achieved the highest recall among CNS detection methods (62.9%), and its precision (13.3%) was only surpassed by the Conservatory method (22.0%). To assess the robustness of these findings, we used the ACR compendium and the UMR dataset as alternative silver standards, which produced similar patterns when predicting ACRs or UMRs (Supplemental Figure 2). Overall, the method of Song et al. (2021) was the best-performing CNS detection method as quantified by F1 score (22.0% for the ChIP-seq gold standard). However, both ACRs and UMRs achieved higher F1 than any CNS detection method (Figure 1A; Supplemental Table 1). Between these two, UMRs were more effective at predicting TF-binding sites from the gold standard (1.7% higher precision and 15.9% higher recall than ACRs). Evaluation using the ACR silver standard (with ACRs excluded from the evaluation) further confirmed that UMRs outperformed all CNS detection methods (Supplemental Figure 2A).
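The base-pair metrics described above can be illustrated with a minimal sketch. It assumes intervals on a single chromosome given as half-open `(start, end)` tuples; the function names are hypothetical and this is not the authors' implementation, which operates genome-wide.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def total_bp(intervals):
    """Total number of base pairs covered by merged intervals."""
    return sum(end - start for start, end in intervals)


def overlap_bp(a, b):
    """Base pairs shared by two merged, sorted interval lists."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted, gold):
    """Base-pair precision, recall, and F1 of predictions against a gold standard."""
    predicted, gold = merge(predicted), merge(gold)
    tp = overlap_bp(predicted, gold)
    precision = tp / total_bp(predicted) if predicted else 0.0
    recall = tp / total_bp(gold) if gold else 0.0
    f1 = f1_score(precision, recall) if tp else 0.0
    return precision, recall, f1
```

As a consistency check, `f1_score(0.133, 0.629)` evaluates to about 0.22, which matches the F1 score reported for the method of Song et al. (2021) given its precision and recall.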

Figure 1.

Benchmarking, complementarity assessment, and integration of CNS detection and experimental CRE-profiling methods.

(A) Precision–recall plot of individual CNS detection and experimental CRE-profiling methods, using a ChIP-seq dataset as the gold standard. CNS detection methods are indicated with circles, experimental CRE-profiling methods with triangles, and control datasets with squares.

(B) Precision–recall plot (using ChIP-seq as the gold standard) of individual CNS detection methods and ensembles constructed by intersecting these methods. The ensemble with the highest F1 score is shown in green.

(C) UpSet plot of ACRs, UMRs, and CNSs with performance metrics for each subset. The sizes of the full datasets are shown on the left, and the sizes of the unique subsets are shown at the top. Precision, recall, and F1 score for each subset were calculated using the ChIP-seq gold standard and are shown in a heatmap below the size bars. Subsets are sorted by precision, with the most precise subset shown on the left. Precision, recall, and F1 score were also calculated for different ensemble sets, created by starting from the subset with the highest precision (furthest left) and progressively adding individual subsets with the next-highest precision until all regions were included (furthest right). The combined subsets resulting in the “max F1 iCREs” and “all iCREs” are labeled “max F1” and “all” respectively.

(D) The genomic feature distribution of the “max F1 iCREs” and “all iCREs” datasets. Total coverage for each genomic feature is given in megabases (Mb).

Based on our results indicating that both the CNS and experimental datasets identify functional regulatory DNA, we next assessed their complementarity, starting with the CNS datasets. We quantified the base-pair overlap among the five CNS datasets and visualized them using an UpSet plot, with each column representing a unique (i.e., not overlapping with another column) subset of genomic regions (Supplemental Figure 3). We also calculated the precision, recall, and F1 score for each unique subset represented in the UpSet plot. As expected, the intersection of all five CNS detection methods exhibited the highest precision. We then evaluated whether an ensemble of CNS detection methods could outperform the best individual method. Starting with the unique subset of genomic regions with the highest precision, we iteratively added the subset with the next-highest precision until all subsets were included. Evaluation of precision, recall, and F1 score at each step showed that the ensemble approach yielded a CNS dataset with much higher precision (47.0%) than the best individual CNS dataset (22.0%), though with reduced recall (Figure 1B; Supplemental Figure 3). However, when comparing the CNSs of Song et al. (2021) to an ensemble with similar recall, they had similar performance (ensemble recall was 0.27% higher and precision was 0.00001% higher). The best CNS ensemble showed only a marginal improvement in F1 score (0.38%) over the best individual method, demonstrating that the ensemble does not substantially outperform the method of Song et al. (2021). Therefore, for simplicity, we did not continue with a CNS ensemble approach. Instead, we used the CNSs of Song et al. (2021) to assess the complementarity among CNSs, ACRs, and UMRs to guide their integration into a map of putative CREs.
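The iterative ensemble construction can be sketched as follows. The sketch assumes that each unique subset is summarized by its size and its overlap with the gold standard (both in base pairs) and that subsets are disjoint; the subset names and numbers in the usage note are invented for illustration.

```python
def build_ensembles(subsets, gold_bp):
    """Cumulatively add disjoint subsets in order of decreasing precision.

    subsets: dict mapping a subset name to (size_bp, true_positive_bp).
    gold_bp: total number of base pairs in the gold standard.
    Returns one (name, precision, recall, f1) tuple per added subset,
    describing the ensemble after that subset has been included.
    """
    ranked = sorted(
        subsets.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True
    )
    steps, size, tp = [], 0, 0
    for name, (sub_size, sub_tp) in ranked:
        size += sub_size
        tp += sub_tp
        precision = tp / size
        recall = tp / gold_bp
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        steps.append((name, precision, recall, f1))
    return steps


def best_ensemble(steps):
    """Return the subset names and metrics of the ensemble with the highest F1."""
    idx = max(range(len(steps)), key=lambda i: steps[i][3])
    return [name for name, *_ in steps[: idx + 1]], steps[idx]
```

For example, with the toy input `{"A&B": (100, 50), "A only": (300, 60), "B only": (1000, 40)}` and `gold_bp=200`, the F1 score peaks after the second subset is added and drops once the large, low-precision subset is included.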

To integrate CNSs, ACRs, and UMRs in a way that maximizes agreement with the ChIP-seq gold standard and thereby optimally captures functional CREs, we performed a complementarity analysis using the same approach as for the CNS detection methods. Unique subsets of genomic regions were defined by overlapping and subtracting the subsets identified by the different methods, and performance was calculated for each subset (Figure 1C). Each of the three methods contributed a substantial fraction of unique regions. Regions unique to UMRs showed better agreement with ChIP-seq-confirmed binding sites than those unique to ACRs or CNSs, or those where ACRs and CNSs overlap but do not coincide with a UMR. We also observed that combining methylation data with CNSs or ACRs further improved precision (up to 37%, compared with 20% when considering all UMRs; Figure 1C). This result indicates that genomic conservation, chromatin accessibility, and DNA methylation provide complementary information for CRE prediction. Finally, we integrated the CNSs, ACRs, and UMRs into a set of iCREs. As described above, we combined the unique subsets represented in the UpSet plot into ensembles by progressively adding the subset with the next-highest precision, and selected the ensemble with the highest F1 score (Figure 1C) based on the ChIP-seq gold standard. This ensemble, which we refer to as the “max F1 iCREs,” corresponds to the intersection of the UMRs and ACRs. We then constructed a second ensemble consisting of all regions identified by any of the three methods, which we refer to as “all iCREs.” Figure 1D shows the distribution of the ensembles across different genomic regions and features. As expected, the “max F1 iCREs” dataset exhibits lower genomic coverage (48 Mb) and a higher proportion of regions located in UTRs or proximal regions (19% UTR and 28% proximal), whereas the “all iCREs” dataset covers more of the genome (184 Mb) and has a lower proportion in those regions (9% UTR and 20% proximal). In summary, DNA methylation analysis shows strong agreement with ChIP-seq-confirmed TFBSs and can be complemented by chromatin-accessibility and genome-conservation information. This complementarity enables the construction of both a high-confidence (max F1 iCREs) and a more comprehensive (all iCREs) map of putative maize CREs.

Inference of drought-responsive GRNs using iCREs and an organ-specific gene expression atlas

iCREs represent a comprehensive set of maize regulatory regions and have the potential to improve GRN inference. Therefore, we designed a framework for iCRE-based motif enrichment and GRN inference starting from a set of functionally related genes, using the MINI-AC tool (Manosalva Pérez et al., 2024) (Figure 2A).

Figure 2.

iCRE-based GRN prediction and selection of drought-responsive gene sets.

(A) Overview of the iCRE-based GRN inference framework. iCREs associated with a set of functionally-related or co-expressed genes were submitted to MINI-AC for motif enrichment and GRN inference.

(B) Left: inverse cumulative distribution of the number of upregulated and downregulated genes (y axis) shared by a specific number of contrasts (x axis). Right: activation and repression consistency score distributions among the core drought-responsive genes.

(C) UpSet plot showing the numbers of shared drought-responsive genes across tissues, sorted by intersection size. Only intersections containing at least 25 genes are shown.

(D) Clustered heatmaps showing log-transformed fold changes in gene expression for the core (left) and tissue-specific (right) drought-responsive gene sets.

The frequency and severity of droughts are expected to increase given the current climate emergency, ultimately affecting maize yield. To identify candidate drought regulators and the processes they control, we selected and curated data from 20 maize drought-related RNA sequencing (RNA-seq) studies to construct a gene expression atlas of drought-responsive differentially expressed genes (DEGs; Supplemental Table 2; Supplemental Figure 4, Methods). These studies included samples from six organs: leaf, root, seminal root, primary root, kernel, and ear. We then identified marker genes for each organ that respond to drought exclusively in that organ, as well as core drought-responsive genes that consistently respond to drought across various experimental conditions, samples, and genotypes.

Core drought-responsive genes were identified by counting the number of contrasts (i.e., comparisons between control and drought samples) in which each drought-responsive gene was differentially expressed (DE). We then selected genes that were upregulated or downregulated in at least one-third of all contrasts in which any DEG was detected. This yielded 568 “core UP” genes across 15 contrasts and 1702 “core DOWN” genes across 10 contrasts (Figure 2B). Given the diversity of the studies, some genes were repressed (i.e., downregulated) in some contrasts and activated (i.e., upregulated) in others. Therefore, we defined activation and repression consistency as the proportion of contrasts in which a gene was upregulated or downregulated, respectively, out of all contrasts in which it was DE (see Methods). Although we observed high consistency scores overall, they were greater for downregulated genes (median = 0.88) than for upregulated ones (median = 0.75; p < 2.2 × 10^−16, one-sided Mann–Whitney U test). This pattern likely stems from the leaf samples, which accounted for the most contrasts and were the only organ with low activation consistency scores (Figure 2B and Supplemental Figure 5). The higher repression consistency scores may result from the drought-induced repression of photosynthetic genes (Berrío et al., 2022).
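The core-gene selection and the consistency scores can be sketched as below. The encoding of differential-expression calls (+1 for upregulated, −1 for downregulated, 0 for not DE) and the function names are assumptions made for illustration; the exact procedure is described in the Methods.

```python
def core_genes(calls, direction):
    """Genes changing in `direction` in at least one-third of informative contrasts.

    calls: dict mapping a gene to a list of per-contrast calls
           (+1 upregulated, -1 downregulated, 0 not DE); all lists share one order.
    direction: +1 for "core UP" genes, -1 for "core DOWN" genes.
    """
    n_contrasts = len(next(iter(calls.values())))
    # Keep only contrasts in which at least one gene is DE.
    informative = [
        c for c in range(n_contrasts) if any(v[c] != 0 for v in calls.values())
    ]
    selected = []
    for gene, values in calls.items():
        count = sum(values[c] == direction for c in informative)
        # Integer form of count >= len(informative) / 3, avoiding float rounding.
        if 3 * count >= len(informative):
            selected.append(gene)
    return sorted(selected)


def consistency(gene_calls, direction):
    """Fraction of a gene's DE contrasts in which it changes in `direction`."""
    de = [c for c in gene_calls if c != 0]
    return sum(c == direction for c in de) / len(de) if de else 0.0
```

Under this definition, a gene that is upregulated in three contrasts and downregulated in one has an activation consistency of 0.75 and a repression consistency of 0.25.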

Organ-specific drought-responsive genes were defined as DEGs present in more than half of the contrasts of a given organ (Figure 2C; kernel was excluded because it had a large number of DEGs and only one contrast). We evaluated the overlap between organ-specific activated and repressed genes to identify unique drought-responsive marker genes. In most organs, the majority of organ-specific DEGs were unique (Figure 2C and Supplemental Figure 6). For instance, 87% and 83% of repressed DEGs in leaf and root tissues, respectively, were unique. Clustering of core drought-responsive genes by their expression log fold change confirmed a ubiquitous drought response (repression and activation across many contrasts), whereas clustering of marker drought-responsive genes supported their organ specificity (Figure 2D).
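A minimal sketch of the organ-specific selection and the derivation of unique marker genes is given below. It ignores the direction of regulation for brevity (the analysis above treats activated and repressed genes separately), and the organ and gene names in the usage note are hypothetical.

```python
def organ_degs(contrasts_by_organ):
    """DEGs found in more than half of the contrasts of each organ.

    contrasts_by_organ: dict mapping an organ to a list of DEG sets,
                        one set per contrast.
    """
    result = {}
    for organ, contrasts in contrasts_by_organ.items():
        counts = {}
        for degs in contrasts:
            for gene in degs:
                counts[gene] = counts.get(gene, 0) + 1
        # Strictly more than half of the contrasts of this organ.
        result[organ] = {g for g, n in counts.items() if 2 * n > len(contrasts)}
    return result


def unique_markers(degs_by_organ):
    """Genes that are organ-specific DEGs in exactly one organ."""
    markers = {}
    for organ, genes in degs_by_organ.items():
        others = set().union(
            *(g for o, g in degs_by_organ.items() if o != organ)
        )
        markers[organ] = genes - others
    return markers
```

For instance, if `g2` passes the threshold in both leaf and root, it is retained as an organ-specific DEG in each organ but excluded from both marker sets.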

Next, for each drought-responsive gene set, we identified the associated iCREs and performed motif enrichment and GRN inference using MINI-AC (GRNs are provided in Supplemental Dataset 1; summary statistics are in Supplemental Table 3). GRN construction involved identifying motifs overrepresented in the iCREs associated with a gene set then linking them to the TFs that bind those motifs (MINI-AC typically performs this analysis using a set of ACRs). We evaluated whether motif enrichment using iCREs (iCRE-based enrichment) provided an advantage over using the adjacent non-coding regions and introns of drought-responsive genes (default promoter-based enrichment; see Methods for iCRE-based GRN inference). We compared these two approaches by measuring how well MINI-AC’s enrichment analysis predicts the motifs bound by a gold standard set of TFs, defined as the DE TFs in each drought-responsive gene set. In other words, we evaluated whether the motifs associated with iCREs of drought-responsive DEGs were significantly enriched for those bound by drought-responsive TFs. In addition, we compared the “all iCREs” and “max F1 iCREs” sets (Figure 3A; Supplemental Table 4). Our results showed that iCRE-based motif enrichment outperformed promoter-based enrichment in nine of the 12 gene sets for both metrics tested: F1 score and area under the precision-recall curve (AUPRC, which measures the performance of top-scoring predictions; see Methods for details on iCRE-based GRN inference). The iCRE-based approach achieved a median F1 score 10% higher and a median AUPRC 6% higher than the promoter-based approach. Although the “max F1” and “all iCRE” sets showed similar performance, the latter was superior for most drought-responsive gene sets; the median AUPRCs were 0.049 and 0.065, respectively, and the median F1 scores were 0.073 and 0.1, respectively.

Figure 3.

Benchmarking and analysis of iCRE-based GRNs for drought-responsive genes.

(A) Comparison of the performance of the promoter-based and iCRE-based GRN inference frameworks, using two iCRE sets (“max F1” and “all”). The two performance metrics (F1 and AUPRC scores) measure the ability of each method to predict differentially expressed (DE) regulators given a set of tissue-specific drought-responsive genes (y axis). Performance differences are shown relative to the promoter-based results as the baseline (x axis).

(B) Clustered heatmap showing the motif-enrichment ranks of DE transcription factors (TFs) (y axis) based on sets of iCREs associated with tissue-specific (x axis) upregulated drought-responsive genes. Row annotations indicate activation and repression consistency score bins and the TF families. Only TFs with an expression consistency score greater than 0.513 are shown.

(C) Visualization of the predicted regulation of aaap53 by the TF ereb131. The “Genes” track shows annotated genes: exons are shown as rectangles (with coding exons having greater height), and introns are shown as lines. The “TF-binding motif” track shows the location of the ereb131 binding site M06676. The “eQTL lead SNP” track shows the eQTL lead SNP linked to the expression of the trait-associated gene aaap53. The “iCREs” track shows UMR, ACR, and CNS regions from the iCRE dataset.

To further assess the advantages of using iCREs for GRN inference and to validate our drought-responsive GRNs, we used a set of publicly available drought-specific eQTLs that link single-nucleotide polymorphism (SNP) variants to differential expression of trait-associated genes (Liu et al., 2020; see Methods). Using the lead SNPs of these drought-specific eQTLs (n = 19 154), we found a 22-fold enrichment of SNPs within the non-coding genome for the “all iCREs” set (p < 0.001), indicating a strong tendency for these SNPs to be located in iCREs. Furthermore, the “all iCREs” set removes 90% of genome-wide motif occurrences identified by MINI-AC, but only 25% of the drought-specific eQTL SNPs. This validates the assumption that using iCREs primarily removes false-positive motif matches, thereby improving the quality of the resulting GRNs.

To evaluate the drought-responsive GRNs against the experimental drought eQTL data, we constructed an eQTL-based GRN. We used the expression trait genes as target genes and linked them to a TF if the lead SNP was located within a binding motif of that TF. We then compared the number of overlapping edges between the eQTL-based GRN and the core drought-responsive GRN generated by MINI-AC (summary statistics in Supplemental Table 3) against a set of background GRNs constructed by shuffling the edges of the MINI-AC GRN. This analysis showed that the promoter-based MINI-AC GRN had an 8% higher overlap with the eQTL GRN (35 870 edges) than the median overlap with the background GRNs (33 145 edges). This difference increased to 20% (13 491 observed vs. 11 292 background) for the “all iCREs” set and 31% (8287 observed vs. 6382 background) for the max F1 iCREs set (p < 0.001 for all overlaps). These results further support the conclusion that the iCRE-driven reduction in false-positive motif matches improves the quality of the drought-responsive GRNs.
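The comparison against shuffled background networks can be sketched as follows. The shuffling scheme used here (permuting target genes across edges while keeping the TF of each edge fixed) is an assumption, as are the function names and the empirical p value with a pseudocount; the text above does not specify these details.

```python
import random
from statistics import median


def shuffled_overlap(grn_edges, reference_edges, n_shuffles=1000, seed=0):
    """Compare edge overlap with a reference GRN against shuffled backgrounds.

    grn_edges, reference_edges: lists of (tf, target) tuples.
    Returns (observed overlap, median background overlap, empirical p value).
    """
    rng = random.Random(seed)
    reference = set(reference_edges)
    observed = len(set(grn_edges) & reference)
    tfs = [tf for tf, _ in grn_edges]
    targets = [target for _, target in grn_edges]
    background = []
    for _ in range(n_shuffles):
        # Assumed scheme: permute targets, keep the TF of each edge.
        rng.shuffle(targets)
        background.append(len(set(zip(tfs, targets)) & reference))
    # One-sided empirical p value with a +1 pseudocount.
    p_value = (1 + sum(b >= observed for b in background)) / (n_shuffles + 1)
    return observed, median(background), p_value


def percent_increase(observed, background):
    """Relative increase of the observed overlap over the background, in percent."""
    return 100 * (observed - background) / background
```

Applied to the promoter-based counts reported above, `percent_increase(35870, 33145)` gives roughly 8%.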

We then analyzed the drought-responsive GRNs using the “all iCREs” set and the marker gene sets comprising organ-specific drought DEGs. Using MINI-AC, motifs were scored based on their overrepresentation in the iCREs of the DEGs to calculate a motif-enrichment rank. For each organ, evaluating these ranks alongside the activation and repression consistency allowed us to identify specific drought stress regulators (Figure 3B and Supplemental Figure 7). We observed high expression and motif-enrichment specificity for WRKY TFs in root cell types, especially in seminal root (WRKY57, WRKY118, WRKY108, and WRKY49). Several WRKY TFs have been characterized for their roles in maize drought response (Wang et al., 2018; Zhao et al., 2021) and have been associated with increased lateral root formation (Gulzar et al., 2021). In leaf tissue, we observed high expression specificity for Trihelix TFs (Trihelix19 and Trihelix12) and a low motif-enrichment rank that was not leaf-specific. The Arabidopsis ortholog of these TFs, GT1, has been characterized as a repressor of cell growth (Breuer et al., 2012; Caro et al., 2012) and a regulator of drought tolerance (Yoo et al., 2010), consistent with the growth arrest observed under drought stress (Dubois and Inzé, 2020). bZIP95 also showed high expression and motif-enrichment specificity in leaf tissue, suggesting an important role in the leaf drought response (Figure 3B). In leaf tissue, we found an expected repression of photosynthesis- and carbon-metabolism-related genes under the control of repressed TFs (Supplemental Figure 8; Supplemental Dataset 2). Among these, we identified Arabidopsis thaliana orthologs (PIL6, MYBH, and CRF5) that have previously been characterized in light signaling and cell expansion (Fujimori et al., 2004; Lu et al., 2014; Raines et al., 2016). Conversely, Trihelix19, Trihelix12, and bZIP95 control genes involved in the responses to abscisic acid and water deprivation, further supporting their potential role in maize leaf drought response. In Figure 3C, we present an example locus for one of the predicted drought regulators shown in Figure 3B. The TF ereb131 is predicted by MINI-AC to regulate aaap53, a target gene with differential expression under drought stress. As shown in Figure 3C, a binding motif for ereb131 is located in the UTR of aaap53, within an iCRE identified by UMR, ACR, and CNS. Furthermore, the lead SNP of an eQTL associated with aaap53 expression is located within this binding motif, which provides experimental support for the predicted interaction. The ereb131 TF also has an A. thaliana ortholog, RAP2.6L, which has a role in drought response (Krishnaswamy et al., 2011), making this maize TF an interesting candidate for further study. Overall, these results illustrate the potential of iCREs to infer context-specific GRNs and identify candidate regulators.

iCRE-guided characterization of the regulatory role of different types of TEs

Although numerous examples of TEs influencing gene expression have been reported in plants (Hirsch and Springer, 2017; Noshay et al., 2021; Deneweth et al., 2022), the effect of specific TE types on regulatory elements is less clear. Therefore, we identified a subset of TEs fully contained within the iCREs and examined their potential regulatory roles across superfamilies. In the maize genome, TE superfamilies and repetitive elements are classified according to sequence similarity and transposition specificity (Wicker et al., 2007) (see Methods for TE analysis). We assessed whether specific TE superfamilies or repetitive elements were significantly overrepresented in the iCREs. Although the full set of TEs was depleted in the iCREs, the Tc1/Mariner, PIF/Harbinger, and hAT superfamilies, all members of the terminal inverted repeat (TIR) TE order, showed high enrichment within iCREs (3.5-, 2.9-, and 2.3-fold, respectively; Figure 4A). In addition, the Helitron superfamily and two long interspersed nuclear element (LINE) superfamilies showed 1.4- to 2.2-fold enrichment. Using non-coding, non-TE genomic regions as controls confirmed that the iCREs are depleted of TEs except for these six superfamilies. In addition, a Gene Ontology (GO) enrichment analysis revealed that the TEs from the TIR superfamilies located within iCREs are most significantly associated with specific groups of functionally related genes (Figure 4B). Some of the most enriched GO terms were associated with responses to biotic and abiotic stresses such as salt, cold, bacteria, and fungi (q values between 5.5 × 10^−3 and 2.4 × 10^−8; enrichment folds between 1.3 and 1.4) along with developmental functions such as regulation of flower development and chloroplast organization. Although genes associated with other TE superfamilies from the LTR and LINE orders also yielded enrichment for certain GO terms, mostly related to carbohydrate metabolism (Supplemental Table 5), Figure 4B depicts only the most significantly enriched GO terms (those with q value <0.008 and >100 associated genes).

Figure 4.

Analysis of the regulatory role of TEs in maize using iCREs.

(A) Bar plot showing the fold enrichment (x axis) and q values of TE superfamilies and repetitive element types (y axis) in the iCREs, calculated relative to an empirical background distribution of genomic regions (see Methods for enrichment statistics).

(B) Clustered heatmap of −log10(q values) for Gene Ontology (GO) enrichment (y axis), with q values calculated using a hypergeometric test, for genes associated with TEs from three terminal inverted repeat (TIR) superfamilies located within iCREs (x axis). Row annotations indicate the number of TE-associated genes in the corresponding GO category. The heatmap includes only TE superfamily–GO pairs with a q value <0.008 and more than 100 TE-associated genes in the GO category.

(C) Chromatin state enrichment (y axis) of different sets of genomic regions (x axis), including all TEs and repetitive elements, iCREs, and TEs within and outside iCREs, calculated relative to an empirical background distribution (see Methods for enrichment statistics). The adjacent tables indicate the chromatin marks and genomic locations associated with each chromatin state.

(D) Boxplots of gene expression distributions in transcripts per million (y axis) for genes associated with TEs within and outside iCREs, binned by distance to the corresponding genes. For visualization, a pseudo-count of 10⁻⁵ was added to zero values. p values were calculated using a one-sided Mann–Whitney U test (****p < 0.0001). Sample sizes, medians, and p values are provided in Supplemental Table 6.

(E) Bubble plot showing the motif-enrichment rank and π value (−log10(p value) × enrichment fold), calculated using MINI-AC, for different TF families with motifs enriched in TEs within iCREs that are associated with genes annotated with “defense response to bacterium,” “regulation of flower development,” or “response to salt stress,” and that belong to the TE superfamilies Tc1/Mariner or PIF/Harbinger. Only families that have at least one motif with a motif-enrichment rank of 30 or less are shown.

Next, to investigate whether the epigenetic profiles of the TEs within iCREs differed from those of TEs outside iCREs, we compared their overlap with various chromatin states. Chromatin states are defined by specific combinations of epigenetic features, including histone marks, histone variants, DNA methylation, ACRs, and binding sites of chromatin-associated factors (Liu et al., 2018). We observed that TEs within iCREs exhibited a chromatin state enrichment pattern similar to that of the iCREs themselves, which indicates that they tend to be located in promoter, intronic, UTR, and intergenic regions (Figure 4C). In contrast, the full set of TEs and the set of TEs outside iCREs displayed the opposite pattern, with enrichment in chromatin states associated with repetitive, DNA-methylated regions. Furthermore, we examined the chromatin state enrichment patterns by TE superfamily and repetitive element type both inside and outside iCREs. We found that the Helitron and TIR TEs within iCREs, particularly Tc1/Mariner, CACTA, PIF/Harbinger, Mutator, and hAT, had the greatest similarity to the chromatin state enrichment pattern of the iCREs (Supplemental Figure 9), which suggests that they represent functional regulatory DNA.

In addition to examining the epigenetic profiles of the TEs within iCREs, we analyzed the expression levels of nearby genes. We first evaluated whether the distribution of the distances to the closest gene differed between TEs that overlap with iCREs and those that do not. TEs within iCREs were generally closer to genes than those outside iCREs (Supplemental Figure 10; median distance 7790 bp within iCREs vs. 24 705 bp outside iCREs; p < 2.2 × 10⁻¹⁶, one-sided Mann–Whitney U test). Given this observation, we used a maize drought gene expression atlas to compare the expression of genes associated with TEs located within iCREs, outside iCREs, or both. In both control and drought conditions, the median expression of genes associated with TEs in iCREs was significantly higher than that of genes associated with TEs outside iCREs when the TEs were either within 2000 bp of the genes or within the gene body (i.e., in introns or UTRs). The median transcripts per million (TPM) was 5.4 (within 2000 bp) and 5.9 (within gene bodies) for TEs in iCREs, compared with 0.9 and 0.9, respectively, for TEs outside iCREs (p < 0.0001, one-sided Mann–Whitney U test; Figure 4D; see statistics in Supplemental Table 6). This trend was also observed for the expression levels in both control and drought samples when considering only genes associated with Tc1/Mariner and PIF/Harbinger TEs, which were enriched in chromatin states typical of regulatory DNA (Supplemental Table 6).

To further investigate the regulatory function of the TEs within iCREs, we analyzed a public self-transcribing-active-regulatory-region-sequencing (STARR-seq) dataset (Ricci et al., 2019) that quantifies the enhancer activity of maize genomic DNA fragments. The Tc1/Mariner, CACTA, PIF/Harbinger, Mutator, and hAT superfamilies showed significantly higher enhancer activity within iCREs compared to outside iCREs (p < 2.2 × 10⁻¹⁶; Supplemental Table 7). For example, the CACTA TIR superfamily showed a median enhancer activity score of 0.205 in iCREs and 0 outside iCREs. These results show that TEs from these superfamilies, when located within iCREs, are not only associated with epigenetic features of regulatory DNA but also display increased regulatory activity in vivo.

Finally, we performed a motif analysis using PIF/Harbinger and Tc1/Mariner TEs that fully overlapped with iCREs and were associated with genes functionally annotated with “response to salt stress,” “defense response to bacterium,” or “regulation of flower development” (the most-enriched GO categories; Figure 4E). Motifs for MYB TFs and high-mobility-group (HMG) TFs were enriched across all test sets. MYB motifs were particularly enriched within PIF/Harbinger TEs associated with flower development genes, and HMG motifs were enriched within Tc1/Mariner TEs associated with salt stress response genes. In the latter, SBP family motifs also showed strong enrichment. Motifs for the bHLH TF family were highly enriched within PIF/Harbinger TEs associated with flower development genes. In Tc1/Mariner TEs associated with bacterial response genes, WRKY motifs were highly enriched, as expected (Chen et al., 2019), but this was not observed for the PIF/Harbinger TEs. We then investigated whether ChIP-seq binding sites identified in vivo (Tu et al., 2020) for these TF families were overrepresented within the PIF/Harbinger and Tc1/Mariner TE superfamilies (Supplemental Table 8). We found that MYB family motifs had an enrichment π value (see Methods for TE analysis) of 35.5 within PIF/Harbinger TEs located within iCREs, compared with 6.9 outside iCREs. A similar trend was also observed for bHLH family motifs (31.5 within iCREs vs. 6.5 outside). For the Tc1/Mariner superfamily, MYB and WRKY family motifs had π values of 22.9 and 19.8, respectively, within iCREs, compared with 4.0 and 3.9 outside iCREs. These observations indicate that specific TE superfamilies within iCREs harbor functional binding sites targeted by specific TFs.

Overall, these results highlight the potential regulatory role of specific TE superfamilies in maize. The overrepresentation of TFBSs within TEs from these superfamilies, along with the enrichment of genes associated with specific biological processes, suggests that the regulatory wiring between these TF families and their target genes is partly or entirely mediated by specific TEs.

Discussion

The integration of datasets from diverse regulatory DNA profiling methods in plants holds significant potential to enhance our understanding of gene regulation. CRE-profiling methods such as ChIP-seq detect direct TF binding but are relatively low throughput. In contrast, profiling chromatin accessibility, DNA methylation, or genomic conservation offers a more efficient way to generate a comprehensive map of CREs. Previous studies have focused on concordance among ChIP-seq, CNSs, ACRs, and UMRs, but their complementarity and integrative potential have largely been unexplored (Ricci et al., 2019; Crisp et al., 2020; Song et al., 2021; Marand et al., 2023). We demonstrated that regions identified by multiple methods are more likely to be validated by ChIP-seq and that each method identifies a substantial number of unique regions, indicating their complementarity. Although genomic conservation increases confidence in a region’s regulatory role, our benchmark demonstrates that it is less reliable than methylation or accessibility profiling. Indeed, CNSs are prone to both false positives, e.g., conserved non-coding regulatory RNAs, and false negatives, e.g., TFBSs that are evolutionarily recent (and therefore lack conservation) or located in TEs (which are usually masked in sequence conservation analyses) (Siepel et al., 2005; Haudry et al., 2013; Van de Velde et al., 2014). Among the CNS methods evaluated, the method of Song et al. (2021) provided the best performance. By covering a smaller evolutionary distance than other methods (only species within the Andropogoneae tribe), this method may detect more evolutionarily recent TFBSs, enhancing recall while maintaining precision. Subsequently, we found chromatin accessibility to be a better approach to identify functional CREs than CNS methods. However, ACRs may miss TFBSs of pioneer TFs that do not require chromatin accessibility to perform their function (Lai et al., 2021; Strader et al., 2022).

Organ and tissue specificity in our datasets should be taken into account when interpreting our results. Our benchmark identifies UMRs as the most informative method for predicting TFBSs. However, the superior performance of UMRs may be due to their profiling in maize leaf tissue, which was also used for the ChIP-seq gold standard. Although UMRs are considered relatively stable across organs and tissues, some degree of specificity remains, potentially leading to an overestimation of their predictive performance (Crisp et al., 2020). In contrast, the ACR datasets we compiled integrate data from multiple organs and tissues and are therefore not expected to exhibit strong tissue specificity. Notably, a single-cell assay for transposase-accessible chromatin sequencing dataset was included (Marand et al., 2021), representing a high-resolution set of ACRs across multiple organs and cell types. However, because the ACR compendium does not include every cell type, developmental stage, or condition, the dataset likely still exhibits some context-specific biases (e.g., missing CREs associated with specific in vivo contexts). In contrast, genomic conservation is inherently context independent. The leaf specificity of the gold-standard ChIP-seq experiments may have resulted in an underestimation of CNS performance when assessing all CREs across the maize genome. Supporting this hypothesis, the additional benchmark using the less organ-specific UMR silver standard showed that CNSs outperformed ACRs. This indicates that CNSs are not necessarily less informative than ACRs when evaluated in a more context-independent framework.

Assembling the iCREs enabled the extension of MINI-AC’s functionality to infer context-specific GRNs using the iCREs and a list of genes. Current state-of-the-art GRN inference methods focus heavily on single-cell (multi-)omics datasets because of their unprecedented resolution (Ferrari et al., 2022; Jiang et al., 2022; Bravo González-Blas et al., 2023; Fleck et al., 2023; Kamimoto et al., 2023). However, these protocols are not yet optimized for all model species and experimental conditions, and they are also expensive, which can make them suboptimal or unnecessary for certain biological questions. Conversely, GRN inference from a specific list of genes can identify candidate regulators in a simple and straightforward manner. This approach is supported by tools such as TF2Network, TDTHub, ConnecTF, and PlantRegMap (Kulkarni et al., 2018; Tian et al., 2020; Brooks et al., 2021; Grau and Franco-Zorrilla, 2022). However, these tools test motif enrichment using hypergeometric or Fisher’s exact tests on TFBSs within predefined genomic windows adjacent to genes (e.g., upstream promoter regions), which fails to capture distal CREs. In contrast, our iCRE-based framework evaluates motif enrichment across the entire non-coding genome and reduces potential false-positive TFBSs, thereby improving regulator prediction. The “max F1” iCRE set, which represents the overlap between the UMRs and ACRs, had the highest agreement (F1 score) with the leaf ChIP-seq gold standard. The “all iCREs” set, comprising all UMRs, ACRs, and CNSs, showed lower agreement with the gold standard. However, when used to infer GRNs, it marginally improved the prediction of enriched binding sites for DE regulators among a given set of input genes. This is not unexpected, as the gold standard is derived from mesophyll cells under normal conditions, and the predicted GRNs reflect drought responses across multiple organs. 
Therefore, elements unique to the “all iCREs” set may include additional CREs not captured by the mesophyll-derived ChIP-seq gold standard. Integration with a leaf eQTL dataset showed that the “all iCREs” dataset was highly enriched for lead SNPs of drought-specific eQTLs, which supports the ability of iCREs to demarcate regulatory DNA. Furthermore, the drought-responsive GRNs, when constructed using DE genes and the “all iCREs” dataset, showed significant overlap with the eQTL-based GRN, thereby enhancing the identification of novel regulators and target genes involved in organ-specific transcriptional regulation.

The regulatory role of PIF/Harbinger TEs identified in our analysis is consistent with findings in Brassica oleracea. In this species, insertion of a TE from this superfamily altered the expression of a TF responsible for purple traits in cultivars. Reporter assays showed that promoter sequences containing this insertion induced higher expression than promoter sequences without it (Li et al., 2024). These results are in line with our observation that genes near TEs located within iCREs have higher expression levels than those near TEs located only outside iCREs. This indicates that these TEs promote, rather than repress, gene expression, consistent with a previous study that demonstrated an important role for PIF/Harbinger-driven enhancers in the regulation of husk development (Fagny et al., 2020). The role of TEs in stress-related GRNs has previously been assessed in A. thaliana and tomato; as in the current study, TF motifs were found to be enriched in TEs from different superfamilies, e.g., WRKY TF motifs in Mariner TEs (Deneweth et al., 2022). In maize, the contribution of TEs to stress response has also been studied (Le et al., 2014; Makarevitch et al., 2015; Roquis et al., 2021), along with their potential regulatory roles within ACRs (Zhao et al., 2018; Noshay et al., 2021). Our findings support previously proposed models in which TEs contribute to GRN rewiring (Britten and Davidson, 1971; Feschotte, 2008; Rebollo et al., 2012). This phenomenon has been observed in animals and several plant species (Wang et al., 2007; Kunarso et al., 2010; Lynch et al., 2011; Batista et al., 2019), and our results suggest that TIR TEs play a similar role in the wiring of maize stress-response networks.

In our study, the assumption that each iCRE regulates its nearest gene fails to account for long-range regulatory interactions or cases in which one iCRE regulates more than one gene. The generation and integration of high-quality Hi-C (Lieberman-Aiden et al., 2009) and chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) (Fullwood et al., 2009) datasets for maize under diverse conditions would resolve these limitations. However, in maize, these datasets are currently only available for a few conditions. Their predictive utility in other organs, tissues, conditions, or genotypes is limited, because long-range interactions can be tissue specific (Marand et al., 2021). The development of a context-independent gold standard of CREs in maize with minimal false positives would also improve the data-driven optimization and integration of iCREs. As demonstrated by our drought eQTL data, iCREs can be integrated with datasets that associate natural genetic variations with phenotypic outcomes to investigate the role of non-coding variants within TFBSs and GRNs in determining complex traits.

In summary, our study demonstrates the potential of integrating data derived from multiple methods to advance regulatory genomic analysis across biological contexts. By combining epigenetic and genomic data, we generated a comprehensive set of maize CREs, thereby supporting the maize research community and highlighting the complexity of regulatory DNA in complex plant genomes.

Methods

CNS inference

The genomic coordinates of CNSs from Song et al. (2021) were obtained from https://genome.cshlp.org/content/suppl/2021/06/22/gr.266528.120.DC1. The pairwise CNS coordinate files were concatenated and merged using bedtools merge (bedtools version 2.2.28) (Quinlan and Hall, 2010). Coordinates were then converted from version 4 of the maize B73 reference genome to version 5 using the UCSC liftOver tool (Hinrichs et al., 2006) with default parameters and the version 4-to-5 chain file from MaizeGDB (www.maizegdb.org) (Woodhouse et al., 2021). Genome-wide maize CNSs predicted by FunTFBS were downloaded from the PlantRegMap portal (http://plantregmap.gao-lab.org) (Tian et al., 2020) and converted from version 3 to version 4 using the Ensembl Plants Assembly Converter tool for maize (Martin et al., 2023) with default parameters, then to version 5 using liftOver as described above. CNSs predicted by the Conservatory project (Hendelman et al., 2021) were downloaded as version 5 coordinates from https://conservatorycns.com/dist/pages/conservatory/index.php.

BLSSpeller (De Witte et al., 2015) and msa_pipeline (Wu et al., 2022) were run using a species set from the PACMAD clade: Cenchrus purpureus, Oropetium thomaeum, Saccharum spontaneum, Setaria italica, Sorghum bicolor, Zea mays, and Zoysia japonica ssp. nagarizaki. Orthologous groups and a species tree were generated using OrthoFinder (Emms and Kelly, 2019) version 2.5.4 with default settings and proteome files consisting of proteins translated from the longest transcript for each gene, retrieved from PLAZA version 5.0 (Van Bel et al., 2022). The orthologous groups were filtered to retain only those with genes from at least two species, including maize B73. Genome FASTA files used by BLSSpeller and msa_pipeline were also retrieved from PLAZA version 5.0. Repeats were masked using the k-mer-based approach described by Song et al. (2021), using KAT (Mapleson et al., 2017). BLSSpeller was used with a search space for each gene extending from 2 kb upstream of the translation start site to 1 kb downstream of the translation end site, including the introns. Coding sequences were subsequently masked. BLSSpeller (version 1.1; source code: https://bitbucket.org/dries_decap/bls-speller-spark) was used in alignment-free mode on the upstream, downstream, and intronic regions independently, using an 8-bp motif length, the full IUPAC alphabet of degenerate nucleotides, and allowing up to three degenerate characters in each motif. The confidence score cutoff was 85, the minimum number of conserved orthologous groups was 25, and the range of BLS thresholds was [50, 60, 70, 75, 80, 85, 90, 92, 95, 97, 99]. msa_pipeline was used as described by Wu et al. (2022) (available at https://bitbucket.org/bucklerlab/msa_pipeline), using the relaxed LAST parameter set. The initial LAST step using lastdb was run using -u RY32 to reduce runtime. 
Pairwise alignments were chained and netted (Kent et al., 2003), followed by multiple genome alignment using ROAST (https://www.bx.psu.edu/~cathy/toast-roast.tmp/README.toast-roast.html), with Z. mays as the reference species. For polyploid comparator species (Triticum aestivum, Triticum turgidum, and S. spontaneum), each subgenome was treated as a separate species. An adaptation of the conservation analysis procedure from msa_pipeline was used, in which conserved elements per chromosome were identified using GERP++ (Davydov et al., 2010) with default parameters. Conservation scores were generated using gerpcol, and conserved elements were identified using gerpelem.

Experimental regulatory datasets

ChIP-seq summit data for 104 TFs in maize were obtained from Tu et al. (2020). Peak regions were defined as 10 bp upstream and downstream of each summit, totaling 21 bp. The peak regions of all individual TFs were concatenated and merged using bedtools merge. The genomic coordinates of UMRs were downloaded from Supplemental Dataset 3 of Crisp et al. (2020). Publicly available ACR datasets were collected from GEO and PlantCADB (https://bioinfor.nefu.edu.cn/PlantCADB/; Ding et al., 2022). Thirteen high-quality ACR datasets were selected by excluding negative-control experiments (naked DNA) and datasets that, similar to negative-control samples, showed a high fraction (>70%; Supplemental Figure 1) of distal regions (>2 kb from any gene), which indicates low quality. Processed BED files were obtained from GEO for GSE120304 (leaf and ear) (Ricci et al., 2019) and GSE155178 (the provided peak file was converted to BED) (Marand et al., 2021). Additional datasets—GSE128434 (Lu et al., 2019), GSE85203 (Lu et al., 2017), GSE94291 (Oka et al., 2017), GSE97369 (Burgess et al., 2019), PRJNA382414 (Zhao et al., 2018), PRJNA391551 (Dong et al., 2017), PRJNA518749 (Tu et al., 2020), and PRJNA599454 (Sun et al., 2020)—were downloaded from PlantCADB (Ding et al., 2022). Replicates for each experiment were combined using bedtools intersect. All ACR datasets were concatenated and merged into a single cross-tissue dataset using bedtools merge. All coordinates were converted from version 4 to version 5 of the maize B73 reference genome using the UCSC liftOver tool with default parameters and the version 4-to-5 chain file from MaizeGDB (www.maizegdb.org).
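The summit-to-peak step can be illustrated with a minimal Python sketch. It assumes 0-based summit positions and half-open intervals; the function names and toy coordinates are hypothetical, and the merge only approximates the default behavior of bedtools merge (overlapping and book-ended intervals are joined).

```python
def summit_windows(summits, flank=10):
    """Turn (chrom, summit) pairs into half-open windows of 2 * flank + 1 bp."""
    return [(chrom, max(0, pos - flank), pos + flank + 1) for chrom, pos in summits]


def merge_intervals(intervals):
    """Merge overlapping or book-ended intervals per chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy summits pooled from several hypothetical TFs
summits = [("chr1", 100), ("chr1", 115), ("chr1", 500), ("chr2", 100)]
windows = summit_windows(summits)
peaks = merge_intervals(windows)
print(peaks)
```

In this toy input the first two windows overlap and collapse into a single 36-bp region, whereas isolated summits keep their 21-bp width.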

Performance evaluation

Comparisons between query and reference datasets containing genomic coordinates were performed by calculating precision, recall, and F1 score as follows:

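$$\mathrm{Precision} = \frac{\text{size of overlap between query and reference (bp)}}{\text{size of query (bp)}}$$
$$\mathrm{Recall} = \frac{\text{size of overlap between query and reference (bp)}}{\text{size of reference (bp)}}$$
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$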

Performance was calculated using only the non-coding regions of the genome (as defined by the GFF file from PLAZA Monocots 5.0). First, query and reference regions were intersected with the non-coding genome using bedtools intersect, after which the above formulas were applied.
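As an illustration of these base-pair-level metrics, the following Python sketch computes precision, recall, and F1 for two toy interval sets on a single chromosome. It is not the pipeline used here (which relied on bedtools); the function names and coordinates are hypothetical, intervals are assumed to be half-open, and both sets are assumed to be already restricted to the non-coding genome.

```python
def merged(intervals):
    """Sort and merge overlapping intervals so no base pair is counted twice."""
    out = []
    for start, end in sorted(intervals):
        if out and start <= out[-1][1]:
            out[-1][1] = max(out[-1][1], end)
        else:
            out.append([start, end])
    return out


def total_bp(intervals):
    return sum(end - start for start, end in merged(intervals))


def overlap_bp(a, b):
    """Number of base pairs shared by two interval sets (two-pointer sweep)."""
    a, b = merged(a), merged(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def precision_recall_f1(query, reference):
    shared = overlap_bp(query, reference)
    q_size, r_size = total_bp(query), total_bp(reference)
    precision = shared / q_size if q_size else 0.0
    recall = shared / r_size if r_size else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


query = [(0, 100), (200, 300)]                      # e.g., candidate CREs
reference = [(50, 150), (250, 260), (400, 450)]     # e.g., ChIP-seq peaks
precision, recall, f1 = precision_recall_f1(query, reference)
print(precision, recall, f1)
```

Here 60 bp are shared between a 200-bp query and a 160-bp reference, giving a precision of 0.3 and a recall of 0.375.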

Calculation of enrichment between sets of genomic regions

Enrichment of a query set of genomic regions (e.g., CNSs) in a target set of genomic regions (e.g., ChIP-seq peaks) was tested by shuffling the target set 1000 times across either the maize non-coding genome or the full genome. The observed overlap (real overlap) between the query and target sets was compared with the overlap between the query and shuffled sets (overlap expected by chance). A p value was calculated as the proportion of the 1000 shuffled sets with an overlap higher than the real overlap. Fold enrichment was calculated as the real overlap divided by the median overlap expected by chance (median overlap for the 1000 shuffled sets).
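The shuffling test can be sketched in Python as follows. This is a toy illustration under stated assumptions: a single linear "genome" of 10 000 bp, random placement of each target interval without any exclusion rules, and hypothetical function names. The handling of a zero median (returning infinity) is also an assumption, not something specified above.

```python
import random
from statistics import median


def empirical_enrichment(real_overlap, shuffled_overlaps):
    """p value: share of shuffles exceeding the real overlap; fold: real / median."""
    higher = sum(1 for x in shuffled_overlaps if x > real_overlap)
    p_value = higher / len(shuffled_overlaps)
    expected = median(shuffled_overlaps)
    fold = real_overlap / expected if expected > 0 else float("inf")
    return p_value, fold


def covered_bp(intervals):
    """Set of positions covered by half-open intervals (fine for toy sizes only)."""
    return {pos for start, end in intervals for pos in range(start, end)}


def shuffle_once(lengths, genome_size, rng):
    """Place each interval at a random start, keeping its length."""
    out = []
    for length in lengths:
        start = rng.randrange(genome_size - length + 1)
        out.append((start, start + length))
    return out


rng = random.Random(0)
genome_size = 10_000
query = covered_bp([(0, 4_000)])
target = [(1_200, 1_300), (1_500, 1_650), (3_000, 3_100)]
real = len(query & covered_bp(target))
lengths = [end - start for start, end in target]
shuffled = [
    len(query & covered_bp(shuffle_once(lengths, genome_size, rng)))
    for _ in range(1000)
]
p_value, fold = empirical_enrichment(real, shuffled)
print(real, p_value, fold)
```

Because all three target intervals lie inside the query in this toy case, no shuffle can exceed the real overlap, so the empirical p value is 0 (i.e., below 1/1000).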

Construction of iCRE sets

The “max F1 iCREs” set was generated by intersecting the UMRs with the cross-tissue ACR dataset using bedtools intersect (see above). The “all iCREs” set was generated by concatenating and merging (using bedtools merge) all UMRs and ACRs together with the CNSs generated via the method of Song et al. (2021). Coding sequences, as defined by the maize version 5 annotation, and a small (59 kb) set of high-confidence predicted tRNAs (retrieved from http://gtrnadb.ucsc.edu/) were subtracted from both iCRE sets using bedtools subtract. Genomic annotation for all regions was performed using the maize GFF file from PLAZA Monocots 5.0.
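The set logic behind the two iCRE collections can be sketched in Python. This is a simplified stand-in for the bedtools intersect, merge, and subtract calls described above: it works on one chromosome, merges each input before operating on it, and uses hypothetical names and toy coordinates.

```python
def merge(intervals):
    """Union of half-open intervals as a sorted, non-overlapping list."""
    out = []
    for start, end in sorted(intervals):
        if out and start <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], end))
        else:
            out.append((start, end))
    return out


def intersect(a, b):
    """Segments covered by both a and b."""
    a, b = merge(a), merge(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a, b):
    """Parts of a that are not covered by b."""
    b = merge(b)
    out = []
    for start, end in merge(a):
        cur = start
        for b_start, b_end in b:
            if b_end <= cur:
                continue
            if b_start >= end:
                break
            if b_start > cur:
                out.append((cur, b_start))
            cur = max(cur, b_end)
            if cur >= end:
                break
        if cur < end:
            out.append((cur, end))
    return out


umr = [(100, 400), (1000, 1200)]
acr = [(300, 600), (1100, 1150), (2000, 2100)]
cns = [(50, 120), (3000, 3050)]
coding = [(350, 380)]  # stands in for coding sequences and tRNAs

max_f1_icres = subtract(intersect(umr, acr), coding)
all_icres = subtract(merge(umr + acr + cns), coding)
print(max_f1_icres)
print(all_icres)
```

The "max F1" set keeps only regions supported by both UMRs and ACRs, whereas the "all iCREs" set is the union of the three evidence types; in both cases the coding segment is removed afterwards.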

Processing of the gene expression atlas of maize under drought conditions

Manual curation of Z. mays drought-related RNA-seq datasets was performed using CurSE (Vaneechoutte and Vandepoele, 2019) with default parameters. Twenty studies, summarized in Supplemental Table 2, were selected. A total of 434 samples and their corresponding FASTQ files were downloaded using fastq-dump -I --split-files --gzip (version 3.0.0; https://github.com/ncbi/sra-tools), and the RNA-seq data were processed using the nf-core/rnaseq pipeline (Ewels et al., 2020). Sample sheets were generated using the fastq_dir_to_samplesheet.py script from the nf-core/rnaseq repository. The transcriptome, genome FASTA, and GTF files from version 5 of the maize genome were downloaded from PLAZA Monocots 5.0 (Van Bel et al., 2022). Only stages 1 (pre-processing) and 3 (pseudo-alignment and quantification) of the nf-core/rnaseq pipeline were executed, using default parameters except for the specification of Salmon as the pseudo-aligner for quantification. The Salmon index was built using salmon index with default parameters and no decoy sequences (Patro et al., 2017). Samples with mapping rates below 65% were discarded, along with outlier samples within each study with mapping rates below 70% (Supplemental Figure 4A). Outliers were defined as samples with a mapping rate below Q1 − 1.5 × (Q3 − Q1), where Q1 and Q3 are the first and third quartiles, respectively. The control or treatment samples paired with samples discarded due to low mapping rates were also removed, leaving a total of 375 samples (141 after collapsing replicates).
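The mapping-rate filter can be illustrated with a short Python sketch applied to one study at a time. The quartile method is not specified above, so the "inclusive" method of statistics.quantiles is an assumption, as are the function name and the toy mapping rates. Removal of paired control or treatment samples is not shown.

```python
from statistics import quantiles


def keep_samples(mapping_rates, hard_floor=65.0, outlier_cap=70.0):
    """Drop samples below 65%, plus within-study low outliers below 70%."""
    rates = list(mapping_rates.values())
    q1, _, q3 = quantiles(rates, n=4, method="inclusive")
    lower_fence = q1 - 1.5 * (q3 - q1)
    kept = {}
    for sample, rate in mapping_rates.items():
        if rate < hard_floor:
            continue  # fails the global threshold
        if rate < lower_fence and rate < outlier_cap:
            continue  # low outlier within this study
        kept[sample] = rate
    return kept


# Toy mapping rates (%) for one hypothetical study
study = {"s1": 90, "s2": 91, "s3": 92, "s4": 93, "s5": 91,
         "s6": 92, "s7": 90, "s8": 68, "s9": 60}
kept = keep_samples(study)
print(sorted(kept))
```

In this toy study the lower fence is 87%, so sample s8 (68%) is removed as a within-study outlier below 70%, and s9 (60%) is removed by the global 65% threshold.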

Identification of drought-responsive DEGs

Differential expression analysis was performed using edgeR (Robinson et al., 2010) with default parameters. Genes with an expression level below one transcript per million (TPM) in at least two samples were discarded. DEGs were inferred using glmQLFit and glmQLFTest, with multiple testing correction and false discovery rate (FDR) calculations performed using the Benjamini–Hochberg procedure (Benjamini and Hochberg, 1995). For each contrast, genes with FDR < 0.05 and absolute log fold change >1 were classified as DE. Of the 80 contrasts analyzed, a large proportion yielded no or very few DEGs (Supplemental Figure 4B). Contrasts with fewer than 100 upregulated or downregulated genes were considered low quality and therefore discarded, leaving a total of 55 contrasts. Activation and repression consistency scores for DEGs were calculated as the number of contrasts in which the gene was upregulated or downregulated divided by the total number of contrasts in which it was DE.
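The DE classification and the consistency scores can be sketched in Python as follows. The edgeR model fitting itself is not reproduced; the sketch starts from per-contrast FDR and log fold change values, and the function names and example numbers are hypothetical.

```python
def classify(fdr, log_fc, fdr_max=0.05, lfc_min=1.0):
    """Return 'up', 'down', or None for one gene in one contrast."""
    if fdr < fdr_max and abs(log_fc) > lfc_min:
        return "up" if log_fc > 0 else "down"
    return None


def consistency_scores(calls):
    """Activation and repression consistency across the contrasts where a gene is DE."""
    n_de = sum(call is not None for call in calls)
    if n_de == 0:
        return 0.0, 0.0
    return calls.count("up") / n_de, calls.count("down") / n_de


# (FDR, log fold change) of one hypothetical gene in four contrasts
results = [(0.01, 2.3), (0.20, 1.5), (0.001, -1.8), (0.03, 1.2)]
calls = [classify(fdr, lfc) for fdr, lfc in results]
activation, repression = consistency_scores(calls)
print(calls, activation, repression)
```

The gene is DE in three of the four contrasts (upregulated twice, downregulated once), giving an activation consistency of 2/3 and a repression consistency of 1/3.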

iCRE-based GRN inference

The “max F1” and “all iCRE” sets were individually assigned to their closest gene using bedtools closest -a $icres -b $genes -t all. MINI-AC was used for motif enrichment and GRN inference (Manosalva Pérez et al., 2024) using the iCREs associated with each drought-responsive gene set. MINI-AC was run in genome-wide mode on version 5 of the maize genome with the “absolute base pair count” option set to “true.” An iCRE-based MINI-AC feature was added to the GitHub repository (https://github.com/VIB-PSB/MINI-AC/tree/main) to enable context-specific GRN inference in maize from a given list of genes. The coordinates of the neighboring non-coding regions and introns of each gene were obtained from the maize GFF file from PLAZA 5.0 and overlapped with the “medium” promoter file from the MINI-AC GitHub repository (https://github.com/VIB-PSB/MINI-AC/blob/main/data/zma_v5/zma_v5_promoter_5kbup_1kbdown_sorted.bed). This file was used as the input for MINI-AC, using the same parameters as above, to perform promoter-based motif enrichment. To compare the promoter-based and iCRE-based motif-enrichment approaches, precision was calculated as the number of enriched motifs (relative to all promoter regions) associated with DE TFs divided by the total number of enriched motifs. Recall was calculated as the number of enriched motifs associated with DE TFs divided by the total number of motifs associated with DE TFs. The F1 score was calculated as the harmonic mean of precision and recall. The AUPRC was calculated using the metrics.auc function from the Python library scikit-learn (Pedregosa et al., 2011) using default parameters.
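The assignment of regions to their closest gene, including ties, can be illustrated with a small Python sketch. It is only an approximation of bedtools closest -t all: distance is taken as the gap in bp between half-open intervals (0 when they overlap or touch), which may differ from the exact distance convention of bedtools, and all names and coordinates are hypothetical.

```python
def gap(a_start, a_end, b_start, b_end):
    """Gap in bp between two half-open intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)


def closest_genes(region, genes):
    """Return every gene tied for the smallest gap to the region."""
    chrom, start, end = region
    candidates = [
        (gap(start, end, g_start, g_end), name)
        for g_chrom, g_start, g_end, name in genes
        if g_chrom == chrom
    ]
    if not candidates:
        return []
    best = min(distance for distance, _ in candidates)
    return [(name, distance) for distance, name in candidates if distance == best]


genes = [
    ("chr1", 1000, 2000, "geneA"),
    ("chr1", 5000, 6000, "geneB"),
    ("chr2", 100, 900, "geneC"),
]
print(closest_genes(("chr1", 2500, 2600), genes))  # nearer to geneA
print(closest_genes(("chr1", 3450, 3550), genes))  # equidistant: both genes reported
print(closest_genes(("chr1", 1500, 1600), genes))  # inside geneA
```

Reporting all ties means that a region lying midway between two genes contributes to the regulatory set of both.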

eQTL analysis

eQTL data were obtained from Liu et al. (2020) (Supplemental Table 2). Drought-specific eQTLs were defined as those identified in either of the two drought conditions but not in the well-watered condition. Coordinates were converted from maize B73 version 4 to version 5 using the UCSC liftOver tool (Hinrichs et al., 2006) with default parameters and the chain file from MaizeGDB (www.maizegdb.org). Enrichment of eQTL lead SNPs was performed as described above in the section on enrichment between sets of genomic regions, using iCREs shuffled 1000 times within the non-coding genome to create the background model. The significance of the overlap between a MINI-AC GRN and the eQTL GRN was determined using a background set of 1000 randomized GRNs, constructed by randomly pairing TFs with target genes from the MINI-AC GRN to create random edges. A p value for the overlap was calculated as the proportion of times that the overlap with a randomized GRN was higher than the real overlap.
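The GRN overlap test can be sketched in Python. The exact randomization scheme is not detailed above; this sketch assumes that random edges are created by permuting the target column of the predicted GRN, which keeps the number of edges per TF fixed. The function names, seed, and toy edges are hypothetical.

```python
import random


def randomized_grn(edges, rng):
    """Re-pair TFs with targets by shuffling the target column."""
    tfs = [tf for tf, _ in edges]
    targets = [target for _, target in edges]
    rng.shuffle(targets)
    return set(zip(tfs, targets))


def overlap_p_value(grn, reference, n_random=1000, seed=0):
    """Share of randomized GRNs whose overlap with the reference exceeds the real one."""
    rng = random.Random(seed)
    edges = sorted(grn)
    real = len(grn & reference)
    higher = sum(
        len(randomized_grn(edges, rng) & reference) > real
        for _ in range(n_random)
    )
    return real, higher / n_random


predicted = {("tf1", "g1"), ("tf1", "g2"), ("tf2", "g3"), ("tf3", "g4")}
eqtl_grn = {("tf1", "g1"), ("tf2", "g3"), ("tf3", "g4"), ("tf9", "g9")}
real_overlap, p_value = overlap_p_value(predicted, eqtl_grn)
print(real_overlap, p_value)
```

In this toy case three of the four reference edges are recovered, and no permutation can recover more, so the empirical p value is 0.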

TE analysis, chromatin state enrichment, and STARR-seq

Coordinates for the transposable and repetitive elements in version 5 of the maize genome were downloaded from MaizeGDB (https://download.maizegdb.org/Zm-B73-REFERENCE-NAM-5.0/Zm-B73-REFERENCE-NAM-5.0.TE.gff3.gz) (Woodhouse et al., 2021). The coordinates were grouped by TE superfamily based on the third column (“feature”) of the GFF file. The genomic coordinates of chromatin states were obtained from the Plant Chromatin State Database (http://systemsbiology.cau.edu.cn/chromstates/download.php; Liu et al., 2018) and converted from version 3 of the maize genome to version 4, then to version 5 using liftOver with default parameters (Hinrichs et al., 2006). The chain files used by liftOver were downloaded from Ensembl (Martin et al., 2023; https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/assembly_chain/zea_mays/AGPv3_to_B73_RefGen_v4.chain.gz and https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-55/assembly_chain/zea_mays/B73_RefGen_v4_to_Zm-B73-REFERENCE-NAM-5.0.chain.gz). TEs fully contained within iCREs were identified using bedtools intersect -a $tes -b $icres -f 1 and grouped by TE superfamily. Enrichment of TEs, TE superfamilies, and other genomic region sets within the iCREs and chromatin states was evaluated as described earlier in this section. Shuffling was performed across the non-coding genome for iCRE enrichment and across the whole genome for chromatin state enrichment. Enrichment was visualized using the π value, calculated as −log10(p value) × fold enrichment. The p values were adjusted for multiple testing using the Benjamini–Hochberg method with a significance threshold of 0.01.
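The multiple-testing adjustment and the π value can be illustrated with the following Python sketch. The Benjamini–Hochberg step-up procedure is standard; the floor applied to empirical p values of 0 (here 1/1000, matching the number of shuffles) is an assumption made only so that the logarithm is defined, and the function names are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pi_value(p_value, fold_enrichment, floor=1e-3):
    """-log10(p) multiplied by fold enrichment, with p floored to avoid log(0)."""
    return -math.log10(max(p_value, floor)) * fold_enrichment


p_values = [0.001, 0.02, 0.03, 0.5]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.01 for q in q_values]
print(q_values, significant)
print(pi_value(0.0, 3.5))
```

With a floor of 1/1000, an empirical p value of 0 and a 3.5-fold enrichment give a π value of about 10.5.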

Processed STARR-seq data were obtained from GEO (GSE120304) (Ricci et al., 2019), and raw input library reads were downloaded from SRA (SRR10964903). The reads were trimmed using Trimmomatic v0.36 (parameters: SLIDINGWINDOW:3:20 LEADING:0 TRAILING:0 MINLEN:30) (Bolger et al., 2014) and aligned to the B73 RefGen v5 genome (PLAZA Monocots 5.0; Van Bel et al., 2022) using Bowtie v0.12.7 (bowtie -t -v 1 -X 2000 --best --strata -m 1 -S) (Langmead et al., 2009). Mapped input reads were converted to BED format (using bedtools bamtobed) and merged (using bedtools merge) with default parameters. Within these input regions, enhancer activity scores for TEs were calculated as the mean enhancer activity across all base pairs within each TE. The increase in enhancer activity for TEs located within iCREs compared with those outside iCREs was tested using a one-sided Mann–Whitney U test, with multiple testing correction via the Benjamini–Hochberg method.

For functional analysis, TEs were linked to their closest gene using bedtools closest -a $tes_in_icres -b $genes -t all -d. GO enrichment was evaluated using a gene–GO annotation file for version 5 of the maize genome (https://github.com/VIB-PSB/MINI-AC/blob/main/data/zma_v5/zma_v5_go_gene_file.txt) and a hypergeometric test with p value correction using the Benjamini–Hochberg method.
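
The hypergeometric test used for GO enrichment can be sketched with the standard library alone. This is an illustrative upper-tail calculation, not the authors' implementation; the function and argument names are hypothetical, and the gene universe is assumed to be all genes with a GO annotation.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p value, P(X >= k).

    N: genes in the universe
    K: genes in the universe annotated with the GO term
    n: genes in the query set (e.g. genes closest to TEs in iCREs)
    k: query genes annotated with the GO term
    """
    total = comb(N, n)
    # Sum the probabilities of observing k, k + 1, ... annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical counts: 20 genes, 5 annotated, 4 drawn, 2 of them annotated.
p = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
```

The resulting p values, one per GO term, would then be adjusted with the Benjamini–Hochberg method as stated above.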

iCREs linked to genes that were associated with the PIF/Harbinger and Tc1/Mariner TE superfamilies and that showed enrichment for the GO terms “defense response to bacterium,” “regulation of flower development,” or “response to salt stress” were used for motif enrichment analysis using MINI-AC with the same parameters described earlier in this section. The expression levels of genes associated with TEs within iCREs were determined using TPM values per gene and per sample from the drought gene expression atlas. Gene distances were obtained using bedtools closest, as described above.

Funding

This work was supported by a Bijzonder Onderzoeksfonds grant from Ghent University (grant agreement: BOF24Y2019001901) to N.M.P.

Acknowledgments

We would like to thank Dries Decap and Jan Fostier for their support and help running BLSSpeller. We also thank Chrystian Camilo Sosa Arango for sharing an RNA-seq differential expression pipeline used to call drought-responsive DEGs. No conflict of interest is declared.

Author contributions

K.V., N.M.P., and J.S. conceived and designed the research. J.S. performed the benchmark and complementarity analysis of CRE-profiling methods. N.M.P. and J.S. performed the analysis of the GRNs and TEs. I.D. ran msa_pipeline. A.M.F. processed the gene expression atlas of maize drought. N.M.P., J.S., and K.V. wrote the manuscript with input from other authors.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this work, the authors used ChatGPT to rephrase content written by the authors. After using this tool/service, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Published: May 13, 2025

Footnotes

Supplemental information is available at Plant Communications Online.

Supplemental information

Document S1. Supplemental Figures 1–10 and Supplemental Datasets 1–3

Supplemental Datasets: files containing the genomic coordinates of the CNSs (BLSSpeller, msa_pipeline, FunTFBS, Song-2021, and Conservatory), the accessible chromatin region compendium (ACRs), and the unmethylated regions (UMRs) used to generate the iCREs, along with both the “all iCRE” and “max F1 iCRE” sets, are available at https://doi.org/10.5281/zenodo.15143951. These datasets are provided for both versions 4 and 5 of the maize genome.

Data S1. Supplemental Tables 1–8
Document S2. Article plus supplemental information

References

  1. Batista R.A., Moreno-Romero J., Qiu Y., van Boven J., Santos-González J., Figueiredo D.D., Köhler C. The MADS-box transcription factor PHERES1 controls imprinting in the endosperm by binding to domesticated transposons. Elife. 2019;8. doi: 10.7554/eLife.50541.
  2. Benjamini Y., Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. Roy. Stat. Soc. B. 1995;57:289–300.
  3. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  4. Boyle A.P., Davis S., Shulha H.P., Meltzer P., Margulies E.H., Weng Z., Furey T.S., Crawford G.E. High-Resolution Mapping and Characterization of Open Chromatin across the Genome. Cell. 2008;132:311–322. doi: 10.1016/j.cell.2007.12.014.
  5. Bravo González-Blas C., De Winter S., Hulselmans G., Hecker N., Matetovici I., Christiaens V., Poovathingal S., Wouters J., Aibar S., Aerts S. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat. Methods. 2023;20:1–13. doi: 10.1038/s41592-023-01938-4.
  6. Breuer C., Morohashi K., Kawamura A., Takahashi N., Ishida T., Umeda M., Grotewold E., Sugimoto K. Transcriptional repression of the APC/C activator CCS52A1 promotes active termination of cell growth. EMBO J. 2012;31:4488–4501. doi: 10.1038/emboj.2012.294.
  7. Britten R.J., Davidson E.H. Repetitive and non-repetitive DNA sequences and a speculation on the origins of evolutionary novelty. Q. Rev. Biol. 1971;46:111–138. doi: 10.1086/406830.
  8. Brooks M.D., Juang C.-L., Katari M.S., Alvarez J.M., Pasquino A., Shih H.-J., Huang J., Shanks C., Cirrone J., Coruzzi G.M. ConnecTF: A platform to integrate transcription factor-gene interactions and validate regulatory networks. Plant Physiol. 2021;185:49–66. doi: 10.1093/plphys/kiaa012.
  9. Buenrostro J.D., Wu B., Chang H.Y., Greenleaf W.J. ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide. Curr. Protoc. Mol. Biol. 2015;109:21.29.1–21.29.9. doi: 10.1002/0471142727.mb2129s109.
  10. Burgess S.J., Reyna-Llorens I., Stevenson S.R., Singh P., Jaeger K., Hibberd J.M. Genome-Wide Transcription Factor Binding in Leaves from C3 and C4 Grasses. Plant Cell. 2019;31:2297–2314. doi: 10.1105/TPC.19.00078.
  11. Caro E., Desvoyes B., Gutierrez C. GTL1 keeps cell growth and nuclear ploidy under control. EMBO J. 2012;31:4483–4485. doi: 10.1038/emboj.2012.311.
  12. Chen D., Yan W., Fu L.Y., Kaufmann K. Architecture of gene regulatory networks controlling flower development in Arabidopsis thaliana. Nat. Commun. 2018;9:4534–4613. doi: 10.1038/s41467-018-06772-3.
  13. Chen X., Li C., Wang H., Guo Z. WRKY transcription factors: evolution, binding, and action. Phytopathol. Res. 2019;1:13. doi: 10.1186/s42483-019-0022-x.
  14. Clark N.M., Nolan T.M., Wang P., Song G., Montes C., Valentine C.T., Guo H., Sozzani R., Yin Y., Walley J.W. Integrated omics networks reveal the temporal signaling events of brassinosteroid response in Arabidopsis. Nat. Commun. 2021;12:5858. doi: 10.1038/s41467-021-26165-3.
  15. Cooper G.M., Brown C.D. Qualifying the relationship between sequence conservation and molecular function. Genome Res. 2008;18:201–205. doi: 10.1101/gr.7205808.
  16. Crisp P.A., Marand A.P., Noshay J.M., Zhou P., Lu Z., Schmitz R.J., Springer N.M. Stable unmethylated DNA demarcates expressed genes and their cis-regulatory space in plant genomes. Proc. Natl. Acad. Sci. USA. 2020;117:23991–24000. doi: 10.1073/pnas.2010250117.
  17. Crisp P.A., Ganguly D.R., Smith A.B., Murray K.D., Estavillo G.M., Searle I., Ford E., Bogdanović O., Lister R., Borevitz J.O., et al. Rapid Recovery Gene Downregulation during Excess-Light Stress and Recovery in Arabidopsis. Plant Cell. 2017;29:1836–1863. doi: 10.1105/tpc.16.00828.
  18. Davydov E.V., Goode D.L., Sirota M., Cooper G.M., Sidow A., Batzoglou S. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 2010;6. doi: 10.1371/journal.pcbi.1001025.
  19. De Clercq I., Van de Velde J., Luo X., Liu L., Storme V., Van Bel M., Pottie R., Vaneechoutte D., Van Breusegem F., Vandepoele K. Integrative inference of transcriptional networks in Arabidopsis yields novel ROS signalling regulators. Nat. Plants. 2021;7:500–513. doi: 10.1038/s41477-021-00894-1.
  20. De Witte D., Van De Velde J., Decap D., Van Bel M., Audenaert P., Demeester P., Dhoedt B., Vandepoele K., Fostier J. BLSSpeller: Exhaustive comparative discovery of conserved cis-regulatory elements. Bioinformatics. 2015;31:3758–3766. doi: 10.1093/bioinformatics/btv466.
  21. Deneweth J., Van de Peer Y., Vermeirssen V. Nearby transposable elements impact plant stress gene regulatory networks: a meta-analysis in A. thaliana and S. lycopersicum. BMC Genom. 2022;23:18. doi: 10.1186/s12864-021-08215-8.
  22. Depuydt T., De Rybel B., Vandepoele K. Charting plant gene functions in the multi-omics and single-cell era. Trends Plant Sci. 2023;28:283–296. doi: 10.1016/j.tplants.2022.09.008.
  23. Ding K., Sun S., Luo Y., Long C., Zhai J., Zhai Y., Wang G. PlantCADB: A Comprehensive Plant Chromatin Accessibility Database. Genom. Proteom. Bioinform. 2023;21:311–323. doi: 10.1016/j.gpb.2022.10.005.
  24. Ding P., Sakai T., Krishna Shrestha R., Manosalva Perez N., Guo W., Ngou B.P.M., He S., Liu C., Feng X., Zhang R., et al. Chromatin accessibility landscapes activated by cell-surface and intracellular immune receptors. J. Exp. Bot. 2021;72:7927–7941. doi: 10.1093/jxb/erab373.
  25. Dong P., Tu X., Chu P.-Y., Lü P., Zhu N., Grierson D., Du B., Li P., Zhong S. 3D Chromatin Architecture of Large Plant Genomes Determined by Local A/B Compartments. Mol. Plant. 2017;10:1497–1509. doi: 10.1016/j.molp.2017.11.005.
  26. Dubois M., Inzé D. Plant growth under suboptimal water conditions: early responses and methods to study them. J. Exp. Bot. 2020;71:1706–1722. doi: 10.1093/jxb/eraa037.
  27. Eichten S.R., Springer N.M. Minimal evidence for consistent changes in maize DNA methylation patterns following environmental stress. Front. Plant Sci. 2015;6. doi: 10.3389/fpls.2015.00308.
  28. Eichten S.R., Vaughn M.W., Hermanson P.J., Springer N.M. Variation in DNA Methylation Patterns is More Common among Maize Inbreds than among Tissues. Plant Genome. 2013;6. doi: 10.3835/plantgenome2012.06.0009.
  29. Eli R.M., Daniel L.V., Hank W.B., Edward S.B. Open chromatin reveals the functional maize genome. Proc. Natl. Acad. Sci. USA. 2016;113:3177–3184. doi: 10.1073/pnas.1525244113.
  30. Emms D.M., Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238. doi: 10.1186/s13059-019-1832-y.
  31. Ewels P.A., Peltzer A., Fillinger S., Patel H., Alneberg J., Wilm A., Garcia M.U., Di Tommaso P., Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat. Biotechnol. 2020;38:276–278. doi: 10.1038/s41587-020-0439-x.
  32. Fagny M., Kuijjer M.L., Stam M., Joets J., Turc O., Rozière J., Pateyron S., Venon A., Vitte C. Identification of Key Tissue-Specific, Biological Processes by Integrating Enhancer Information in Maize Gene Regulatory Networks. Front. Genet. 2020;11. doi: 10.3389/fgene.2020.606285.
  33. Ferrari C., Manosalva Pérez N., Vandepoele K. MINI-EX: Integrative inference of single-cell gene regulatory networks in plants. Mol. Plant. 2022;15:1807–1824. doi: 10.1016/j.molp.2022.10.016.
  34. Feschotte C. Transposable elements and the evolution of regulatory networks. Nat. Rev. Genet. 2008;9:397–405. doi: 10.1038/nrg2337.
  35. Fleck J.S., Jansen S.M.J., Wollny D., Zenk F., Seimiya M., Jain A., Okamoto R., Santel M., He Z., Camp J.G., Treutlein B. Inferring and perturbing cell fate regulomes in human brain organoids. Nature. 2023;621:365–372. doi: 10.1038/s41586-022-05279-8.
  36. Fujimori T., Yamashino T., Kato T., Mizuno T. Circadian-Controlled Basic/Helix-Loop-Helix Factor, PIL6, Implicated in Light-Signal Transduction in Arabidopsis thaliana. Plant Cell Physiol. 2004;45:1078–1086. doi: 10.1093/pcp/pch124.
  37. Fullwood M.J., Liu M.H., Pan Y.F., Liu J., Xu H., Mohamed Y.B., Orlov Y.L., Velkov S., Ho A., Mei P.H., et al. An oestrogen-receptor-α-bound human chromatin interactome. Nature. 2009;462:58–64. doi: 10.1038/nature08497.
  38. Ganguly D.R., Crisp P.A., Eichten S.R., Pogson B.J. The Arabidopsis DNA Methylome Is Stable under Transgenerational Drought Stress. Plant Physiol. 2017;175:1893–1912. doi: 10.1104/pp.17.00744.
  39. Gaudinier A., Rodriguez-Medina J., Zhang L., Olson A., Liseron-Monfils C., Bågman A.M., Foret J., Abbitt S., Tang M., Li B., et al. Transcriptional regulation of nitrogen-associated metabolism and growth. Nature. 2018;563:259–264. doi: 10.1038/s41586-018-0656-3.
  40. Grau J., Franco-Zorrilla J.M. TDTHub, a web server tool for the analysis of transcription factor binding sites in plants. Plant J. 2022;111:1203–1215. doi: 10.1111/tpj.15873.
  41. Gulzar F., Fu J., Zhu C., Yan J., Li X., Meraj T.A., Shen Q., Hassan B., Wang Q. Maize WRKY Transcription Factor ZmWRKY79 Positively Regulates Drought Tolerance through Elevating ABA Biosynthesis. Int. J. Mol. Sci. 2021;22. doi: 10.3390/ijms221810080.
  42. Haudry A., Platts A.E., Vello E., Hoen D.R., Leclercq M., Williamson R.J., Forczek E., Joly-Lopez Z., Steffen J.G., Hazzouri K.M., et al. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat. Genet. 2013;45:891–898. doi: 10.1038/ng.2684.
  43. Hendelman A., Zebell S., Rodriguez-Leal D., Dukler N., Robitaille G., Wu X., Kostyun J., Tal L., Wang P., Bartlett M.E., et al. Conserved pleiotropy of an ancient plant homeobox gene uncovered by cis-regulatory dissection. Cell. 2021;184:1724–1739.e16. doi: 10.1016/j.cell.2021.02.001.
  44. Hinrichs A.S., Karolchik D., Baertsch R., Barber G.P., Bejerano G., Clawson H., Diekhans M., Furey T.S., Harte R.A., Hsu F., et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34:D590–D598. doi: 10.1093/nar/gkj144.
  45. Hirsch C.D., Springer N.M. Transposable element influences on gene expression in plants. Biochim. Biophys. Acta. Gene Regul. Mech. 2017;1860:157–165. doi: 10.1016/j.bbagrm.2016.05.010.
  46. Jiang J., Lyu P., Li J., Huang S., Tao J., Blackshaw S., Qian J., Wang J. IReNA: Integrated regulatory network analysis of single-cell transcriptomes and chromatin accessibility profiles. iScience. 2022;25. doi: 10.1016/j.isci.2022.105359.
  47. Jiao Y., Peluso P., Shi J., Liang T., Stitzer M.C., Wang B., Campbell M.S., Stein J.C., Wei X., Chin C.-S., et al. Improved maize reference genome with single-molecule technologies. Nature. 2017;546:524–527. doi: 10.1038/nature22971.
  48. Kajala K., Gouran M., Shaar-Moshe L., Mason G.A., Rodriguez-Medina J., Kawa D., Pauluzzi G., Reynoso M., Canto-Pastor A., Manzano C., et al. Innovation, conservation, and repurposing of gene function in root cell type development. Cell. 2021;184:3333–3348.e19. doi: 10.1016/j.cell.2021.04.024.
  49. Kamimoto K., Stringa B., Hoffmann C.M., Jindal K., Solnica-Krezel L., Morris S.A. Dissecting cell identity via network inference and in silico gene perturbation. Nature. 2023;614:742–751. doi: 10.1038/s41586-022-05688-9.
  50. Kawakatsu T., Stuart T., Valdes M., Breakfield N., Schmitz R.J., Nery J.R., Urich M.A., Han X., Lister R., Benfey P.N., et al. Unique cell-type-specific patterns of DNA methylation in the root meristem. Nat. Plants. 2016;2. doi: 10.1038/NPLANTS.2016.58.
  51. Kent W.J., Baertsch R., Hinrichs A., Miller W., Haussler D. Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc. Natl. Acad. Sci. USA. 2003;100:11484–11489. doi: 10.1073/pnas.1932072100.
  52. Kimura M. Recent development of the neutral theory viewed from the Wrightian tradition of theoretical population genetics. Proc. Natl. Acad. Sci. USA. 1991;88:5969–5973. doi: 10.1073/pnas.88.14.5969.
  53. Krishnaswamy S., Verma S., Rahman M.H., Kav N.N.V. Functional characterization of four APETALA2-family genes (RAP2.6, RAP2.6L, DREB19 and DREB26) in Arabidopsis. Plant Mol. Biol. 2011;75:107–127. doi: 10.1007/s11103-010-9711-7.
  54. Kulkarni S.R., Vaneechoutte D., Van de Velde J., Vandepoele K. TF2Network: predicting transcription factor regulators and gene regulatory networks in Arabidopsis using publicly available binding site information. Nucleic Acids Res. 2018;46:e31. doi: 10.1093/nar/gkx1279.
  55. Kunarso G., Chia N.-Y., Jeyakani J., Hwang C., Lu X., Chan Y.-S., Ng H.-H., Bourque G. Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nat. Genet. 2010;42:631–634. doi: 10.1038/ng.600.
  56. Lai X., Blanc-Mathieu R., GrandVuillemin L., Huang Y., Stigliani A., Lucas J., Thévenon E., Loue-Manifel J., Turchi L., Daher H., et al. The LEAFY floral regulator displays pioneer transcription factor properties. Mol. Plant. 2021;14:829–837. doi: 10.1016/j.molp.2021.03.004.
  57. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
  58. Le T.-N., Schumann U., Smith N.A., Tiwari S., Au P.C.K., Zhu Q.-H., Taylor J.M., Kazan K., Llewellyn D.J., Zhang R., et al. DNA demethylases target promoter transposable elements to positively regulate stress responsive genes in Arabidopsis. Genome Biol. 2014;15:458. doi: 10.1186/s13059-014-0458-3.
  59. Li E., Liu H., Huang L., Zhang X., Dong X., Song W., Zhao H., Lai J. Long-range interactions between proximal and distal regulatory regions in maize. Nat. Commun. 2019;10:2633. doi: 10.1038/s41467-019-10603-4.
  60. Li X., Wang Y., Cai C., Ji J., Han F., Zhang L., Chen S., Zhang L., Yang Y., Tang Q., et al. Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea. Nat. Genet. 2024;56:517–529. doi: 10.1038/s41588-024-01655-4.
  61. Lieberman-Aiden E., van Berkum N.L., Williams L., Imakaev M., Ragoczy T., Telling A., Amit I., Lajoie B.R., Sabo P.J., Dorschner M.O., et al. Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science. 2009;326:289–293. doi: 10.1126/science.1181369.
  62. Liu S., Li C., Wang H., Wang S., Yang S., Liu X., Yan J., Li B., Beatty M., Zastrow-Hayes G., et al. Mapping regulatory variants controlling gene expression in drought response and tolerance in maize. Genome Biol. 2020;21:163. doi: 10.1186/s13059-020-02069-1.
  63. Liu Y., Tian T., Zhang K., You Q., Yan H., Zhao N., Yi X., Xu W., Su Z. PCSD: a plant chromatin state database. Nucleic Acids Res. 2018;46:D1157–D1167. doi: 10.1093/nar/gkx919.
  64. Lu D., Wang T., Persson S., Mueller-Roeber B., Schippers J.H.M. Transcriptional control of ROS homeostasis by KUODA1 regulates cell expansion during leaf development. Nat. Commun. 2014;5:3767. doi: 10.1038/ncomms4767.
  65. Lu Z., Hofmeister B.T., Vollmers C., DuBois R.M., Schmitz R.J. Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes. Nucleic Acids Res. 2017;45. doi: 10.1093/nar/gkw1179.
  66. Lu Z., Marand A.P., Ricci W.A., Ethridge C.L., Zhang X., Schmitz R.J. The prevalence, evolution and chromatin signatures of plant regulatory elements. Nat. Plants. 2019;5:1250–1259. doi: 10.1038/s41477-019-0548-z.
  67. Lynch V.J., Leclerc R.D., May G., Wagner G.P. Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nat. Genet. 2011;43:1154–1159. doi: 10.1038/ng.917.
  68. Maher K.A., Bajic M., Kajala K., Reynoso M., Pauluzzi G., West D.A., Zumstein K., Woodhouse M., Bubb K., Dorrity M.W., et al. Profiling of accessible chromatin regions across multiple plant species and cell types reveals common gene regulatory principles and new control modules. Plant Cell. 2018;30:15–36. doi: 10.1105/tpc.17.00581.
  69. Makarevitch I., Waters A.J., West P.T., Stitzer M., Hirsch C.N., Ross-Ibarra J., Springer N.M. Transposable Elements Contribute to Activation of Maize Genes in Response to Abiotic Stress. PLoS Genet. 2015;11. doi: 10.1371/journal.pgen.1004915.
  70. Manosalva Pérez N., Ferrari C., Engelhorn J., Depuydt T., Nelissen H., Hartwig T., Vandepoele K. MINI-AC: inference of plant gene regulatory networks using bulk or single-cell accessible chromatin profiles. Plant J. 2024;117:280–301. doi: 10.1111/tpj.16483.
  71. Mapleson D., Garcia Accinelli G., Kettleborough G., Wright J., Clavijo B.J. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics. 2017;33:574–576. doi: 10.1093/bioinformatics/btw663.
  72. Marand A.P., Chen Z., Gallavotti A., Schmitz R.J. A cis-regulatory atlas in maize at single-cell resolution. Cell. 2021;184:3041–3055.e21. doi: 10.1016/j.cell.2021.04.014.
  73. Marand A.P., Eveland A.L., Kaufmann K., Springer N.M. cis-Regulatory Elements in Plant Development, Adaptation, and Evolution. Annu. Rev. Plant Biol. 2023;74:111–137. doi: 10.1146/annurev-arplant-070122-030236.
  74. Martin F.J., Amode M.R., Aneja A., Austine-Orimoloye O., Azov A.G., Barnes I., Becker A., Bennett R., Berry A., Bhai J., et al. Ensembl 2023. Nucleic Acids Res. 2023;51:D933–D941. doi: 10.1093/nar/gkac958.
  75. Noll M. Subunit structure of chromatin. Nature. 1974;251:249–251. doi: 10.1038/251249a0.
  76. Noshay J.M., Marand A.P., Anderson S.N., Zhou P., Mejia Guerra M.K., Lu Z., O’Connor C.H., Crisp P.A., Hirsch C.N., Schmitz R.J., Springer N.M. Assessing the regulatory potential of transposable elements using chromatin accessibility profiles of maize transposons. Genetics. 2021;217:1–13. doi: 10.1093/genetics/iyaa003.
  77. Oka R., Zicola J., Weber B., Anderson S.N., Hodgman C., Gent J.I., Wesselink J.J., Springer N.M., Hoefsloot H.C.J., Turck F., Stam M. Genome-wide mapping of transcriptional enhancer candidates using DNA and chromatin features in maize. Genome Biol. 2017;18:137–224. doi: 10.1186/s13059-017-1273-4.
  78. Oudet P., Gross-Bellard M., Chambon P. Electron microscopic and biochemical evidence that chromatin structure is a repeating unit. Cell. 1975;4:281–300. doi: 10.1016/0092-8674(75)90149-X.
  79. Patro R., Duggal G., Love M.I., Irizarry R.A., Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14:417–419. doi: 10.1038/nmeth.4197.
  80. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
  81. Quinlan A.R., Hall I.M. BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
  82. Raines T., Shanks C., Cheng C.-Y., McPherson D., Argueso C.T., Kim H.J., Franco-Zorrilla J.M., López-Vidriero I., Solano R., Vaňková R., et al. The cytokinin response factors modulate root and shoot growth and promote leaf senescence in Arabidopsis. Plant J. 2016;85:134–147. doi: 10.1111/tpj.13097.
  83. Rebollo R., Romanish M.T., Mager D.L. Transposable Elements: An Abundant and Natural Source of Regulatory Sequences for Host Genes. Annu. Rev. Genet. 2012;46:21–42. doi: 10.1146/annurev-genet-110711-155621.
  84. Reynoso M.A., Kajala K., Bajic M., West D.A., Pauluzzi G., Yao A.I., Hatch K., Zumstein K., Woodhouse M., Rodriguez-Medina J., et al. Evolutionary flexibility in flooding response circuitry in angiosperms. Science. 2019;365:1291–1295. doi: 10.1126/science.aax8862.
  85. Ricci W.A., Lu Z., Ji L., Marand A.P., Ethridge C.L., Murphy N.G., Noshay J.M., Galli M., Mejía-Guerra M.K., Colomé-Tatché M., et al. Widespread long-range cis-regulatory elements in the maize genome. Nat. Plants. 2019;5:1237–1249. doi: 10.1038/s41477-019-0547-0.
  86. Robinson M.D., McCarthy D.J., Smyth G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
  87. Roquis D., Robertson M., Yu L., Thieme M., Julkowska M., Bucher E. Genomic impact of stress-induced transposable element mobility in Arabidopsis. Nucleic Acids Res. 2021;49:10431–10447. doi: 10.1093/nar/gkab828.
  88. Savadel S.D., Hartwig T., Turpin Z.M., Vera D.L., Lung P.-Y., Sui X., Blank M., Frommer W.B., Dennis J.H., Zhang J., Bass H.W. The native cistrome and sequence motif families of the maize ear. PLoS Genet. 2021;17. doi: 10.1371/journal.pgen.1009689.
  89. Schmitz R.J., Grotewold E., Stam M. Cis-regulatory sequences in plants: Their importance, discovery, and future challenges. Plant Cell. 2022;34:718–741. doi: 10.1093/PLCELL/KOAB281.
  90. Siepel A., Bejerano G., Pedersen J.S., Hinrichs A.S., Hou M., Rosenbloom K., Clawson H., Spieth J., Hillier L.W., Richards S., et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–1050. doi: 10.1101/gr.3715005.
  91. Song B., Buckler E.S., Wang H., Wu Y., Rees E., Kellogg E.A., Gates D.J., Khaipho-Burch M., Bradbury P.J., Ross-Ibarra J., et al. Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize. Genome Res. 2021;31:1245–1257. doi: 10.1101/gr.266528.120.
  92. Soppe W.J.J., Jasencakova Z., Houben A., Kakutani T., Meister A., Huang M.S., Jacobsen S.E., Schubert I., Fransz P.F. DNA methylation controls histone H3 lysine 9 methylation and heterochromatin assembly in Arabidopsis. EMBO J. 2002;21:6549–6559. doi: 10.1093/emboj/cdf657.
  93. Strader L., Weijers D., Wagner D. Plant transcription factors — being in the right place with the right company. Curr. Opin. Plant Biol. 2022;65. doi: 10.1016/j.pbi.2021.102136.
  94. Sullivan A.M., Arsovski A.A., Lempe J., Bubb K.L., Weirauch M.T., Sabo P.J., Sandstrom R., Thurman R.E., Neph S., Reynolds A.P., et al. Mapping and dynamics of regulatory DNA and transcription factor networks in A. thaliana. Cell Rep. 2014;8:2015–2030. doi: 10.1016/j.celrep.2014.08.019.
  95. Sun Y., Dong L., Zhang Y., Lin D., Xu W., Ke C., Han L., Deng L., Li G., Jackson D., et al. 3D genome architecture coordinates trans and cis regulation of differentially expressed ear and tassel genes in maize. Genome Biol. 2020;21:143. doi: 10.1186/s13059-020-02063-7.
  96. Tenorio Berrío R., Verstaen K., Vandamme N., Pevernagie J., Achon I., van Duyse J., van Isterdael G., Saeys Y., de Veylder L., Inzé D., Dubois M. Single-cell transcriptomics sheds light on the identity and metabolism of developing leaf cells. Plant Physiol. 2022;188:898–918. doi: 10.1093/plphys/kiab489.
  97. Tian F., Yang D.-C., Meng Y.-Q., Jin J., Gao G. PlantRegMap: charting functional regulatory maps in plants. Nucleic Acids Res. 2020;48:D1104–D1113. doi: 10.1093/nar/gkz1020.
  98. Tu X., Mejía-Guerra M.K., Valdes Franco J.A., Tzeng D., Chu P.Y., Shen W., Wei Y., Dai X., Li P., Buckler E.S., et al. Reconstructing the maize leaf regulatory network using ChIP-seq data of 104 transcription factors. Nat. Commun. 2020;11. doi: 10.1038/s41467-020-18832-8.
  99. Van Bel M., Silvestri F., Weitz E.M., Kreft L., Botzki A., Coppens F., Vandepoele K. PLAZA 5.0: extending the scope and power of comparative and functional genomics in plants. Nucleic Acids Res. 2022;50:D1468–D1474. doi: 10.1093/nar/gkab1024.
  100. Van De Velde J., Heyndrickx K.S., Vandepoele K. Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis. Plant Cell. 2014;26:2729–2745. doi: 10.1105/tpc.114.127001.
  101. Vaneechoutte D., Vandepoele K. Curse: Building expression atlases and co-expression networks from public RNA-Seq data. Bioinformatics. 2019;35:2880–2881. doi: 10.1093/bioinformatics/bty1052.
  102. Wang C.-T., Ru J.-N., Liu Y.-W., Yang J.-F., Li M., Xu Z.-S., Fu J.-D. The Maize WRKY Transcription Factor ZmWRKY40 Confers Drought Resistance in Transgenic Arabidopsis. Int. J. Mol. Sci. 2018;19:2580. doi: 10.3390/ijms19092580.
  103. Wang T., Zeng J., Lowe C.B., Sellers R.G., Salama S.R., Yang M., Burgess S.M., Brachmann R.K., Haussler D. Species-specific endogenous retroviruses shape the transcriptional network of the human tumor suppressor protein p53. Proc. Natl. Acad. Sci. USA. 2007;104:18613–18618. doi: 10.1073/pnas.0703637104.
  104. Wang X., Bai L., Bryant G.O., Ptashne M. Nucleosomes and the accessibility problem. Trends Genet. 2011;27:487–492. doi: 10.1016/j.tig.2011.09.001.
  105. Weintraub H., Groudine M. Chromosomal Subunits in Active Genes Have an Altered Conformation. Science. 1976;193:848–856. doi: 10.1126/science.948749.
  106. Wicker T., Sabot F., Hua-Van A., Bennetzen J.L., Capy P., Chalhoub B., Flavell A., Leroy P., Morgante M., Panaud O., et al. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 2007;8:973–982. doi: 10.1038/nrg2165.
  107. Wilkins O., Hafemeister C., Plessis A., Holloway-Phillips M.-M., Pham G.M., Nicotra A.B., Gregorio G.B., Jagadish S.V.K., Septiningsih E.M., Bonneau R., Purugganan M. EGRINs (Environmental Gene Regulatory Influence Networks) in Rice That Function in the Response to Water Deficit, High Temperature, and Agricultural Environments. Plant Cell. 2016;28:2365–2384. doi: 10.1105/tpc.16.00158.
  108. Woodhouse M.R., Cannon E.K., Portwood J.L., Harper L.C., Gardiner J.M., Schaeffer M.L., Andorf C.M. A pan-genomic approach to genome databases using maize as a model system. BMC Plant Biol. 2021;21:385. doi: 10.1186/s12870-021-03173-5.
  109. Wu Y., Johnson L., Song B., Romay C., Stitzer M., Siepel A., Buckler E., Scheben A. A multiple alignment workflow shows the effect of repeat masking and parameter tuning on alignment in plants. Plant Genome. 2022;15. doi: 10.1002/tpg2.20204.
  110. Yoo C.Y., Pence H.E., Jin J.B., Miura K., Gosney M.J., Hasegawa P.M., Mickelbart M.V. The Arabidopsis GTL1 Transcription Factor Regulates Water Use Efficiency and Drought Tolerance by Modulating Stomatal Density via Transrepression of SDD1. Plant Cell. 2010;22:4128–4141. doi: 10.1105/tpc.110.078691.
  111. Yuan G.-C., Liu Y.-J., Dion M.F., Slack M.D., Wu L.F., Altschuler S.J., Rando O.J. Genome-Scale Identification of Nucleosome Positions in S. cerevisiae. Science. 2005;309:626–630. doi: 10.1126/science.1112178.
  112. Zander M., Lewsey M.G., Clark N.M., Yin L., Bartlett A., Saldierna Guzmán J.P., Hann E., Langford A.E., Jow B., Wise A., et al. Integrated multi-omics framework of the plant response to jasmonic acid. Nat. Plants. 2020;6:290–302. doi: 10.1038/s41477-020-0605-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  113. Zhao H., Zhang W., Chen L., Wang L., Marand A.P., Wu Y., Jiang J. Proliferation of Regulatory DNA Elements Derived from Transposable Elements in the Maize Genome. Plant Physiol. 2018;176:2789–2803. doi: 10.1104/pp.17.01467. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Zhao L., Yan J., Xiang Y., Sun Y., Zhang A. ZmWRKY104 Transcription Factor Phosphorylated by ZmMPK6 Functioning in ABA-Induced Antioxidant Defense and Enhance Drought Tolerance in Maize. Biology. 2021;10:893. doi: 10.3390/biology10090893. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Zhou P., Li Z., Magnusson E., Gomez Cano F., Crisp P.A., Noshay J.M., Grotewold E., Hirsch C.N., Briggs S.P., Springer N.M. Meta gene regulatory networks in maize highlight functionally relevant regulatory interactions. Plant Cell. 2020;32:1377–1396. doi: 10.1105/tpc.20.00080. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Document S1. Supplemental Figures 1–10 and Supplemental Datasets 1–3

Supplemental Datasets: files containing the genomic coordinates of the CNSs (BLSSpeller, msa_pipeline, FunTFBS, Song-2021, and Conservatory), the accessible chromatin region compendium (ACRs), and the unmethylated regions (UMRs) used to generate the iCREs, along with both the “all iCRE” and “max F1 iCRE” sets, are available at https://doi.org/10.5281/zenodo.15143951. These datasets are provided for both versions 4 and 5 of the maize genome.

mmc1.pdf (4MB, pdf)
Data S1. Supplemental Tables 1–8
mmc2.xlsx (192.3KB, xlsx)
Document S2. Article plus supplemental information
mmc3.pdf (7.4MB, pdf)

Articles from Plant Communications are provided here courtesy of Elsevier
