2026 Mar 18;125(6):e70774. doi: 10.1111/tpj.70774

Insights into glandular trichome biology from analysis of organ‐specific gene expression programmes in cannabis, hop and tomato

Muluneh Tamiru‐Oli 1,2,3,#, Bhavna Hurgobin 1,2,3,#, Oliver Berkowitz 1,2,3, Lingling Yin 1,2, Ricarda Jost 1,2,3, Sofya Gvaramiya 1,2,3, Matthew Welling 1,2,3, Monika S Doblin 1,2,3, Antony Bacic 1,2,3, James Whelan 4, Mathew G Lewsey 1,2,3,5,
PMCID: PMC12999219  PMID: 41849781

SUMMARY

Glandular trichomes (GTs) are epidermal outgrowths in which diverse specialised (secondary) metabolites are synthesised and stored. Cannabis (Cannabis sativa L.) and its close relative hop (Humulus lupulus L.) have pharmaceutical and industrial significance due to the presence of these metabolites in their GTs. We examined the conservation or divergence of the specific transcriptional programmes underlying GT biology. To achieve this, we generated transcriptome atlases of trichomes, flower, leaf, stem and root for cannabis, hop and tomato. We found that 12.9, 10.1 and 16.8% of cannabis, hop and tomato genes, respectively, were expressed organ/tissue specifically across all organs/tissues. Transcription factors (TFs) on average accounted for 7.5% of the organ‐specific transcriptome and likely regulate organ‐specific functions. We also conducted weighted gene co‐expression network analysis and gene regulatory network (GRN) analysis to identify key regulators of GT function across the species and validated our predictions by DNA affinity purification sequencing for a subset of the cannabis and tomato GT TFs. The GRNs specific to cannabis or hop GTs were enriched for TFs and target genes associated with specialised metabolism, reflecting their species‐specific nature. Conversely, the shared GRN components (identified via orthology analysis) were involved in highly conserved processes, such as flavonoid biosynthesis, solute transport and metabolite storage. Together, these GRNs and the associated transcriptome atlases are valuable resources to improve our knowledge of GT function and organ‐specific genome regulation.

Keywords: gene regulatory networks, glandular trichomes, hierarchical orthogroups, specialised metabolites, organ‐specific gene expression, transcription factors, Cannabis sativa, Humulus lupulus, Solanum lycopersicum

Significance Statement

This study of three glandular trichome‐producing species (cannabis, hop and tomato) defines tissue‐specific transcriptomic atlases for five organs (trichome, flower, leaf, stem and root), providing co‐expression and gene regulatory networks and identifying candidate regulators of specialised metabolism in trichomes of each species. A subset of the putative cannabis and tomato trichome regulators is independently verified. These resources provide a roadmap to understand, improve and modify plant metabolites that have valuable medicinal and industrial applications.

INTRODUCTION

Trichomes are diverse specialised structures that vary significantly in morphology and composition both within and among species. These structures play several key roles in plants, including protection against UV light, herbivores and pathogens, temperature regulation and serving as a rich source of phytochemicals (Wagner et al., 2004). Trichomes are classified as either glandular or non‐glandular depending on the presence or absence of a glandular head. Glandular trichomes (GTs) are multicellular structures that have a remarkable capacity to produce both considerable amounts, and a broad diversity, of specialised (secondary) metabolites (Huchelmann et al., 2017; Wagner et al., 2004). The metabolites are synthesised and stored in the heads of the GTs, which are frequently amenable to harvesting. A subset of these specialised metabolites has significant commercial value.

GTs are produced by approximately 30% of vascular plant species (Fahn, 2000). Among these, the GTs of cannabis (Cannabis sativa L.) and its close relative hop (Humulus lupulus L.) have pharmaceutical, agricultural and industrial significance (Fahn, 2000; Kovalchuk et al., 2020). Cannabis produces >100 cannabinoids, of which the psychoactive Δ9‐tetrahydrocannabinol (THC) and the non‐psychoactive but analgesic, and hence medicinally important, cannabidiol (CBD) are dominant. Cannabis produces other classes of compounds such as terpenes and flavonoids that are mainly responsible for the aroma, flavour and colour of cannabis but may also have therapeutic potential (Lowe et al., 2021). Hop has long been used in folk medicine to treat several health problems, including anxiety, gastric problems, fever and toothache (Korpelainen & Pietiläinen, 2021). However, the plant is predominantly valued for the resins (α‐ and β‐acids), essential oils (mainly composed of terpenoids) and prenylflavonoids produced in the GTs of its female flowers (or cones) that are used as ingredients in the brewing industry (Champagne & Boutry, 2017). The terpenoids are responsible for fragrance/aroma, while the acids contribute to the bitterness and microbial stability of beer (Lin et al., 2021).

The development of non‐GTs has been studied extensively, particularly in the model plant species Arabidopsis thaliana (Pattanaik et al., 2014). However, our knowledge of GT development and the specialised metabolic pathways that operate within GTs is rudimentary. Our current understanding of GTs is largely derived from recent studies of species like tomato (Solanum lycopersicum L.), artemisia (Artemisia annua L.) and tobacco (Nicotiana tabacum L.). These studies have primarily focused on identifying and characterising the genes associated with GT initiation and development, alongside the major biosynthetic pathways active within GTs of these species (Huchelmann et al., 2017; Schuurink & Tissier, 2020). Tomato, in particular, has served as a model system to study GTs. This species produces three types of non‐GTs (types II, III, V) and four types of GTs: capitate (types I and IV) and schizogenous (types VI and VII). Tomato GTs differ based on the number of secretory cells in the glandular head and size of the stalk, with the type VI GTs being the major site for terpene production (McDowell et al., 2011; Schilmiller et al., 2009; Schuurink & Tissier, 2020).

Cannabis produces three types of GTs: capitate‐stalked (long‐stalked with a larger globular head), sessile (short‐stalked) and bulbous (small and low), with the capitate‐stalked GTs being the most abundant on female flowers and producing the largest quantity of cannabinoids (Livingston et al., 2021). Cannabis also produces non‐GTs, which are found mainly on stems, leaves, petioles and bracts. Hop produces two types of GTs: peltate GTs called lupulin glands and bulbous GTs, both found on bracteoles of female inflorescence and on leaves (Sugiyama et al., 2006). The lupulin glands have large multicellular heads and are vital in the biosynthesis and storage of the key hop specialised metabolites (Patzak et al., 2015). Hop non‐GTs are abundant on the abaxial surface of the bracts and serve as protective structures (Campos et al., 2023). Both cannabis and hop are predominantly dioecious species with separate male and female plants, and their GTs are mostly concentrated on female inflorescences (Kovalchuk et al., 2020).

Trichomes in cannabis have been poorly studied because licit cultivation and research were restricted until recently, due to the species' classification as an illegal narcotic under the UN Convention on Narcotic Drugs (1961) and consequent strict global prohibition. The recent reclassification by the UN and the subsequent legalisation of cannabis in several countries and jurisdictions for medicinal and recreational use has provided the opportunity for deeper research on this interesting species (Pattnaik et al., 2022). Much work is also still needed to better understand how trichome development and trichome‐specific metabolic pathways are regulated in hop (Patzak et al., 2021). To increase the yield of specialised metabolites in both species, efforts must focus on developing cultivars with improved flower yield and greater trichome number and size. Additionally, it is necessary to enhance the abundance and/or activity of enzymes involved in the major specialised metabolic pathways. To this end, basic research is required to identify and characterise the key genes involved in these processes and understand their regulatory mechanisms with the potential to increase crop performance.

Higher plants are composed of multiple organ and tissue types that arise primarily from spatially and temporally regulated dynamic gene expression patterns (Fasoli et al., 2012; Kamenetsky et al., 2015). Dynamic changes in gene expression patterns are also assumed to be responsible for the broad phenotypic and functional diversity in plants. Consequently, knowledge of gene expression differences between plant tissues/organs is fundamental to understanding their development and function. Analysis of gene expression patterns also enables the identification of organ‐specific promoters. Such promoters are useful in biotechnology to direct gene expression to the organ/tissue of interest, allowing targeted manipulation of the expression levels of key genes while minimising unwanted off‐target effects associated with the use of constitutive promoters (Jeong & Jung, 2015).

Organ‐specific gene expression is achieved through the activity of transcription factors (TFs) (Chen & Rajewsky, 2007). TFs are proteins that bind to specific DNA sequences (motifs) in the promoters of the target genes that they regulate. Individual TFs may bind to tens, hundreds or thousands of genes, potentially influencing the expression of all of them (Chang et al., 2013; Song et al., 2016; Zander et al., 2020). Furthermore, plant genomes encode thousands of TFs, and individual genes are usually regulated by the combinatorial effect of multiple TFs. Consequently, the dynamic gene expression underlying any developmental process or environmental response may involve the co‐ordinated activity of many TFs. The full set of regulatory interactions between TFs and their target genes forms a gene regulatory network (GRN), in which regulators of gene expression (mainly TFs) are connected to target gene nodes by interaction edges (Van den Broeck et al., 2020). Individual TFs have different roles within the network, potentially regulating different subsets of target genes and their associated biological functions (Chang et al., 2013; Song et al., 2016; Zander et al., 2020). In plants, individual organs are complex structures composed of several specialised tissues and cell types, and their development, size, shape and function are often controlled by distinct transcriptional regulatory programmes. The information that GRNs provide about TF–target gene interactions enables the identification of key regulators within the GRNs that, if functionally validated, can be targeted for modification to improve traits of interest.

Although organ‐specific gene expression has been widely used to study organ development and function in plants, few studies have attempted to investigate these processes across different species (Huang & Schiefelbein, 2015; Julca et al., 2021). In this work, we systematically studied transcriptional regulation specific to GTs of the commercially valuable species cannabis and hop. To enable this, we generated and compared transcriptomics data from five main organs (root, stem, leaf, flower and trichomes). In parallel, and as a ‘control’, we generated an equivalent dataset for tomato because it is the best‐understood model for the study of GT function (Schuurink & Tissier, 2020). We next determined whether the genes expressed organ‐specifically were conserved across species by constructing and assessing hierarchical orthogroups (HOGs) that encompassed all three species. We then used these data to identify shared/conserved and species‐specific co‐expression networks and GRNs across all three species, from which we characterised putative regulatory TFs acting in GTs. The networks defined in this study are an essential step towards improving our understanding of how key processes associated with GTs are regulated. Consequently, these networks and underlying transcriptomics datasets are a valuable resource for understanding how TFs function in GTs generally across species, and how they drive the distinct and valuable biology of individual species.

RESULTS

Identification of organ/tissue‐specific genes

Our primary aims were to map organ specificity of gene expression in different GT‐producing species, compare gene expression programmes between these species and determine how these programmes might be involved in species‐specific specialised metabolism. We conducted transcriptome analysis of trichome, flower, leaf, stem and root samples from the GT‐producing species cannabis, hop and tomato (Figure 1a; Figures S1 and S2). We next used the Tau (τ) specificity index (ranging from 0 for ubiquitous to 1 for highly specific expression) to estimate the number of organ/tissue‐specific genes in each species (Kryuchkova‐Mostacci & Robinson‐Rechavi, 2017) (Tables S1–S3). We defined an organ‐specific gene as one meeting two criteria: a τ >0.80 and a mean transcripts per million (TPM) value of >0.5 across the biological replicates of the corresponding organ. As expected, organ‐specific gene expression was a common feature among the studied species, although the number of organ‐specific genes identified varied between species, ranging from 317 (stem) to 1616 (root) in cannabis, 758 (trichome) to 1416 (flower) in hop and 825 (leaf) to 2065 (flower) in tomato (Figure 1b,c; Table S4). This might be due to variation in the composition of tissue‐ and cell‐types and/or ontogenesis of the organs sampled among these plant taxa. Overall, 12.9, 10.1 and 16.8% of the cannabis, hop and tomato genes, respectively, were identified as organ‐specific in this study. TFs, a proportion of which would be involved in the regulation of organ‐specific functions, accounted for 5.9 (hop), 8.3 (cannabis) and 8.4% (tomato) of the organ‐specific transcriptome (Figure 1c; Table S4). These transcriptome data can be visualised in an Electronic Fluorescent Pictograph (eFP) browser for analysis by researchers interested in organ‐specific functions (Figure S3; see Data Availability Statement).
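
The two‐criterion definition above can be illustrated with a minimal sketch. It uses the standard form of the index, τ = Σ(1 − x_i/x_max)/(n − 1) over n organs, and applies it to per‐organ mean TPM values. The function names, the toy expression values and the choice to compute τ on untransformed TPM are assumptions made for illustration; this section does not state which transformation was applied before computing τ.

```python
from typing import Dict, Optional


def tau_index(expr: Dict[str, float]) -> float:
    """Tau specificity index: 0 = ubiquitous, 1 = restricted to one organ."""
    values = list(expr.values())
    n = len(values)
    peak = max(values)
    if n < 2 or peak <= 0:
        return 0.0  # undefined for a single organ or an unexpressed gene
    return sum(1 - v / peak for v in values) / (n - 1)


def organ_specific(expr: Dict[str, float],
                   tau_cutoff: float = 0.80,
                   tpm_cutoff: float = 0.5) -> Optional[str]:
    """Return the organ a gene is specific to, or None if it fails either criterion."""
    top = max(expr, key=expr.get)
    if tau_index(expr) > tau_cutoff and expr[top] > tpm_cutoff:
        return top
    return None


# Hypothetical per-organ mean TPM values (not taken from the study).
ubiquitous = {"trichome": 10.0, "flower": 10.0, "leaf": 10.0, "stem": 10.0, "root": 10.0}
restricted = {"trichome": 50.0, "flower": 2.0, "leaf": 1.0, "stem": 0.5, "root": 0.0}

print(tau_index(ubiquitous), organ_specific(ubiquitous))
print(round(tau_index(restricted), 4), organ_specific(restricted))
```

In this toy case the evenly expressed gene has τ = 0 and is not assigned to any organ, whereas the second gene exceeds both thresholds and is assigned to the trichome.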

Figure 1.

Organ‐specific genes identified by RNA‐Seq.

(a) Schematic of the organ types used for RNA‐seq of cannabis, hop and tomato. Scale bar = 200 μm.

(b) Heatmaps showing expression of organ‐specific genes (red: high expression; blue: low expression) across species. Z‐scores of log2‐normalised transcripts per million (TPM) mapped reads were used to construct the heatmaps.

(c) The number of organ‐specific genes, including transcription factors (TFs), identified in cannabis, hop and tomato.

To assess the quality and robustness of our organ‐specific datasets, we performed cross‐validation using independent public transcriptome data (see Methods; Table S5). Organ‐specific gene expression should substantially correlate across different studies, reflecting conserved biological functions, despite technical and biological variation expected to arise from differences in growth and environmental conditions, developmental stages, tissue sampling methods and sequencing platforms. Cannabis samples had consistently strong correlations across datasets for all organs (mean Spearman ρ ≥0.7; Figures S4 and S5). Tomato datasets exhibited strong correlations for trichomes, stems and flowers (mean Spearman ρ ≥0.7), with moderate correlations for roots and leaves (mean Spearman ρ = 0.5–0.6; Figures S6 and S7). For hop, strong correlations were observed for leaves and trichomes (mean Spearman ρ ≥0.7), whereas no suitable external datasets were available for flower, stems and roots (Figure S8). Overall, these results support the robustness of our organ‐specific expression datasets and validate our approach in delineating organ‐specific genes.
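
As a sketch of the kind of comparison described above, the following computes a Spearman rank correlation between expression values for the same genes in two datasets. It is a plain‐Python illustration rather than the pipeline used in the study; the function names and the toy TPM values are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical TPM values for five genes in one organ, from two studies.
this_study = [120.0, 35.0, 8.0, 0.6, 2.1]
public_data = [300.0, 40.0, 5.0, 1.5, 1.0]

rho = spearman_rho(this_study, public_data)
print(round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to the differences in scale that arise between sequencing platforms and normalisation schemes, which is why it suits cross‐study comparison.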

A small number of trichomes would have been present in the flower, stem and leaf samples. We therefore confirmed that these trichomes did not compromise our identification of GT‐specific genes. For cannabis, we generated a new transcriptome dataset from GT‐depleted flowers and performed a τ analysis comparing gene expression across five organs (GT‐depleted flower, trichome, leaf, stem and root), which identified 895 GT‐specific genes. Of these, 870 (97.2%) genes were shared with the GT‐specific genes we identified from our original τ analysis comparing transcriptomes of intact flower, trichome, leaf, stem and root, demonstrating robust concordance despite the presence of GTs on intact flowers (Figure S9; Table S6). For tomato, we used an independent dataset that compared isolated GTs to GT‐removed leaves, identifying 2258 GT‐upregulated genes (log2FC ≥1) when remapped to the improved tomato genome (SL4.0) (Balcke et al., 2017) (see Methods; Table S7). We then filtered this gene set using our τ‐based organ specificity criteria (τ >0.80 and mean TPM >0.5 across five organs: trichome, flower, leaf, stem and root), identifying 789 (35%) genes meeting both criteria. Of these, 436 (55%) genes overlapped with our independently identified set of 1227 GT‐specific genes (Figure S10a; Table S7). These shared GT‐specific genes were enriched primarily for functional terms related to specialised metabolism, as expected (Figure S10b; Table S7). The remaining 353 τ‐classified genes from the Balcke dataset were specific to other organs: flower (202, 25.6%), root (116, 14.7%), leaf (19, 2.4%) and stem (16, 2.0%) (Figure S10a). This distribution reveals an important methodological distinction: while the differential expression analysis of Balcke et al. identified genes more abundant in GTs than in leaves, our five‐organ τ analysis revealed that many of these genes are predominantly expressed in organs other than GTs and are thus not GT‐specific.
The remaining 1469 Balcke GT‐upregulated genes (65%) did not meet our organ specificity thresholds (τ >0.80 and mean TPM >0.5), likely because they are expressed across multiple organs at moderate levels rather than being highly restricted to one tissue. This comparison demonstrates that differential expression between two tissues identifies genes enriched in one tissue, whereas τ analysis across multiple organs identifies genes with expression predominantly restricted to a single organ, and this is a more stringent criterion for tissue specificity. In hop, we used quantitative reverse transcription‐polymerase chain reaction (qRT‐PCR) to compare the expression of a subset of TFs in isolated trichomes versus trichome‐removed flower samples. The analysis confirmed the candidate TFs were either GT‐ or flower‐specific, supporting our organ‐specific gene expression defined using transcriptome‐based τ analysis (Figure S11). Collectively, our data indicate that the presence of trichomes on the organs did not significantly affect the identification of GT genes.
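
The distinction drawn above between two‐tissue differential expression and multi‐organ τ analysis can be sketched with three hypothetical genes. All are upregulated in GTs relative to leaf (log2FC ≥1), but only one is GT‐specific by the τ criteria. The gene names, expression values and the pseudocount of 1 used in the fold‐change calculation are assumptions made for illustration.

```python
import math
from typing import Dict, Optional, Tuple


def tau_index(expr: Dict[str, float]) -> float:
    """Tau specificity index across organs (0 = ubiquitous, 1 = specific)."""
    peak = max(expr.values())
    if peak <= 0:
        return 0.0
    return sum(1 - v / peak for v in expr.values()) / (len(expr) - 1)


def log2_fc(target: float, reference: float, pseudocount: float = 1.0) -> float:
    """log2 fold change; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((target + pseudocount) / (reference + pseudocount))


def classify(expr: Dict[str, float]) -> Tuple[bool, Optional[str]]:
    """Return (GT-upregulated vs leaf?, organ the gene is specific to or None)."""
    upregulated = log2_fc(expr["trichome"], expr["leaf"]) >= 1.0
    top = max(expr, key=expr.get)
    specific_to = top if tau_index(expr) > 0.80 and expr[top] > 0.5 else None
    return upregulated, specific_to


# Hypothetical mean TPM profiles (not taken from the study).
genes = {
    "broad": {"trichome": 40.0, "flower": 38.0, "leaf": 9.0, "stem": 30.0, "root": 35.0},
    "gt_restricted": {"trichome": 40.0, "flower": 1.0, "leaf": 1.0, "stem": 0.5, "root": 0.0},
    "flower_restricted": {"trichome": 8.0, "flower": 200.0, "leaf": 1.0, "stem": 1.0, "root": 1.0},
}

results = {name: classify(expr) for name, expr in genes.items()}
for name, (up, organ) in results.items():
    print(f"{name}: GT-upregulated={up}, specific_to={organ}")
```

The "broad" profile corresponds to genes that fail the τ threshold, and the "flower_restricted" profile to GT‐upregulated genes that the five‐organ analysis assigns to another organ.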

Functional characteristics of organ‐specific genes

We also used gene ontology (GO) enrichment analysis to support our organ‐specific gene expression analysis, comparing the enriched GO terms to known functions of those organs. The top GO terms in the biological process category were broadly similar across the three species (Figures S12–S14). As expected, the stem‐specific genes were significantly enriched (Benjamini–Hochberg test; adjusted P‐value <0.05) for GO terms associated with cell wall organisation and biogenesis of this elongating tissue. Similarly, biological processes related to stress responses and responses to external stimuli such as hydrogen peroxide metabolic/catabolic processes, peptidyl‐tyrosine modification and cellular oxidant detoxification were enriched amongst the root‐specific genes. Photosynthesis‐associated GO terms were enriched in leaf‐specific genes, which aligns with the leaf's primary role in carbon fixation.
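
For readers unfamiliar with how adjusted P‐values of this kind arise, the following is a minimal sketch of an over‐representation test followed by Benjamini–Hochberg adjustment. The use of a one‐sided hypergeometric test is an assumption (this section names only the multiple‐testing correction), and the GO term counts are invented for illustration.

```python
from math import comb
from typing import Dict, List, Tuple


def hypergeom_sf(k: int, population: int, annotated: int, sample: int) -> float:
    """P(X >= k): chance of drawing at least k annotated genes in the sample."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    hits = sum(comb(annotated, i) * comb(population - annotated, sample - i)
               for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 1000 background genes, 50 organ-specific genes.
# term -> (annotated genes in background, annotated genes in the organ set)
terms: Dict[str, Tuple[int, int]] = {
    "flavonoid biosynthetic process": (20, 8),
    "photosynthesis": (100, 5),
    "cell wall organisation": (60, 3),
}

names = list(terms)
raw = [hypergeom_sf(terms[t][1], 1000, terms[t][0], 50) for t in names]
adjusted = dict(zip(names, benjamini_hochberg(raw)))
for term, p_adj in adjusted.items():
    print(f"{term}: adjusted P = {p_adj:.3g}")
```

Here only the first term is drawn far more often than expected by chance (8 observed against about 1 expected), so it alone remains below an adjusted P‐value of 0.05.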

The genes specifically expressed in GTs were primarily enriched for species‐specific GO terms, likely reflecting functional divergence in GT roles across species (Figures S12–S14). Genes expressed in cannabis GTs were significantly enriched in GO terms associated mainly with ‘flavonoid biosynthetic process’ (P‐adjusted: 2.06e−15), ‘fatty acid metabolic processes’ (P‐adjusted: 3.94e−14), ‘monocarboxylic acid metabolic process’ (P‐adjusted: 9.31e−11) and ‘secondary metabolite biosynthetic process’ (P‐adjusted: 7.47e−10) consistent with the main specialised metabolic processes operating in cannabis GTs (Figure S12) (Gonçalves et al., 2019). Similarly, there were several GO terms overrepresented in hop GT‐specific genes including ‘isoprenoid biosynthetic process’ (P‐adjusted: 4.33e−18), ‘isoprenoid metabolic process’ (P‐adjusted: 3.14e−16), ‘terpenoid biosynthetic process’ (P‐adjusted: 5.76e−14), ‘terpenoid metabolic process’ (P‐adjusted: 1.92e−13) and ‘amino acid catabolic processes’ (P‐adjusted: 1.52e−10) (Figure S13). Terpenes and bitter acids are the major specialised metabolites in hop GTs, and bitter acids are synthesised from precursors derived from the degradation of branched‐chain amino acids (Clark et al., 2013; Xu et al., 2013).

As expected, the genes expressed in tomato GTs were significantly enriched (Benjamini–Hochberg test; P‐adjusted <0.05) for GO terms associated with fatty acid, flavonoid and isoprenoid biosynthetic processes (Figure S14). Interestingly, the tomato GT‐specific genes were also enriched for photosynthesis and light harvesting‐related GO terms (Benjamini–Hochberg test; P‐adjusted <0.05). This observation supports previous reports that tomato GTs are photosynthetically active, with the energy and reducing power (ATP and NADPH) generated by photosynthesis in the GTs predominantly fuelling the high‐flux specialised metabolite biosynthesis (Balcke et al., 2017; Saadat et al., 2023). This enrichment initially appeared counterintuitive, given that our comparison included photosynthetically active leaves. We hypothesised that the mature leaf tissue we sampled would have reduced photosynthetic gene expression relative to GTs. To test this, we performed differential expression analysis comparing mature leaves and young leaves from plants of the same developmental stage. This analysis identified 4238 differentially expressed genes (|log2 fold change| ≥1, Benjamini–Hochberg adjusted P‐value <0.05), revealing a clear developmental transition between leaf stages (Figure S15; Table S8). Genes upregulated in mature leaves (1251 genes, 29.5%) were enriched for ageing and defence‐related processes, including peptidyl‐tyrosine modification, ethylene signalling and salicylic acid metabolism, indicating a developmental shift toward senescence (Huang et al., 2003; Wang, Dai, et al., 2021). Conversely, genes downregulated in mature leaves (2987 genes, 70.5%) were enriched for active growth processes, including cell cycle progression, cell division, photosynthesis and cell wall biogenesis, functions characteristic of actively growing young leaves. 
We also used τ analysis to identify GT‐specific genes by independently comparing the transcriptomes of mature leaves or young leaves to those of GT, flower, stem and root (see Methods). Our τ analysis involving young leaves identified 760 GT‐specific genes, with substantial overlap (739 genes; 97.2%) with those identified with the analysis using mature leaves (Figure S16a; Table S9). The shared GT‐gene set was enriched for primary and specialised metabolism, as expected (Figure S16b). The τ analysis involving mature leaves identified 488 additional GT‐specific genes; these were highly expressed in GTs but failed to meet the τ threshold when analysed using young leaves because of their elevated expression in young leaf tissue. Notably, photosynthesis‐related GO terms were exclusively enriched among GT‐specific genes identified using the mature leaf transcriptome, indicating that the mature leaves sampled had reduced photosynthetic activity (Figures S16a and S17). Consequently, photosynthesis gene expression in GTs exceeded that in developmentally older leaves, resulting in their classification as GT‐specific. This finding reflects biological differences in photosynthetic capacity across leaf developmental stages rather than contamination during trichome isolation.

Organ‐specific gene expression is partially conserved across species

Organ‐specific gene expression was a common feature across all three species analysed (Figure 1b,c). To determine the conservation of organ‐specific gene expression across the three species, we first constructed HOGs (see Methods). A HOG represents a set of protein‐coding genes in the studied species that have descended from a single gene in the last common ancestor (Emms & Kelly, 2019). Genes within a HOG tend to retain similar functions in different species (Gabaldón & Koonin, 2013). Consequently, the identification of HOGs provides a useful measure of the degree to which organ‐specific gene expression is conserved across species, enabling us to predict the conservation or divergence of biological processes (Julca et al., 2022). Overall, we found that organ‐specific gene expression was both species‐specific and conserved across species. We identified a total of 20 189 HOGs (Table S10). Within the HOGs, we found that cannabis had the largest percentage of shared genes (protein‐coding) across two or more species (22 878; 90.4%), followed by hop (23 340; 65.6%) and tomato (19 291; 56.6%) (Figure S18a). On the other hand, tomato had the highest percentage of species‐specific genes (14 784; 43.4%), followed by hop (12 242; 34.4%) and cannabis (2418; 9.6%) (Figure S18b). The greater percentage of species‐specific genes in tomato reflects its more distant phylogenetic relationship to cannabis and hop.

Next, we investigated whether the observed organ‐specific gene expression patterns, as identified in the previous section, were conserved across all three species or were unique to a single species (Figure 1c; Tables S10 and S11). Cannabis expressed the highest percentage of organ‐specific, protein‐coding genes shared by two or more species (3310; 86.1%), followed by hop (3744; 64.6%) and then tomato (3620; 54.1%) (Figure S18c). We also found that tomato had the largest percentage of genes expressed organ‐specifically uniquely in that species (3066; 45.8%), followed by hop (924; 15.9%) and cannabis (250; 6.5%) (Figure S18d). The root exhibited the highest percentage of genes with organ‐specific expression shared across species (3206; 70.3%), while flowers had the highest percentage of genes expressed species‐specifically (1297; 32.4%) (Table S11).

GT‐expressed genes have features shared across species

While plant specialised metabolic pathways often share common precursors and enzyme families, the resulting pathways and metabolites are often species‐specific, suggesting GT‐expressed genes can contribute to both species‐specific and conserved aspects of GT function (Chezem & Clay, 2016; Ono & Murata, 2023). To test this possibility, we investigated the GT‐specifically expressed genes in more detail (Figure 2; Table S12). We identified 2593 protein‐coding GT‐specific genes that were either shared or species‐specific within the HOGs. Of these, 1308 (50.4%) were shared across the three species (Figure 2a). Many GT‐specific genes were also shared between cannabis and hop (503; 19.4%), more than the number shared between cannabis and tomato (23; 0.9%) or hop and tomato (66; 2.5%) (Figure 2a). Additionally, 500 (19.3%) GT‐specific genes were unique to tomato, which is notably higher than the number of GT genes specific to cannabis (78; 3.0%) or hop (115; 4.4%) (Figure 2a). A similar result was obtained for all the remaining organs including flower, leaf, root and stem (Table S12). The number of shared organ‐specific genes reflected phylogenetic distances between the studied species, as would be expected.
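
The partition into shared and species‐specific sets can be sketched as a tally of exclusive species combinations, the quantity an UpSet plot displays. The first part below uses invented HOG identifiers and counts HOGs rather than genes, purely for illustration; the second part recomputes the percentages from the gene counts reported in the paragraph above.

```python
from collections import Counter
from typing import Dict, FrozenSet, Set


def exclusive_intersections(hog_species: Dict[str, Set[str]]) -> Counter:
    """Count HOGs by the exact set of species with a GT-specific member."""
    tally: Counter = Counter()
    for species in hog_species.values():
        tally[frozenset(species)] += 1
    return tally


# Hypothetical HOG identifiers and memberships (not taken from the study).
toy_hogs = {
    "HOG0001": {"cannabis", "hop", "tomato"},
    "HOG0002": {"cannabis", "hop"},
    "HOG0003": {"tomato"},
    "HOG0004": {"cannabis", "hop", "tomato"},
}
toy_tally = exclusive_intersections(toy_hogs)

# GT-specific gene counts as reported in the text (Figure 2a).
reported: Dict[FrozenSet[str], int] = {
    frozenset({"cannabis", "hop", "tomato"}): 1308,
    frozenset({"cannabis", "hop"}): 503,
    frozenset({"cannabis", "tomato"}): 23,
    frozenset({"hop", "tomato"}): 66,
    frozenset({"tomato"}): 500,
    frozenset({"cannabis"}): 78,
    frozenset({"hop"}): 115,
}
total = sum(reported.values())
percentages = {combo: round(100 * n / total, 1) for combo, n in reported.items()}

for combo, pct in percentages.items():
    print(" & ".join(sorted(combo)), f"{reported[combo]} ({pct}%)")
print("total:", total)
```

The seven reported categories sum to the stated total of 2593 genes, and each recomputed percentage agrees with the value given in the text.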

Figure 2.

Functional annotation of conserved and species‐specific glandular trichome (GT)‐expressed genes.

(a) UpSet plot showing the number of conserved and species‐specific GT genes.

(b) The top significantly enriched gene ontology (GO) terms (biological processes category; adjusted P‐value <0.05) associated with the conserved and species‐specific GT genes. Dot colour and size represent the significance (−log10 adjusted P‐value) and GeneRatio (proportion of genes in the enriched GO term out of the total annotated genes), respectively. For gene sets including Cannabis (‘Cannabis, Hop & Tomato’, ‘Cannabis & Hop’ and ‘Cannabis & Tomato’), GO terms were based on cannabis genes. For the ‘Hop & Tomato’ set, GO terms were based on hop genes. E4P, erythrose 4‐phosphate; PEP, phosphoenolpyruvate.

We examined the functions of the GT‐specific genes that were shared across the three species by analysing their enriched GO terms (Figure 2b; Table S11). The shared GT‐specific genes were significantly enriched for GO terms related to both specialised (flavonoids, phenylpropanoid and secondary metabolic/biosynthetic processes) and primary (fatty acid metabolic/biosynthetic processes and monocarboxylic acid metabolic processes) metabolism (Benjamini–Hochberg test; adjusted P‐value <0.05) (Figure 2b). Notably, this group contained several genes with known GT‐related roles in tomato, including two chalcone synthases (Solyc09g091510.3.1/SlCHS1 and Solyc05g053550.3.1/SlCHS2), a chalcone flavanone isomerase (Solyc05g052240.3.1/SlCHIL) and an anthocyanidin 3‐O‐glucosyltransferase (Solyc09g059170.3.1/An3GT) involved in flavonoid biosynthesis, a process that has a role in modulating terpene biosynthesis and providing antioxidant protection (Balcke et al., 2017; Kang et al., 2014; Lv et al., 2022; O'Neill et al., 1990; Sugimoto et al., 2022; Tohge et al., 2017; Wang, Li, et al., 2021). A number of putative flavonoid biosynthesis cannabis genes including LOC115702709 (flavanone 3‐dioxygenase‐like), LOC115709313 (flavonoid 3′‐monooxygenase), LOC115709933 (flavonoid 3′‐monooxygenase‐like), LOC115712997 (naringenin,2‐oxoglutarate 3‐dioxygenase‐like) and LOC115724170 (naringenin‐chalcone synthase) were also identified (Table S5) (Gagalova et al., 2024). Additionally, several chalcone isomerase‐like genes from hop (LOC133817220, LOC133817221, LOC133832713 and LOC133798149) were present. Flavonoid biosynthesis is conserved in plants including in cannabis, hop and tomato GTs, and this process modulates other processes, including terpene biosynthesis in tomato (Champagne & Boutry, 2017; Davies et al., 2024; Kang et al., 2014; Sugimoto et al., 2022).

The shared GT genes also included several putative ABC transport family genes implicated in solute transport across cannabis, hop and tomato (Table S10). Both Solyc08g076720.4.1 and LOC115716265 were implicated in the transport of specialised metabolites, and we have recently shown that LOC115716265 is predominantly expressed in GTs and that this expression is associated with the GT‐enriched transcription‐associated histone marks H3K4me3 and H3K56ac (Conneely et al., 2024; Lashbrooke et al., 2015; Paul et al., 2016; Petit et al., 2016). TFs with known roles in GT biology, such as Solyc03g098200.4.1 (HD‐ZIP IV/SlHD8, a mediator of JA‐induced trichome elongation), Solyc09g008810.3.1 (HD‐ZIP I/II, a homologue of the cucumber homeodomain–leucine zipper I gene GLABROUS 1, CsGL1, that is required for trichome formation) and Solyc10g005330.3.1 (an HD‐ZIP IV protein MERISTEM L1 and homologue of the Arabidopsis MERISTEM LAYER 1, AtML1, that is involved in trichome differentiation) were among the shared GT genes (Filippis et al., 2013; Hua, Chang, Xu, et al., 2021; Li et al., 2015; Nakamura et al., 2006). This group also contained several genes involved in lipid metabolism including two glycerol‐3‐phosphate acyltransferases (Solyc01g094700.4.1/SlGPAT4 and Solyc09g014350.3.1/SlGPAT6) and a fatty acyl omega‐hydroxylase (Solyc01g094750.4.1/CYP86A68) involved in cutin biosynthesis and deposition and epidermal cell formation, suggesting their role in the initiation and morphogenesis of trichomes.

Given the close phylogenetic relationship between cannabis and hop, we next examined the GT‐specific genes and related processes that were shared between these two species. GO enrichment analysis identified genes that were significantly associated mainly with terms involved in primary (aminoglycan, chitin, amino sugar, glucosamine‐containing compounds and polysaccharide) metabolic and catabolic processes, suggesting the possible role of these GTs in plant defence responses (Benjamini–Hochberg test; P‐adjusted <0.05) (Figure 2b). The GT genes shared between the two species included, amongst others, a putative chitinase 5‐like gene (LOC115698289), a class V chitinase‐like gene (LOC115724844), an endochitinase EP3‐like gene (LOC115708023) and two putative acidic endochitinase‐like genes (LOC115717308 and LOC115717353). Several putative pectinesterase/like genes (LOC115706710, LOC115707063, LOC115704833, LOC115706869 and LOC115708030), which are likely involved in cell wall modification‐related processes, were also present. Additionally, we identified putative shikimate O‐hydroxycinnamoyltransferase‐like genes (LOC115704640, LOC115707961, LOC115707958, LOC115719465, LOC115721021 and LOC115721427) likely associated with the phenylpropanoid metabolic pathway (Vogt, 2010). Interestingly, the GT genes shared between the two species also included several putative secondary metabolism genes including branched‐chain amino acid aminotransferase 1 genes (LOC115715790, LOC115716705, LOC133834633 and LOC115712735) with predicted roles in amino acid metabolism and prenyltransferase‐like genes (LOC115713171, LOC115713185, LOC115713205, LOC115713148, LOC115707809, LOC115722991 and LOC115713215), of which LOC115713185 is a functional cannabigerolic acid synthase associated with the cannabinoid biosynthetic pathway (Eriksen et al., 2021; Luo et al., 2019). Our observations suggest that while genes for the biosynthesis of secondary metabolites, such as cannabinoids in cannabis and bitter acids in hop, are largely conserved between the two species, some of these genes may have evolved distinct functions, leading to different metabolite profiles (Padgitt‐Cobb et al., 2023; van Velzen & Schranz, 2021).

We next identified which GT processes were broadly conserved by investigating functions of the cannabis and hop GT genes that were shared with the more distantly related species tomato. The GT genes that were conserved between cannabis and tomato were significantly enriched in GO terms only for fatty acid biosynthetic process (P‐adjusted <0.03), monocarboxylic acid biosynthetic process (P‐adjusted <0.03), fatty acid metabolic process (P‐adjusted <0.03), unsaturated fatty acid biosynthetic process (P‐adjusted <0.03) and unsaturated fatty acid metabolic process (P‐adjusted <0.03) (Figure 2b). This group included several genes with predicted roles in lipid metabolism both in cannabis (LOC115719000 and LOC115719329, delta(12)‐fatty‐acid desaturase FAD2‐like genes; LOC115719251, delta(12)‐oleate desaturase‐like) and tomato (Solyc06g054685.1.1, Acyl‐[acyl‐carrier‐protein] desaturase). Similarly, the GT‐specific genes shared between hop and tomato were significantly enriched (P‐adjusted <0.05) in GO terms associated with polysaccharide (xyloglucan, hemicellulose and glucan) metabolic processes and cell wall organisation or biogenesis (Figure 2b).

The species‐specific GT genes reflected GT functions specific to each species

To better understand the species‐specific processes that the GT genes are involved in, we examined more closely the species‐specific GT genes (identified via HOG analysis; Table S11). The cannabis GT‐specific genes were significantly enriched with GO terms linked to specialised metabolism, benzene‐ and phenol‐containing compound metabolic process and polyketide biosynthetic/metabolic process (Benjamini–Hochberg test; P‐adjusted <0.05), consistent with the major biosynthetic pathways operating in cannabis (Figure 2b). Cannabinoids are prenylated polyketides, meaning they are synthesised through a pathway involving both polyketide and isoprenoid precursors (Stout et al., 2012). As expected, this group contained two putative polyketide synthase 4 genes (LOC115699293 and LOC115700696), two copies of the predicted olivetol synthase 1 gene and two putative olivetolic acid cyclase genes (LOC115723437 and LOC115723438) (Grassa et al., 2021; Innes & Vergara, 2023).

Among the most significantly enriched GO terms associated with the hop GT‐specific genes were ‘isoprenoid biosynthetic process’ (P‐adjusted: 3.04e−15), ‘isoprenoid metabolic process’ (P‐adjusted: 1.99e−14), ‘L‐phenylalanine catabolic process’ (P‐adjusted: 3.22e−13), ‘aromatic amino acid family catabolic process’ (P‐adjusted: 2.67e−12) and ‘L‐amino acid catabolic process’ (P‐adjusted: 7.86e−12) (Figure 2b). This is consistent with the fact that terpenoids, phenolic compounds and bitter acids are the dominant compounds produced in hop GTs (Champagne & Boutry, 2017; Goese et al., 1999). Several putative terpenoid biosynthesis genes including alpha‐humulene synthases (LOC133781387, LOC133781390 and LOC133781637), Germacrene‐A synthases (LOC133781388 and LOC133781391) and putative Isopentenyl‐diphosphate Delta‐isomerase 1 genes (LOC133779814, LOC133824175, LOC133825509, LOC133829026, LOC133829202 and LOC133831140), and putative phenolic compounds biosynthesis genes including phenylalanine ammonia‐lyase genes (LOC133801588, LOC133801590, LOC133801594, LOC133801595, LOC133801597, LOC133801598, LOC133801599 and LOC133801601) were identified.

The tomato GT‐specific genes were significantly enriched with GO terms associated with regulation and/or negative regulation of peptidase, endopeptidase and hydrolase activities and isoprenoid biosynthetic/metabolic processes, suggesting their possible roles in responses to external stimuli and in specialised metabolism (Benjamini–Hochberg test; P‐adjusted <0.05) (Figure 2b). These genes included several putative proteinase inhibitors (Solyc09g083435.1.1, Solyc09g084440.2.1, Solyc09g084470.3.1, Solyc09g084490.4.1, Solyc09g089490.4.1, Solyc09g089500.3.1, Solyc09g089530.3.1, Solyc09g089540.4.1, Solyc03g020030.3.1, Solyc03g020040.3.1, Solyc11g021020.1.1 and Solyc11g021060.2.1) with predicted roles in plant defence/stress responses (Barba‐Espín et al., 2025; Meng et al., 2022; Morales et al., 2022). Many putative terpene biosynthesis genes including Solyc01g105890.3.1 (Monoterpene synthase 1), Solyc01g101170.4.1 (Viridiflorene synthase), Solyc07g008690.3.1 (Sesquiterpene synthase), Solyc08g005677.1.1 (Terpene synthase) and Solyc04g056390.3.1 (Isopentenyl‐diphosphate Delta‐isomerase) were also present. TFs known to regulate several key processes in tomato including terpene biosynthesis (Solyc02g080260/WOOLLY) and flavonoid metabolism (Solyc05g013540.1.1, Solyc06g064500.3.1/FOMT3 and Solyc06g083450.4.1/FOMT1) were identified (Ewas et al., 2016; Hua, Chang, Xu, et al., 2021; Mandal et al., 2022; Spyropoulou et al., 2014; Tohge et al., 2020). We further determined that 23.8, 38.8 and 5.5% of the GT‐expressed cannabis, hop and tomato genes, respectively, have no functional annotations (Tables S1–S3). These are interesting targets for further studies that aim to uncover genes with potential new roles in GT‐related functions.

Trichome‐specific gene networks have species‐specific features associated with specialised metabolism

We next determined how gene expression may be regulated in GTs and identified putative regulators of GT‐related processes. GT‐specifically expressed TFs are likely to be key regulators of GT biology. Our GT‐specific transcriptome across all three species contained 241 TFs (60 cannabis, 69 hop and 112 tomato GT‐specific TFs) from 35 TF families (Figures 1c and 3a; Table S4). We assessed the conservation or species‐specificity of the TF families by performing TF family enrichment analysis and identifying overrepresented families expressed within each species, also determining whether enrichment patterns differed significantly among species (see Methods). The HB‐HD‐ZIP, MYB and SRS TF families were significantly enriched in cannabis GTs, while hop GTs were significantly enriched for the AP2/ERF‐AP2 and WRKY TF families (hypergeometric test; P‐value <0.05) (Figure 3a; Table S13). In contrast, tomato GTs exhibited the broadest TF family diversity, with five enriched families: GRF, HB‐WOX, MYB, OFP and SRS (hypergeometric test; P‐value <0.05) (Figure 3a).
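The within‐species TF family enrichment described above can be illustrated with a minimal sketch of a one‐sided hypergeometric test. The function name and all counts below are hypothetical and are not taken from the paper; the exact background set used by the authors is described in their Methods and is only assumed here to be the set of all annotated TFs in a genome.

```python
from math import comb

def hypergeom_enrichment_p(k, n_drawn, n_family, n_total):
    """Upper-tail hypergeometric probability P(X >= k).

    n_total  : all annotated TFs in the genome (assumed background)
    n_family : TFs in the background that belong to the family of interest
    n_drawn  : GT-specific TFs (the "sample")
    k        : GT-specific TFs that belong to the family
    """
    upper = min(n_drawn, n_family)
    denom = comb(n_total, n_drawn)
    numer = sum(
        comb(n_family, i) * comb(n_total - n_family, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return numer / denom

# Hypothetical counts for one TF family in one species (illustration only)
p = hypergeom_enrichment_p(k=8, n_drawn=60, n_family=120, n_total=1800)
print(f"enrichment P-value = {p:.3g}")
```

A family would be called enriched when this P‐value falls below the chosen threshold (0.05 in the text).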

Figure 3.

The distribution and conservation of transcription factors (TFs) associated with glandular trichome (GT)‐specific genes.

(a) Significantly enriched TF families within the three species' GTs (hypergeometric test; P‐value <0.05). The size and colour of each circle represent TF count per family and the enrichment P‐value range (−log10 of the P‐value), respectively.

(b) The number of TFs belonging to different TF families across each of the three species.

(c) The conservation levels of members within each TF family across cannabis, hop and tomato determined by hierarchical orthogroup analysis.

Most enriched TF families represented conserved regulatory programmes rather than species‐specific ones, as determined using pairwise comparisons between species (Table S14). MYB, HB‐HD‐ZIP and SRS families were consistently enriched in both cannabis and tomato without significant inter‐species differences (Fisher's exact test, FDR >0.05), suggesting these families have conserved roles in GT development and function across these lineages. However, enrichment of the AP2/ERF‐ERF family was significantly higher in hop compared with tomato (Fisher's exact test, FDR = 0.015; Table S14), potentially indicating a hop‐specific regulatory mechanism. While WRKY enrichment appeared hop‐specific in the within‐species analysis, this difference did not reach statistical significance in pairwise comparisons after correction for multiple testing (Fisher's exact test, FDR >0.05; Table S14).
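As a rough illustration of the pairwise comparison described above, the sketch below implements a two‐sided Fisher's exact test on a 2×2 table and a Benjamini–Hochberg adjustment across several families. The table layout (family versus non‐family GT TFs in two species) and every count are assumptions made for illustration, not values from Table S14.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of a table with x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted P-values (FDR) in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: (family TFs, other TFs) among GT TFs of species A and B
tables = {
    "family_1": (10, 59, 4, 108),
    "family_2": (7, 62, 9, 103),
    "family_3": (3, 66, 6, 106),
}
raw = [fisher_exact_two_sided(*t) for t in tables.values()]
for name, p, q in zip(tables, raw, benjamini_hochberg(raw)):
    print(f"{name}: P = {p:.3g}, FDR = {q:.3g}")
```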

Overall, the diversity of TF families in the tomato GT transcriptome (28) was broader than in both cannabis (21) and hop (15) (Figure 3a; Table S13). Nine TF families represented in the cannabis GTs (bZIP, C2C2‐GATA, DBB, FAR1, GARP‐G2‐like, LOB, OFP, PLATZ and TCP) were missing from the hop GTs, and five TF families present in cannabis GTs (AP2/ERF‐AP2, DBB, FAR1, MADS‐MIKC and PLATZ) were absent from the tomato GTs. Although these TF families are missing from the hop and tomato GT transcriptomes, they are all present and annotated in the hop and tomato genomes, respectively (Tables S15 and S16). We also identified three TF families predicted to regulate hop GT‐specific genes (C3H, GRAS and MADS‐M‐type) that were absent from the cannabis GTs but were present in the cannabis genome (Figure 3a; Table S17). This indicates that the difference in TF family composition between the three species' GT transcriptomes principally reflects recruitment of different TF families across species, and not a genome annotation artefact.

Plant TF families are typically large; their differential expansion or contraction across species leads to varying family sizes and potentially different regulatory roles (Shiu et al., 2005). We used HOG analysis to determine how well the members of a given TF family are conserved across the GT transcriptomes of the three species. The MYB, AP2/ERF‐ERF, WRKY, bHLH, HB‐HD‐ZIP and C2H2 TFs were dominant across the GTs of the three species (Figure 3b). Most members of these TF families were conserved across the three species, with AP2/ERF‐ERF (16), WRKY (15), bHLH (15) and MYB (14) family TFs having the most HOGs (Figure 3c). Twenty‐eight HOGs were shared between cannabis and hop, while tomato had the most species‐specific TFs, reflecting its phylogenetic distance from the two species (Figure 3c).

We determined which GT‐expressed TFs may be most influential in regulating GT biology using weighted gene co‐expression network analysis (WGCNA) and GRN analysis (Langfelder & Horvath, 2008). We first applied WGCNA to the organ‐specific genes, identifying organ‐related gene co‐expression modules. A total of 15 118 cannabis, 17 270 hop and 18 869 tomato genes were assigned to 10 (cannabis), 14 (hop) and 14 (tomato) co‐expression modules of varying sizes (Figure 4a; Tables S18–S20). In cannabis, three modules (darkmagenta, midnightblue and grey60) were enriched for GT/flower genes as defined by τ analysis. The largest module, darkmagenta (2082 genes), contained 1253 (60.2%) genes that were highly expressed in the GTs, with 601 (48.0%) being GT‐specific (Table S18). Forty‐six (7.7%) of these GT‐specific genes were TFs, predominantly from the MYB (12), AP2/ERF‐ERF (5), B3 (5), bHLH (4), HB‐HD‐ZIP (3), SRS (3) and MADS‐MIKC (3) families. Similarly, the most trichome‐enriched module in hop (darkorange; 3147 genes) contained 1221 (38.8%) genes showing the highest expression in GTs, with 436 (35.7%) being GT‐specific (Table S19). Of these, 32 (7.3%) were TFs, mainly from the AP2/ERF‐ERF/AP2 (8), WRKY (7) and MADS‐MIKC (5) families, which together accounted for more than half of the GT‐specific TFs in this module. In tomato, the largest trichome‐enriched module (brown; 2469 genes) contained 1843 (74.6%) GT genes, of which 723 (39.2%) were GT‐specific. Sixty‐seven (9.3%) of these GT‐specific genes were TFs, largely from the MYB (10), bHLH (6), AP2/ERF‐ERF (6), C2H2 (5), WRKY (4) and SRS (3) families (Table S20). It is worth noting that these GT‐enriched modules comprised more than half (darkmagenta: 67.2%; darkorange: 57.5%; brown: 58.9%) of the GT‐specific genes identified in each species.
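The nested percentages reported for the three GT‐enriched modules can be recomputed directly from the counts given in the text. The short sketch below does this; the dictionary keys and helper name are illustrative only.

```python
# Counts as reported in the text for the main GT-enriched module per species
modules = {
    "cannabis darkmagenta": {"module": 2082, "gt_high": 1253, "gt_specific": 601, "tfs": 46},
    "hop darkorange":       {"module": 3147, "gt_high": 1221, "gt_specific": 436, "tfs": 32},
    "tomato brown":         {"module": 2469, "gt_high": 1843, "gt_specific": 723, "tfs": 67},
}

def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

summary = {
    name: (
        pct(c["gt_high"], c["module"]),       # GT-expressed genes within the module
        pct(c["gt_specific"], c["gt_high"]),  # GT-specific genes among those
        pct(c["tfs"], c["gt_specific"]),      # TFs among the GT-specific genes
    )
    for name, c in modules.items()
}

for name, values in summary.items():
    print(name, values)
```

Each percentage is taken relative to the preceding subset, which is why the three values for a module do not share a common denominator.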

Figure 4.

Weighted gene co‐expression network analysis (WGCNA) and glandular trichome (GT) gene regulatory networks.

(a) Organ‐specific expression patterns of co‐expression modules identified by WGCNA across all three species. Eigengene expression (ME) of co‐expression modules averaged by organ is shown. Module colour and gene number are shown on the left. Asterisk (*) indicates ANOVA P‐value <0.05 across organs (ME~organ), indicating significant organ‐specific expression patterns.

(b) The top significantly enriched gene ontology (GO) terms (biological processes category; adjusted P‐value <0.05) associated with the main GT modules genes across the three species. Dot colour and size represent the significance and GeneRatio (proportion of genes in the enriched GO term out of the total annotated genes), respectively.

(c) Cannabis, (d) hop and (e) tomato GTs gene regulatory networks. Transcription factors (TFs) and their target genes are represented by coloured and black nodes, respectively. Grey lines indicate the interactions (edges), and TF node size is proportional to the number of interactions. For cannabis and hop gene regulatory networks (GRNs), an edge weight cut‐off of 0.8 was applied and nodes having fewer than two interactions were excluded for ease of visualisation. For the tomato GRN, an edge weight cut‐off of 0.6 was applied and nodes having fewer than 15 interactions were excluded for ease of visualisation.

Overall, the TF composition within the key GT‐specific co‐expression modules reflected the TF profile of the GT‐specific transcriptome across all three species, suggesting both common and species‐specific regulatory mechanisms. Consistent with this, there was strong conservation of specialised metabolic pathways across the three species, determined using GO enrichment analysis of the GT‐specific module genes, with terpenoid/isoprenoid biosynthesis prominently enriched in both cannabis and hop (Figure 4b). The brown module in tomato was notably enriched for defence and stress response pathways alongside chromatin organisation, suggesting epigenetic regulation of specialised metabolism with a strong emphasis on plant defence compounds (Figure 4b).

While WGCNA identifies co‐expression modules based on correlated gene expression patterns, GRNs predict directional regulatory relationships by modelling how TFs control the expression of their downstream target genes (Langfelder & Horvath, 2008). These regulatory interactions are fundamental to understanding gene function, as differences in GRN architecture and activity underpin cell type‐, tissue‐ and organ‐specific functions and developmental programmes (Ó'Maoiléidigh et al., 2014). We separately constructed GT‐specific GRNs for cannabis, hop and tomato using SCION v3.0, then assessed the GRNs both individually and comparatively (Clark et al., 2021) (Tables S21–S23). The GRN of cannabis GTs comprised 60 TFs and 895 target genes, connected by a total of 23 316 edges (i.e. TF–target interactions) (Figure 4c; Table S21). The hop GT GRN consisted of 69 TFs and 758 target genes, with a total of 24 808 TF–target interactions (Figure 4d; Table S22). The tomato GT GRN was the largest with 112 TFs, 1227 target genes and 80 037 edges (Figure 4e; Table S23).

We evaluated TF importance within the GRNs by performing network motif analysis, which identifies recurrent subnetworks and calculates network motif scores (NMS) as a proxy for the influence each gene has on its neighbours (Clark et al., 2019; Milo et al., 2002). Genes with the highest NMS were considered functionally the most important. The top 10% of genes ranked by NMS contained 100, 100 and 86.6% of all TFs in the cannabis, hop and tomato GT GRNs, respectively (Figure 5a–c; Tables S24–S26). Since NMS analysis assesses network topology independently of gene function, this TF enrichment supports our GRN inference and suggests that the predicted regulatory architecture reflects genuine biological regulation. The most influential TFs by NMS belonged to distinct family combinations in each species: FAR1, WRKY, HB‐HD‐ZIP, MYB, SRS, MADS‐MIKC and GARP‐G2‐like in cannabis; MADS‐MIKC, WRKY, NAC, MYB, AP2/ERF‐ERF and C2H2 in hop; and HB‐HD‐ZIP, bZIP, C2C2‐Dof, C2C2‐CO‐like, AP2/ERF‐ERF, TCP and GARP‐G2‐like in tomato. TF NMS correlated strongly with the number of TF–target interactions across all three GRNs (Pearson R² >0.8; P <0.001) (Figure 5d–f).
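The relationship between a TF's NMS and its number of TF–target interactions can be sketched as a Pearson correlation between per‐TF scores and out‐degree in the edge list. The edge list and NMS values below are invented for illustration; the real scores come from the network motif analysis reported in Tables S24–S26.

```python
from collections import Counter
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical directed edges (TF -> target) and hypothetical NMS values
edges = [
    ("TF_A", "g1"), ("TF_A", "g2"), ("TF_A", "g3"), ("TF_A", "g4"),
    ("TF_B", "g1"), ("TF_B", "g2"),
    ("TF_C", "g3"),
]
nms = {"TF_A": 8.1, "TF_B": 3.9, "TF_C": 2.2}

# Number of TF-target interactions per TF (out-degree)
out_degree = Counter(tf for tf, _ in edges)

tfs = sorted(nms)
r = pearson_r([out_degree[tf] for tf in tfs], [nms[tf] for tf in tfs])
r_squared = r ** 2
print(f"R^2 = {r_squared:.3f}")
```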

Figure 5.

Transcription factors (TFs) associated with glandular trichome (GT)‐specific gene regulatory networks (GRNs).

(a) Ranking of genes associated with cannabis, (b) hop and (c) tomato GT GRNs based on their network motif score (NMS). The top 35 genes (all of which are TFs) are shown. TFs are colour‐coded according to TF families for ease of visualisation. The correlation between NMS and number of TF–target interactions for all TFs included in the (d) cannabis, (e) hop and (f) tomato GT GRNs.

TFs are known to regulate target genes by binding to specific sequence motifs in their promoter regions (O'Malley et al., 2016; Weirauch et al., 2014). To validate the GRNs, we performed motif enrichment analysis on target gene promoters (2‐kb upstream of TSS), identifying 346, 323 and 319 significantly enriched motifs (E‐value ≤0.05) corresponding to 31, 32 and 34 TF families in cannabis, hop and tomato, respectively (Figure S19a; Tables S27–S29). Enriched motifs included binding sites for several GT‐specific TFs identified using Tau and iTAK annotation (see Methods); these included the AP2/ERF, MYB, bZIP, bHLH and C2C2‐Dof TFs, families known to regulate specialised metabolism and associated cell development in GTs (Figure S19b–d) (Chezem & Clay, 2016; Cao et al., 2020; Li et al., 2022). Together, these findings suggest that both shared and species‐specific TF families regulate GT gene expression across the three species.

Species‐specific GT processes can be examined using species‐specific GRN subnetworks

A subset of TFs and targets were unique to GT GRNs of individual species, as determined by comparing GRNs with the previous HOG analysis, suggesting species‐specific regulatory innovations. We therefore extracted species‐specific subnetworks containing only genes lacking orthologs in the other two species to identify putative regulators of unique GT processes (Figure S20; Tables S30–S32). Overall, species‐specific target genes were predominantly associated with specialised biosynthetic pathways characteristic of each species' GT metabolite profile.
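Extracting a species‐specific subnetwork, as described above, amounts to filtering the GRN edge list to interactions in which both the TF and the target lack orthologs in the other species. The sketch below assumes the GRN is available as (TF, target) pairs and that the set of species‐specific genes has already been derived from the HOG analysis; the gene names are placeholders.

```python
def extract_subnetwork(edges, keep_genes):
    """Keep only edges whose TF and target are both in `keep_genes`.

    Returns the retained edges, the set of TFs with at least one retained
    edge, and the set of non-TF targets.
    """
    keep = set(keep_genes)
    sub_edges = [(tf, tgt) for tf, tgt in edges if tf in keep and tgt in keep]
    tfs = {tf for tf, _ in sub_edges}
    # A TF regulated by another TF is counted as a TF, not as a target
    targets = {tgt for _, tgt in sub_edges} - tfs
    return sub_edges, tfs, targets

# Placeholder GRN and placeholder set of genes lacking orthologs elsewhere
grn_edges = [
    ("TF1", "g1"), ("TF1", "g2"), ("TF2", "g2"),
    ("TF2", "g3"), ("TF1", "TF2"),
]
species_specific = {"TF1", "TF2", "g1", "g2"}

sub_edges, sub_tfs, sub_targets = extract_subnetwork(grn_edges, species_specific)
print(len(sub_tfs), "TFs,", len(sub_targets), "targets,", len(sub_edges), "interactions")
```

The same filter, applied with the set of genes that do have GT‐specific orthologs in the other species, would give a conserved subnetwork instead.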

The cannabis‐specific subnetwork contained 6 TFs, 76 targets and 244 interactions (Figure S20a; Table S30). The TFs belonged to the B3 (LOC115723557, LOC115712191, LOC115712352), MYB (LOC115723418/MYB108, LOC115725286/MYB7‐like) and FAR1 (LOC115699944/FAR1‐RELATED SEQUENCE 5‐like) families (Figure S20a). MYB108 has been shown to regulate specialised metabolite biosynthesis in multiple species (Liu et al., 2023; Sun et al., 2019), while the roles of the other TFs in GT processes remain largely uncharacterised. Target genes were assigned to MapMan functional categories (29 genes; 38.2%), with enrichment for enzyme classification, lipid metabolism, RNA biosynthesis and secondary metabolism (Figure S20b). Key targets included putative CYP81Q family genes implicated in specialised metabolism (Ghosh, 2017), an acyl‐ACP thioesterase (LOC115697587) potentially involved in hexanoyl‐CoA formation—the precursor for cannabinoid biosynthesis (Stout et al., 2012) and two POLYKETIDE SYNTHASE 4 genes (LOC115699293, LOC115700696) encoding olivetol synthase copies essential for olivetolic acid assembly in the cannabinoid pathway (Gagne et al., 2012; Innes & Vergara, 2023). We also identified 47 target genes (61.8%) in the cannabis only GT‐specific subnetwork with no corresponding MapMan annotations. Overall, this cohort of uncharacterised cannabis GT‐specific TFs and target genes is likely involved in previously uncharacterised aspects of cannabis GT biology.

The hop‐specific subnetwork comprised 6 TFs, 112 targets and 365 interactions (Figure S20c; Table S31). TFs from the MYB (3 TFs), AP2/ERF‐ERF (2 TFs) and WRKY (1 TF) families were predicted as regulators in this subnetwork (Figure S20c). Targets were predominantly assigned to secondary metabolism, enzyme classification, RNA biosynthesis and lipid metabolism categories (Figure S20d). The secondary metabolism category included alpha‐humulene synthases (LOC133781387, LOC133781637, LOC133781390) for sesquiterpene biosynthesis, phloroisovalerophenone synthases (LOC133816959, LOC133816960) critical for bitter acid production (Castro et al., 2008) and (E)‐beta‐ocimene synthases (LOC133834247, LOC133834265). The enzyme classification category included desmethylxanthohumol 6′‐O‐methyltransferase (LOC133802573) for xanthohumol biosynthesis (Patzak et al., 2021) and tropinone reductase I (LOC133801340), confirming previous proteomic findings (Champagne & Boutry, 2017).

The tomato‐specific subnetwork contained 25 TFs, 499 targets and 7106 interactions (Figure S20e; Table S32). TFs were mainly from the MYB (6 TFs), AP2/ERF‐ERF (4 TFs) and HB‐HD‐ZIP (2 TFs) families (Figure S20e; Table S32). Among the predicted regulators was Woolly (Solyc02g080260.4.1), an HD‐ZIP IV TF involved in trichome development and specialised metabolism (Wu et al., 2023). Our GRN analysis provides predictions of Woolly's downstream target genes for in planta characterisation. Target genes were predominantly assigned to enzyme classification, RNA biosynthesis, cell wall organisation, lipid metabolism, protein modification and secondary metabolism categories (Figure S20f). The secondary metabolism category included four sesquiterpene biosynthesis genes (Solyc07g008690.3.1, Solyc07g051940.4.1, Solyc07g052150.4.1, Solyc12g006570.2.1; Table S32), consistent with tomato GT function in terpene production and defence (Wang, Yuan, et al., 2025).

Regulatory features of specialised metabolism and GT modifications are conserved across species

To identify conserved GT regulatory features, we extracted a cannabis subnetwork containing GT‐specific TFs and targets with corresponding GT‐specific orthologs in both hop and tomato, based on HOG analysis (Tables S33 and S34). This tri‐species conserved subnetwork contained 39 TFs, 381 targets and 6510 interactions (Figure S21a; Table S33). The main predicted regulators included TFs from the MYB (9), HB‐HD‐ZIP (5), bHLH (4), WRKY (3), AP2/ERF‐ERF (3) and SRS (2) families. Of the 381 targets, 241 (63.3%) were assigned to MapMan functional categories, with enrichment for enzyme classification, RNA biosynthesis, cell wall organisation, lipid metabolism, secondary metabolism, solute transport and protein modification (Figure S21b; Table S33). Notable targets included GPAT6 (LOC115697313), which functions in cutin biosynthesis and trichome morphogenesis (Li et al., 2012; Petit et al., 2016), and three 3‐ketoacyl‐CoA synthase genes (LOC115705828, LOC115709512, LOC115718963). The RNA biosynthesis category contained three HDG TFs as targets: HDG11‐like (LOC115714770) and two HDG5‐like genes (LOC115716550, LOC115716879) (Table S33). The Arabidopsis hdg11 mutant exhibits excessive trichome branching, suggesting a conserved role in trichome initiation and morphogenesis (Nakamura et al., 2006). The secondary metabolism and enzyme classification categories included flavonoid biosynthesis genes (flavanone 3‐dioxygenase‐like, flavonoid 3′‐monooxygenase, chalcone synthases and anthocyanidin rhamnosyltransferases), along with two ABC transporters (LOC115706036, LOC115716265) in the solute transport category (Xu, Zhang, et al., 2024). These findings suggest conserved TFs may regulate specialised metabolism pathways, such as flavonoid biosynthesis, as well as fundamental GT structural modifications that enable metabolite transport and storage.

We next examined features shared exclusively between cannabis and hop given their close phylogenetic relationship. This subnetwork comprised 15 TFs from 10 families, 332 targets and 1995 interactions (Figure S21c; Table S34). Of the 188 targets (56.6%) with MapMan annotations, 77 (41.0%) belonged to enzyme classification (Figure S21d; Table S34). This category included cannabinoid biosynthesis genes (LOC115704845/berberine bridge enzyme‐like 8, LOC115696909 and LOC115697880/inactive THCAS variants, LOC115696884/CBDAS‐like 1, LOC115697762/CBDAS), along with branched‐chain amino acid aminotransferase genes (LOC115715790, LOC115716705, LOC115712735) and terpene and flavonoid biosynthesis genes. The presence of these genes suggests functional divergence contributing to the distinct chemical profiles of cannabis and hop (Padgitt‐Cobb et al., 2023; van Velzen & Schranz, 2021). Additional targets in the cannabis–hop subnetwork included 7‐methylxanthosine synthase 1‐like (LOC115695753), a key enzyme in purine alkaloid biosynthesis (Mizuno et al., 2003), 3‐ketoacyl‐CoA synthase 11‐like (LOC115725554) potentially involved in cuticular wax production and pectin esterase‐like (LOC115706869) likely involved in cell wall modification (Table S34). Although putative alkaloid biosynthesis genes are expressed in cannabis and hop GTs, there are no reports of alkaloids being detected in their GTs, so these results warrant metabolomics analysis (Flores‐Sanchez & Verpoorte, 2008; Champagne & Boutry, 2017). The pectin esterase and 3‐ketoacyl‐CoA synthase genes may contribute to pectin remodelling and cuticular wax production associated with the subcuticular storage cavity formation in cannabis GTs, functions likely conserved in hop (Batsale et al., 2023; Livingston et al., 2021).

Validation of the genome‐wide target genes of candidate cannabis GT TFs

While GRNs are a valuable tool for predicting TF–target gene interactions and understanding molecular mechanisms underlying biological processes, predictions require experimental validation. We independently validated the predicted TF–target genes from our GRN analysis by experimentally identifying the genes directly targeted by a subset of candidate TFs. This was achieved by performing DNA affinity purification sequencing (DAP‐seq) of seven representative cannabis GT‐specific TFs and a tomato GT TF (Table S35). DAP‐seq is a high‐throughput in vitro assay for profiling genome‐wide TF binding sites (TFBS) in their native sequence context using purified genomic DNA (Bartlett et al., 2017; O'Malley et al., 2016). The TFs examined by DAP‐seq were LOC115714116 (SRS), LOC115698973 (MYB), LOC115695638 (AP2/ERF‐AP2), LOC115703125 (AP2/ERF‐ERF), LOC115716879 (HB‐HD‐ZIP), LOC115724534 (C2H2) and LOC115701393 (SRS). These TFs were selected because six were among the top 35 cannabis GT TFs based on their NMS and three belonged to the TF families significantly enriched in the cannabis GT‐specific GRN (hypergeometric test; P‐value <0.05) (Figure 5a). We also included the Woolly TF from tomato (SlWO), an HD‐ZIP IV TF that regulates trichome differentiation (Wu et al., 2023). The number of significant binding sites identified varied among the TFs assayed, from 318 (LOC115695638) to 5966 (LOC115701393) (Table S36). To determine which TFs had been successfully assayed, we filtered the data further according to gold standard strategies used to analyse this assay type (see Methods) (Li et al., 2023). Three TFs passed these thresholds, and we consequently focused on these for the remainder of our analyses: LOC115716879, a class IV homeodomain–leucine zipper (HD‐Zip IV) HDG5‐like TF (CsHD‐ZIP879 hereafter), LOC115714116, a SHI related sequence (SRS) 1‐like TF (CsSRS116 hereafter), and SlWO. 
This DAP‐seq assay failure rate is consistent with other publications and is thought to relate to the absence of co‐factors and/or heterodimeric partners in vitro (Jiao et al., 2024; O'Malley et al., 2016; Zhang et al., 2022).

We identified a total of 2125, 2924 and 4018 statistically significant binding sites (q‐value <0.05) for CsHD‐ZIP879, CsSRS116 and SlWO, respectively (Figure 6a; Tables S37–S39). For all three TFs, the majority of the binding sites were in distal intergenic regions: CsHD‐ZIP879, 64.3%; CsSRS116, 85.8%; SlWO, 74.2% (Figure 6b; Table S40). However, a substantial number of binding sites were also found in promoter regions (up to 2‐kb upstream of a TSS): CsHD‐ZIP879, 21.7%; CsSRS116, 5.2%; SlWO, 17.1% (Figure 6b; Table S40). The expected recognition motif for each TF was the top enriched motif in the 100 bp regions upstream and downstream of their binding sites (Figure 6c). The top enriched consensus motif for CsHD‐ZIP879 was A[A/T]T[A/T][A/G]ACACGTG, and this motif was present in 71.8% of the corresponding binding sites. This motif contains the core sequence A[A/T]T[A/T][A/G]A, which is the known recognition motif for members of the HD‐ZIP IV family of TFs such as ATML1, PDF2, HDG7 and HDG9 in Arabidopsis (hypergeometric test; P‐value: 1e−1576) (Nakamura et al., 2006). For CsSRS116, the top enriched consensus motif was [A/C/T]ATAGGTTT[A/C/T], present in 94.5% of the corresponding peaks. This motif was identified independently as characteristic of SRS TFs, in particular SRS7 (hypergeometric test; P‐value: 1e−299) (O'Malley et al., 2016). As for SlWO, the top enriched consensus motif was AGCATT[T/A]AATGC, which was present in 63.91% of the corresponding binding sites (Figure 6c). This motif was previously identified as characteristic of HDG TFs, particularly HDG7 (hypergeometric test; P‐value: 1e−1884). The motif contains the L1‐box element (TAAATG), a cis‐regulatory sequence that SlWO has recently been shown to bind (Wang, Gao, et al., 2025).
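To make the bracketed consensus notation concrete, the sketch below converts a consensus such as A[A/T]T[A/T][A/G]ACACGTG into a regular expression and reports the fraction of sequences that contain it on either strand. This is a simplified exact‐match illustration, not the motif discovery procedure used in the study, and the example sequences are invented.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def consensus_to_regex(consensus):
    """Turn 'A[A/T]T' style notation into the regex 'A[AT]T'."""
    return consensus.replace("/", "")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

def fraction_with_motif(sequences, consensus):
    """Fraction of sequences matching the consensus on either strand."""
    pattern = re.compile(consensus_to_regex(consensus))
    hits = sum(
        1 for seq in sequences
        if pattern.search(seq) or pattern.search(reverse_complement(seq))
    )
    return hits / len(sequences)

# Invented peak-centred sequences (upper-case A/C/G/T only)
peaks = [
    "GGATTTAACACGTGCC",  # forward-strand match
    "TTCACGTGTTAAATGG",  # match on the reverse strand
    "GGGGGGGGGGGGGGGG",  # no match
]
frac = fraction_with_motif(peaks, "A[A/T]T[A/T][A/G]ACACGTG")
print(f"{100 * frac:.1f}% of sequences contain the motif")
```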

Figure 6.

The identification and characterisation of CsHD‐ZIP879, CsSRS116 and SlWO genome‐wide binding sites using DNA affinity purification sequencing (DAP‐seq).

(a) Venn diagrams showing the number of shared binding sites (50% reciprocal overlap) between two biological replicates.

(b) The distribution of shared binding sites (CsHD‐ZIP879: 2125 peaks; CsSRS116: 2924 peaks; SlWO: 4018 peaks) across genomic features.

(c) DNA logos of the top significantly enriched DNA binding motifs.

(d) Dot plot of the significantly enriched gene ontology terms (biological process, molecular function and cellular component categories; Benjamini–Hochberg test; P‐adjusted <0.05) for genes whose promoter regions (<2 kb upstream of the TSS) or first intron contained binding sites (only sites with a fold enrichment >4 in the DAP‐seq samples compared with the control were used).

(e) DAP‐seq binding profiles of candidate CsHD‐ZIP879, (f) CsSRS116 and (g) SlWO transcription factor target genes.
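The 50% reciprocal overlap criterion in panel (a) can be illustrated with a minimal Python sketch. It assumes peaks are half‐open (start, end) intervals on the same chromosome; the function name and the example coordinates are hypothetical and are not taken from the analysis pipeline.

```python
def reciprocal_overlap(peak_a, peak_b, min_fraction=0.5):
    """Return True if the two peaks overlap by at least `min_fraction`
    of the length of EACH peak (reciprocal overlap).

    Peaks are (start, end) tuples, half-open, on the same chromosome
    (an assumption made for this illustration).
    """
    a_start, a_end = peak_a
    b_start, b_end = peak_b
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return False  # no overlap at all
    return (shared >= min_fraction * (a_end - a_start)
            and shared >= min_fraction * (b_end - b_start))


# Hypothetical replicate peaks: 100 bp shared out of 200 bp each.
print(reciprocal_overlap((100, 300), (200, 400)))
# A long peak that covers a short one fails the reciprocal criterion.
print(reciprocal_overlap((100, 300), (100, 1000)))
```

Requiring the threshold in both directions prevents a very broad peak in one replicate from being counted as shared with every narrow peak it happens to cover.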

TF binding at gene promoters can regulate transcription of the corresponding target genes, so we sought to determine the possible roles of CsHD‐ZIP879, CsSRS116 and SlWO in GT function via functional analysis of the genes with high‐confidence binding sites for these TFs (see Methods). GO enrichment analysis showed that the genes bound by CsHD‐ZIP879 in their promoters were significantly associated with GO terms (Benjamini–Hochberg test) such as ‘response to wounding’ (P‐adjusted: 0.021), ‘regulation of jasmonic acid mediated signalling pathway’ (P‐adjusted: 0.021), ‘solute:proton symporter activity’ (P‐adjusted: 0.007) and ‘solute:monoatomic cation symporter activity’ (P‐adjusted: 0.016) (Figure 6d). Among the downstream targets of CsHD‐ZIP879 were putative JA biosynthesis (LOC115701324, allene oxide cyclase, AOC) and catabolism (LOC115708820, cytochrome P450 94C1‐like jasmonoyl‐amino acid carboxylase) genes and two putative transport proteins (LOC115697747, mitochondrial substrate carrier family protein B and LOC115711302, sulphite exporter TauE/SafE family protein 4) (Figure 6e). These findings suggest that CsHD‐ZIP879 may be involved in JA‐mediated signalling pathways and solute transport. The CsHD‐ZIP879 cis‐regulatory region contains multiple hormone‐response (mainly JA) and wounding‐response elements, indicating possible regulation by JA (Ma et al., 2022). Additionally, members of the HD‐ZIP IV TF family have been shown to drive the expansion of several epidermal cell types, such as trichomes, by integrating hormonal pathways including JA in multiple species (Schrick et al., 2023).

The genes with CsSRS116 binding sites in their promoter region were significantly associated with GO terms related to growth, development, and cell growth and differentiation (Benjamini–Hochberg test; P‐adjusted <0.05) (Figure 6d). This observation suggested that CsSRS116 may be involved in biological pathways relevant to growth and differentiation. Our observation is consistent with known roles of SRS TFs, a plant‐specific family of TFs, in development and morphogenesis of multiple organs and hormone responses (Fang et al., 2023). LOC115714171 (auxin‐responsive protein IAA30), LOC115712529 (Zinc finger protein STAMENLESS 1) and LOC115708149 (polygalacturonase At1g48100‐like) are among the genes bound by CsSRS116 (Figure 6f). An Arabidopsis ortholog, AtIAA30, is a repressor of auxin responses and has been shown to be involved in vascular patterning and the differentiation of xylem cell types. Interestingly, CsHD‐ZIP879 (LOC115716879) is predicted among the target genes bound by CsSRS116, suggesting CsSRS116 functions upstream and so may be a key regulator of trichome function.

We did not identify any significantly enriched GO terms associated with the SlWO target genes assigned to high‐confidence binding sites in the promoter, potentially due to nearly half of these targets lacking functional annotations. These may consequently be novel components of trichome biology (Table S39). However, a MapMan functional categorisation of these target genes revealed biologically coherent patterns consistent with trichome development and function. Of the 459 high‐confidence binding sites analysed, 419 (91.3%) were assigned to genes with functional annotations distributed across 28 major categories. The most represented category was RNA biosynthesis (34 genes, 8.1%), predominantly containing TFs from diverse families including bHLH, MYB, C2H2‐ZF and AP2/ERF, suggesting SlWO functions as a master regulator controlling downstream transcriptional cascades. Additional categories included protein modification (23 genes, 5.5%), particularly signalling kinases and phosphatases; solute transport (23 genes, 5.5%), especially amino acid and metabolite carriers; lipid metabolism (16 genes, 3.8%), relevant to cuticle and storage cavity formation; and cell wall organisation (10 genes, 2.4%), particularly pectin metabolism genes involved in trichome structural modifications. A putative terpene synthase (Solyc05g026600), a sulfotransferase (brassinosteroid conjugation and degradation; Solyc03g006043), a membrane‐associated kinase regulator (Solyc12g036440) and an LRR receptor‐like kinase family protein (Solyc06g069650) are among the genes bound by SlWO (Figure 6g).

We next validated the performance of the DAP‐seq assays using publicly available chromatin immunoprecipitation sequencing (ChIP‐Seq) data that mapped the distributions of key histone modifications genome‐wide: from cannabis trichomes: H3K4 trimethylation (H3K4me3), H3K27 trimethylation (H3K27me3) and H3K56 acetylation (H3K56ac); and from tomato seedlings: H3K4me3 (Conneely et al., 2024; Liu et al., 2022) (Tables S37–S39 and S41–S44). The co‐localisation of TFBSs, determined by DAP‐seq, with different histone modifications provides functional validation of the assay and insight into the TF's regulatory modes. H3K4me3 is a marker of active promoters and transcriptional initiation, so co‐occurrence suggests TF‐mediated transcriptional activation. Conversely, co‐occurrence with H3K27me3, a marker of repressed chromatin associated with polycomb‐mediated silencing, implies TF involvement in gene repression. In plants, H3K56ac flanks active distal enhancers, creating euchromatin permissive for TF binding and enriches promoters of transcribed genes where it promotes expression like H3K4me3 (Deal & Henikoff, 2011; Zeng et al., 2019). We therefore quantified and assessed the significance of the overlap between TFBSs (DAP‐seq) and sites of histone modifications (ChIP‐seq) (see Methods).

Binding sites of CsHD‐ZIP879 overlapped significantly with all three histone modifications (H3K4me3: P‐value: 1.68e−53; H3K56ac: P‐value: 2.24e−34; H3K27me3: P‐value: 8.08e−104) (Figure S22a; Tables S37 and S41). The 293 genes at H3K4me3‐marked regions were enriched for GO terms associated with terpenoid, isoprenoid, phenylpropanoid and lignin biosynthesis, consistent with activating specialised metabolite production (Figure S22b). The 396 genes at H3K27me3‐marked sites were enriched for wound response, lipase activities and signalling kinases, suggesting CsHD‐ZIP879 also regulates stress responses and lipid metabolism (Figure S22b). These associations may indicate that CsHD‐ZIP879 functions as both a transcriptional activator and repressor of GT‐related processes, or that it acts in competition at some genes with other TFs which have opposing regulatory activity. CsSRS116 had no significant overlap with H3K4me3, but did overlap significantly with H3K56ac (P‐value: 6.84e−4) and H3K27me3 (P‐value: 1.97e−224), encompassing 144 and 447 genes, respectively (Figure S22c; Tables S38 and S42). Although GO enrichment revealed no significant terms, MapMan annotations indicated that H3K27me3‐marked genes were enriched for TFs and cell wall‐related genes, suggesting the involvement of CsSRS116 in repressing developmental regulators and structural genes. H3K56ac‐marked genes contained signalling kinases and diverse TFs, indicating regulation of dynamic transcriptional processes in GTs. SlWO had significant overlap with H3K4me3 (P‐value: 3.10e−7), consistent with its role as a transcriptional activator of GT development and differentiation genes (Figure S22d; Tables S39 and S44). MapMan categorisation of the 320 overlapping genes revealed RNA biosynthesis (predominantly bHLH, C2H2‐ZF and AP2/ERF TFs), protein homeostasis and protein modification as the most represented categories. 
It is worth noting that over 50% of genes at H3K27me3‐ and H3K56ac‐marked sites (CsSRS116) and around 40% of genes at H3K4me3‐marked sites (SlWO) lacked MapMan annotations, indicating potential novel components of GT biology. Overall, the significant associations of DAP‐seq binding sites with regulatory histone modifications at genes support the validity of our DAP‐seq assays.

We next assessed whether DAP‐seq experimentally validated targets were enriched among computationally predicted GRN targets. Significant enrichment was detected for CsSRS116 (59 of 662 predicted targets; P‐value: 0.003), while CsHD‐ZIP879 and SlWO showed no significant enrichment (36 out of 500 predicted targets, P‐value: 0.619; 102 out of 926 predicted targets, P‐value: 0.049, respectively) (Figure 7a; Tables S37–S39). This suggests the presence of false‐positives or negatives among the GRN predictions for CsHD‐ZIP879 and SlWO. However, the percentage of validated TF–target interactions we reported here (CsHD‐ZIP879: 7.2%; CsSRS116: 8.9%; SlWO: 11%) falls within the range reported by a recent study involving similar approaches (Zander et al., 2020). We also tested for enrichment of DAP‐seq targets within WGCNA modules co‐expressed with each TF (see Methods; Tables S37–S39). Significant enrichment was detected for all three TFs: CsHD‐ZIP879 and CsSRS116 DAP‐seq targets were enriched in the cannabis darkmagenta module (171 out of 2082 and 166 out of 2082 genes, P‐value: 3.32e−4 and 1.98e−3, respectively), and SlWO DAP‐seq targets were enriched in the tomato yellowgreen module (68 out of 532 genes, P‐value: 1.37e−2) (Figure 7b). Notably, while the darkmagenta and yellowgreen modules were enriched for GT‐expressed genes (1253 out of 2082; 60.2% and 346 out of 532; 65.0%, respectively), they also contained substantial proportions of genes highly expressed in flowers (810 out of 2082; 38.9%) and leaves (148 out of 532; 27.8%), respectively, consistent with the multiorgan expression patterns of these TFs (Tables S1 and S3).
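The overlap tests reported here are hypergeometric tests (see Figure 7). A minimal Python sketch of an upper‐tail hypergeometric P‐value is given below for illustration. The function name and the toy numbers are assumptions; in particular, the size of the background gene universe is not restated in this section, so none of the reported P‐values is reproduced.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric.

    population : size of the background gene universe (assumed input)
    successes  : genes in the first set (e.g. DAP-seq-bound genes)
    draws      : genes in the second set (e.g. predicted GRN targets)
    k          : observed number of genes shared by both sets
    """
    lowest = max(k, 0, draws - (population - successes))
    highest = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lowest, highest + 1)
    )
    # Exact integer arithmetic, converted to float only at the end.
    return tail / comb(population, draws)


# Toy example (hypothetical numbers): 20 genes in the universe,
# 5 bound by the TF, 4 predicted targets, 2 shared.
p_value = hypergeom_upper_tail(k=2, population=20, successes=5, draws=4)
print(f"P(X >= 2) = {p_value:.4f}")
```

With the appropriate background size, the same function could in principle be called with the counts shown in Figure 7a (for example, 59 shared genes, 1915 DAP‐seq genes and 662 predicted targets for CsSRS116). The resulting P‐value depends strongly on the universe chosen.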

Figure 7.

DNA affinity purification sequencing (DAP‐seq) validation of weighted gene co‐expression network analysis (WGCNA) and gene regulatory network (GRN) predictions.

(a) Venn diagrams showing the overlap between the number of genes associated with DAP‐seq binding sites (CsHD‐ZIP879: 1904 genes; CsSRS116: 1915 genes; SlWO: 3333 genes) and the number of target genes predicted by GRN analysis (CsHD‐ZIP879: 500, CsSRS116: 662 and SlWO: 926).

(b) Venn diagrams showing the overlap between the number of genes associated with CsHD‐ZIP879, CsSRS116 and SlWO DAP‐seq binding sites and the number of genes found in the same co‐expression modules as these transcription factors (darkmagenta: 2082; yellowgreen: 532) identified via WGCNA. The significance of the overlaps in (a) and (b) was assessed using the hypergeometric test (P‐value <0.05).

DISCUSSION

GTs are important because of their unique role in the biosynthesis of a wide range of specialised metabolites that have biological, ecological and commercial significance. Here, we describe a transcriptomics atlas focused on GTs of three crop species to improve our understanding of their distinctive biology. We leverage organ‐specific gene expression data across several tissue types to identify which expression patterns are characteristic of GTs. From this, we provide the most comprehensive, current characterisation of shared and specific patterns of gene expression in GTs across the three species via the principles of orthology. We also reconstruct weighted gene co‐expression networks and the GRNs governing these specific and shared transcriptional programmes. Overall, our results provide fundamental insights into GT biology, including how the key processes operating in GTs are regulated.

Understanding GT‐specific gene expression is valuable to identify genes that are involved in GT‐associated developmental processes and specialised biosynthetic pathways. We have shown here that different aspects of GT gene expression are either shared between species or specific to individual species (Figure 2). Genes that were expressed in common by GTs of all three studied species were mainly associated with processes that are common to all GTs. These shared processes included primary (fatty acid) and specialised (flavonoid) metabolism, trichome and cell differentiation, cell wall reorganisation, as well as substrate or metabolite transport. Fatty acids can act as the building blocks for many specialised metabolites, including the cannabinoids in cannabis and bitter acids in hop that are derived from fatty acids or contain fatty acid moieties (Balcke et al., 2017; Clark et al., 2013; Stout et al., 2012; Xu et al., 2013). So, the enrichment of genes involved in lipid metabolic processes is consistent with the prolific production of these compounds in GTs. Flavonoids are also common in GTs, contributing to terpene biosynthesis and acting as ROS scavenging antioxidants (Sugimoto et al., 2022). Expression of genes associated with metabolite transport and GT differentiation was also conserved among species, processes that are likely involved in the storage of specialised metabolites within subcellular structures (Livingston et al., 2021; Schuurink & Tissier, 2020). The GT‐genes expressed species‐specifically were largely associated with the major specialised metabolites produced by individual species. These included cannabinoid biosynthesis in cannabis and amino acid metabolism/catabolism in hops. Interestingly, many of the species‐specific GT genes have no known functions, so represent appealing putative candidates for focused investigation to increase our knowledge of GT metabolism.

GRNs reconstruct and visualise putative regulatory relationships between TFs and their downstream target genes, enabling the association of these with traits of interest (Chang et al., 2013; Van den Broeck et al., 2020; Zander et al., 2020). While we focused on GT‐specific gene regulation in the present study, the transcriptomics atlases of the remaining organs (flower, leaf, stem and root) also allowed us to reconstruct GRNs for these organs in cannabis (Tables S45–48). These provide a valuable resource for investigating broader organ specificity of transcriptional regulation. Construction of the GT GRNs allowed for shared and unique regulatory features among the three species to be identified. The cannabis‐specific GT subnetwork contained target genes that were mainly involved in the cannabinoid biosynthetic pathway and many others with unknown functions. Similarly, the target genes included in the hop‐specific GT subnetwork were primarily associated with terpene and amino/organic acid metabolism (Figure S20a,b). These findings are consistent with the known specialised metabolites present in the GTs in both species. In contrast, TFs shared between species were predicted to regulate processes that are common across GT biology, including flavonoid biosynthesis and those related to GT modification. The most influential components of the GT GRNs were identified by network motif analysis, using gene expression data only. These components were all TFs, according to independent gene functional annotations. It is expected that the most influential components in a GRN are TFs, and the consistency between our network structure and gene annotations supports the validity of our GRNs. Moreover, the cognate binding motifs of these predicted regulatory TFs were enriched among their downstream target genes.

DAP‐seq is a powerful in vitro approach that determines TF–target interactions genome‐wide. However, a known technical limitation of DAP‐seq is its varying success rates between TFs (Jiao et al., 2024; O'Malley et al., 2016; Zhang et al., 2022). We experienced this issue, with only three out of eight TFs producing high‐confidence DAP‐seq datasets. This arises partly because the DNA‐binding ability of many TFs depends on specific co‐factors or interacting proteins. Refining assay conditions and co‐expressing TFs with their binding partners could help enhance the effectiveness of DAP‐seq (Jiao et al., 2024; O'Malley et al., 2016). However, the statistically significant overlaps between DAP‐seq and histone modification data indicate that our assays identified biologically relevant TFBSs. The modest magnitude of overlap is expected given that the ChIP‐seq data from cannabis trichomes and tomato seedlings represent limited developmental stages and tissue contexts, while DAP‐seq comprehensively maps potential binding sites genome‐wide, including those in chromatin regions inaccessible or organs/tissues absent from the ChIP‐seq experiments.

DAP‐seq is valuable for validating GRN predictions, but a modest overlap between the results of these complementary approaches is expected due to their fundamental technical differences (Zander et al., 2020). DAP‐seq identifies physical binding sites in vitro using purified naked DNA, while GRN analysis infers regulatory relationships from gene expression correlations (Bartlett et al., 2017; O'Malley et al., 2016). However, physical binding does not always indicate a regulatory effect because DAP‐seq can generate false‐positives by detecting binding sites that are inaccessible or non‐functional in vivo due to closed chromatin, as shown by the limited overlap between DAP‐seq binding and DNase I hypersensitivity sites (O'Malley et al., 2016). Conversely, DAP‐seq can generate false‐negatives by missing indirect targets (e.g., a TF activating a second TF, which then activates target genes) that are captured by GRN analysis. Some functional targets that require the binding of a TF complex would also be missed by the single‐protein DAP‐seq experiments, resulting in a false‐positive compared with the functional GRN. Additionally, GRNs constructed from single time‐point transcriptomics data may miss temporal regulatory dynamics or generate false‐positives/negatives due to delayed TF effects or complex feedback loops (Varala et al., 2018; Walley et al., 2016). This was evident in our results where many DAP‐seq‐identified targets of CsHD‐ZIP879, CsSRS116 and SlWO were not present in the GT‐specific GRNs (Figure 7a). In future, inclusion of other data modalities such as chromatin accessibility, protein–DNA interaction data and time‐series expression would substantially improve the accuracy of GRN inferences (Clark et al., 2021; Jiao et al., 2024; Walley et al., 2016).

Despite modest GRN validation rates, DAP‐seq targets for all three TFs showed significant enrichment within their respective GT‐enriched WGCNA co‐expression modules (Figure 7). WGCNA identifies gene co‐expression modules based on the principle that TFs often co‐express with their target genes, providing an alternative framework for predicting regulatory relationships (Allocco et al., 2004). This approach revealed substantially stronger overlap between DAP‐seq targets and co‐expressed genes within GT‐enriched modules compared with GRN predictions. This improved concordance reflects key differences between the two analytical approaches. While GT‐specific GRNs were restricted to GT‐specific TFs and targets, the GT‐enriched WGCNA modules contained both GT‐specific genes and substantial proportions of genes highly expressed in other organs such as flowers and leaves. This suggests that the candidate GT TFs, despite being most highly expressed in GTs, also exhibit lower expression in other organs and likely regulate both GT‐specific pathways and shared developmental or metabolic processes across multiple tissues. Our strict GRN analysis, which excluded non‐GT‐specific targets, therefore likely omitted genuine regulatory interactions occurring in multiple tissue contexts. The stronger WGCNA‐DAP‐seq concordance thus reflects the broader regulatory scope of these TFs beyond GT‐exclusive functions. Further, WGCNA modules capture both direct targets and genes regulated through downstream TFs or shared regulatory mechanisms, while DAP‐seq identifies all potential binding sites in vitro without distinguishing between tissue‐specific and multi‐tissue regulatory interactions (Yin et al., 2021). 
In contrast, GRN predictions attempt to infer specific regulatory edges within a single tissue context, which are more stringent and prone to false‐negatives due to the complexity of transcriptional regulation involving combinatorial TF binding, chromatin context and tissue‐specific co‐factors (Bartlett et al., 2017; Marbach et al., 2012; O'Malley et al., 2016; Varala et al., 2018). Together, these complementary approaches provide orthogonal validation of DAP‐seq results while revealing that GT‐associated TFs participate in regulatory networks spanning multiple tissues, not solely GT‐specific pathways.

Transcriptome analysis, WGCNA, GRN inference and network motif analysis successfully identified known and novel GT regulators, demonstrating the power of this combined approach. Our findings align with studies in plants such as A. annua, tomato and tobacco, which show that transcriptional regulation of GT development and specialised metabolite biosynthesis is primarily controlled by the MYB, HD‐ZIP IV and bHLH TF families (Chalvin et al., 2020). We identified the same TF families as major candidate regulators of GTs in all three species (Figure 3). MYB TFs (e.g., SlMXTA1, AaMIXTA1 and AaMYB17) are the main positive regulators of GT initiation/morphogenesis (Ewas et al., 2016; Qin et al., 2021; Shi et al., 2018). Additionally, HD‐ZIP IV TFs (e.g., SlWO, SlCD2, SlHD3, SlHD9, NtHD9 and NtHD12) are crucial for establishing GT cell identity and morphogenesis, with SlWO notably acting as a master regulator by activating SlMXTA1 to control the transition of cell fate from GTs to non‐GTs (Li et al., 2025; Wu et al., 2023; Xu, Teng, et al., 2024; Zhao et al., 2024). More recently, SlHDZ38 (a HD‐ZIP subfamily I TF) was shown to regulate GT development and specialised metabolism in tomato (Zocca et al., 2025). The bHLH TFs are critical promoters of GT initiation, typically functioning as part of the MBW complex alongside MYB and WD40 TFs. Specific bHLH members, such as AaMYC3 in A. annua, exhibit a dual regulatory role: they both initiate GT development by activating AaHD1 and simultaneously boost artemisinin biosynthesis by activating pathway genes like CYP71AV1 and ALDH1, and co‐activating ADS and DBR2 (Yuan et al., 2025; Zhao et al., 2024). Similarly, the tomato bHLH SlMYC1 regulates the initiation and development of type VI GTs and associated terpene biosynthesis (Hua, Chang, Wu, et al., 2021; Xu et al., 2018). 
Recent research continues to expand the known regulatory network, implicating additional families, such as the Scarecrow‐like (SCL) family, with members like SlSCL3 regulating GT development and terpene metabolism in tomato (Yang et al., 2021). This suggests that a broader range of TFs may play regulatory roles in trichome biology. Consequently, the other TF families identified across the transcriptomes of all three studied species represent interesting targets for advancing our understanding of trichome biology.

A crucial aspect of GT biology that is yet to be studied in detail is how GT initiation and development are regulated (Huchelmann et al., 2017). Such data are essential for crop improvement efforts in species where the contents of GTs are the key product and might be employed for targeted modification of GT size and density, leading to improvement of metabolite composition and yield. This could be further improved by identifying more TFs using a similar approach but resolved across a time‐course that spans both the pre‐ and post‐GT initiation stages. Moreover, the use of state‐of‐the‐art single‐cell RNA sequencing and high‐resolution spatial transcriptomics analyses may resolve the metabolic networks and regulatory mechanisms that facilitate metabolite production in highly specialised trichome cell types, revealing the functional diversity among the various GT types (Hurgobin & Lewsey, 2022).

In summary, we have shown that fundamental aspects of GT biology are shared widely across species. We have also provided the predicted architecture of how species‐specific GT gene regulation is exercised. These resources are of broad applicability to the GT research community, as well as providing useful candidates for crop improvement and synthetic biology or biotechnology applications where GTs might be reprogrammed.

EXPERIMENTAL PROCEDURES

Plant material and growth conditions

Three GT‐producing crop species that are known to produce diverse specialised metabolites were selected for this study. These included cannabis (a high‐THC cultivar described as ‘Northern Lights’), hop (cv. Cascade) and tomato (cv. Black Krim). Rhizomes from established Cascade female hop plants were obtained from Silver Springs Hops and Permaculture Farm in Australia (https://www.silverspringshopsfarm.com.au/). Seeds of the tomato cultivar Black Krim were purchased from Seeds of Plenty, Australia (https://seedsofplenty.com.au/). Cuttings taken from ‘Northern Lights’ female plants were grown in a controlled environment growth room (18 h/6 h light/dark photoperiod; light intensity = 200 μmol m−2 sec−1; 24°C/21°C day/night temperature) for 2 weeks. After 2 weeks, rooted cuttings were transferred to individual pots and were grown under a long‐day photoperiod (18 h/6 h light/dark) to promote vegetative growth. After a further 2 weeks, plants were transferred to a short photoperiod (12 h/12 h light/dark) for flowering for an additional 4 weeks. Cascade rhizomes were planted in 250 ml pots using a standard potting mixture from BioGro (https://biogro.com.au/) supplemented per 30 L bag with 0.5 L coarse vermiculite, 0.5 L coarse perlite, 35 g Macracote Coloniser Plus 4‐month slow‐release fertiliser (15N‐3P‐9K), 30 g nitrogen slow‐release fertiliser (40N‐0P‐0K), 25 g water‐holding granules, 15 g trace elements (6Mg‐6.5Fe‐5.4S‐1.5Mn‐0.4Zn‐0.14B‐0.07Mo) and 5 g garden lime. The plants were grown for a total of 14 weeks (a 9‐week vegetative period followed by a 5‐week flowering period) before sampling. Seeds of Black Krim were sown on a Plugger professional growing media ProMix (https://www.agsolutions.net.au/plugger), and 3‐week‐old seedlings were transplanted into individual 1.5 L pots on standard potting mixture and grown for 7 weeks. Both hop and tomato plants were grown under a controlled temperature regime (22°C/14°C day/night) and an 18 h light/6 h dark photoperiod.

Sample collection

Fresh samples for RNA extraction were collected in three biological replicates from different organs (root, stem, leaf and flower) of mature cannabis (6 weeks of growth post‐transplanting of rooted cuttings into individual pots including 4 weeks in the flowering short photoperiod) and hop (14‐week‐old) and tomato (10‐week‐old) plants. Trichomes were isolated from female flowers (cannabis and hop) and from stem, leaf, pedicel and sepal (tomato) according to a method modified from protocols previously described (Vincent et al., 2019; Yerger et al., 1992). Briefly, ~5 g samples were transferred to 50 ml Falcon™ tubes and were broken into smaller pieces in liquid nitrogen. Then, about 10 ml of liquid nitrogen was added to the tubes, which were loosely capped and vortexed until the majority of trichomes were removed. Plant debris was removed by inverting and tapping the tubes while the trichomes stuck to the walls. Trichome‐enriched samples were carefully transferred to 2 ml Eppendorf tubes in liquid nitrogen for further processing. This step was repeated until most of the trichomes were recovered from the wall of the tube. Trichomes were not removed from the leaf, stem and flower samples except for the tomato flower sample from which the trichome bearing floral tissues (pedicel and sepal) were removed for RNA isolation.

RNA extraction, RNA‐library preparation and sequencing

For RNA extraction, samples were homogenised using a Geno/Grinder® 2010 (SPEX SamplePrep, Metuchen, NJ, USA). Total RNA was isolated from homogenised samples using the Spectrum Plant Total RNA kit (Sigma‐Aldrich, Merck KGaA, Darmstadt, Germany) according to the manufacturer's protocol with the optional on‐column DNase digestion step applied for removing genomic DNA. RNA concentration and quality were determined using the Agilent 2200 TapeStation system (Agilent Technologies, Waldbronn, Germany). RNA‐seq libraries were generated using the Illumina TruSeq Stranded mRNA Library Prep Kit. The libraries were sequenced on Illumina NextSeq 500 (cannabis) and NovaSeq 6000 (hop and tomato) systems, generating on average ~20 million 80–100‐bp single reads per sample. The data generated in this study can be found in Tables S49–S51.

RNA‐seq data analysis and identification of organ‐specific genes and TFs

The quality of the raw RNA‐seq data was assessed using FastQC v0.11.8 (Andrews, 2010). The RNA‐seq reads were aligned in a species‐specific manner against the cannabis (cv. cs10 v2.0; GCF_900626175.2 from NCBI), hop (cv. Cascade; GCF_963169125.1 from NCBI) and tomato (cv. Heinz 1706; SL4.0 from https://solgenomics.net/) reference genomes using HISAT2 v2.2.1 (Grassa et al., 2021; Hosmani et al., 2019; Kim et al., 2019; Padgitt‐Cobb et al., 2021). Overall mapping rates of RNA‐seq reads against cannabis, hop and tomato reference genomes were 87.6 ± 1.4%, 76.4 ± 4.85% and 95.7 ± 0.41% (mean ± standard error), respectively (Figure S23). The mapped reads were sorted by genomic location using samtools v1.9 (Kim et al., 2019; Li et al., 2009). Gene‐level quantification was performed using featureCounts from Subread v2.0, and transcripts per million (TPM) counts were generated from the mapped reads using StringTie v2.1.3 (Pertea et al., 2015). Principal component analysis plots were generated in R v4.1.1 using the log2‐normalised TPM counts to check for the presence of outliers among the biological replicates of each organ across species (Team, 2020). Organ specificity was calculated in a species‐specific manner using the Tau (τ) tissue‐specificity index and the mean TPM counts for each species‐specific gene; an organ‐specific gene was identified as a gene that has a τ >0.8 and a mean TPM count of >0.5 across biological replicates of the corresponding organ (Kryuchkova‐Mostacci & Robinson‐Rechavi, 2017). Heatmaps showing gene expression of organ‐specific genes across species were generated using the Z‐score of log2‐normalised mean TPM counts and the pheatmap package in R (https://github.com/raivokolde/pheatmap). MapMan functional category bins were assigned to the cannabis, hop and tomato genes using the online tool Mercator4 v2.0 (https://mapman.gabipd.org/app/mercator) (Thimm et al., 2004). TFs were predicted across the three species using iTAK v1.7a33 (Zheng et al., 2016).
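The organ‐specificity criterion described above can be sketched in Python as follows. This is a minimal illustration using the standard definition of the index, τ = Σ(1 − x_i/x_max)/(n − 1), where x_i is the expression in organ i and n is the number of organs. Two points are assumptions rather than details stated in this section: τ is computed here on untransformed mean TPM values, and the "corresponding organ" is taken to be the organ with the highest mean TPM. The function names and the example gene are hypothetical.

```python
def tau_index(expression):
    """Tau tissue-specificity index for one gene.

    `expression` holds one non-negative mean expression value per organ.
    Returns 0 for uniformly expressed genes and 1 for genes expressed
    in a single organ.
    """
    n = len(expression)
    x_max = max(expression)
    if n < 2 or x_max <= 0:
        return 0.0  # undefined for unexpressed genes; treated as non-specific
    return sum(1 - x / x_max for x in expression) / (n - 1)


def call_organ_specific(mean_tpm, tau_cutoff=0.8, tpm_cutoff=0.5):
    """Return (organ, tau); organ is None if the gene is not organ-specific.

    `mean_tpm` maps organ name -> mean TPM across biological replicates.
    Assumption: the candidate organ is the one with the highest mean TPM.
    """
    organ = max(mean_tpm, key=mean_tpm.get)
    tau = tau_index(list(mean_tpm.values()))
    if tau > tau_cutoff and mean_tpm[organ] > tpm_cutoff:
        return organ, tau
    return None, tau


# Hypothetical gene expressed almost exclusively in trichomes.
gene = {"trichome": 100.0, "flower": 5.0, "leaf": 0.0, "stem": 0.0, "root": 0.0}
print(call_organ_specific(gene))
```

The additional TPM cutoff guards against calling a gene organ‐specific when τ is high only because the gene is barely expressed anywhere.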

Correlation analysis with public RNA‐seq datasets

Publicly available RNA‐seq data from cannabis, hop and tomato organs (trichomes, flower, leaf, root and stem) were downloaded from the Sequence Read Archive (Table S5). Transcript abundance was quantified with Salmon v1.4.0 using mapping validation and bias correction parameters (--validateMappings --gcBias --seqBias --dumpEq) (Patro et al., 2017). Gene‐level TPM values were derived using the R package tximport v1.32.0, with transcript‐to‐gene mappings obtained from the corresponding genome annotations. To assess transcriptome similarity for each organ within each species, Spearman's rank correlation coefficient (ρ) was calculated between biological samples using the expression profiles of organ‐specific genes (determined using Tau) and the in‐built ‘cor’ function in R (method = ‘spearman’). Correlation matrices were visualised using the R package pheatmap v1.0.13.
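For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation of the ranked values, with tied values receiving their average rank. The following Python sketch illustrates this calculation; it is not the R ‘cor’ call used in the analysis, and the function names and example vectors are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one sample is constant
    return cov / (var_x * var_y) ** 0.5


# Hypothetical TPM profiles of five organ-specific genes in two samples.
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0, 8.0, 100.0]
print(spearman_rho(sample_a, sample_b))
```

Because only ranks are used, the outlying value in the second sample does not reduce the correlation, which makes the measure suitable for comparing expression profiles across datasets generated in different laboratories.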

GO enrichment analysis

GO terms were predicted for cannabis and tomato genes using PANNZER (default parameters) (Törönen & Holm, 2022). The GO terms for hop genes were obtained from GCF_963169125.1‐RS_2024_01_gene_ontology.gaf on NCBI. These were then used to build custom annotation packages for each of the three species using the R package AnnotationForge (Carlson & Pagès, 2022). GO enrichment analysis (Benjamini–Hochberg test; P‐adjusted <0.05) and visualisation of candidate genes and TFs was performed in R using the clusterProfiler v4.12.6 and ggplot2 v3.5.2 packages (Wickham & Wickham, 2016; Yu et al., 2012).
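The Benjamini–Hochberg adjustment applied to the enrichment P‐values can be illustrated with a short Python sketch. In the analysis this step is carried out within the R packages named above; the function name and the example P‐values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each P-value is scaled by m / rank (m = number of tests, rank =
    1-based position after ascending sort), and monotonicity is enforced
    by taking a running minimum from the largest P-value downwards.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

A term is then reported as enriched when its adjusted P‐value is below 0.05, matching the threshold stated above.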

Validation of GT‐specific gene identification

To assess whether GTs present on flowers and stems affected GT‐specific gene identification in cannabis, we generated RNA‐seq data from GT‐depleted cannabis flowers (Table S49). Tau analysis was performed comparing gene expression across five organs using either GT‐depleted flowers (GT‐depleted flower, GTs, leaf, stem, root) or intact flowers (intact flower, GTs, leaf, stem, root). The overlap between GT‐specific gene sets from the two analyses was quantified to assess the impact of trichome presence on gene classification.

To determine whether GTs present on leaves and stems affected the identification of GT‐specific genes in tomato, we validated our GT‐specific genes against 2318 genes that were significantly upregulated in isolated GTs compared with GT‐removed leaves in the cultivated tomato accession, LA4024 (log2FC ≥1, adjusted P‐value <0.05) from an independent study by Balcke et al. (2017). Since the authors used the ITAG2.4 tomato genome annotation and our study used the SL4.0 (ITAG4.0) annotation, we aligned the ITAG2.4 protein sequences against ITAG4.0 protein sequences using blastp from DIAMOND v2.0.13, retaining the top hit for each query (Buchfink et al., 2015). This mapping yielded 2258 genes in the ITAG4.0 annotation corresponding to Balcke et al.'s GT‐upregulated genes. We then evaluated how many of these 2258 genes overlapped the 1227 GT‐specific genes identified in our study.

To address whether photosynthesis‐related GO enrichment among GT‐specific genes in tomato reflected biological differences in leaf developmental stage rather than GT contamination, we generated additional RNA‐seq data from young tomato leaves (Table S51). First, we identified differentially expressed genes between mature and young leaves using DESeq2 v1.44.0 (|log2 fold change| ≥1, Benjamini–Hochberg adjusted P‐value <0.05). Second, we performed two parallel analyses to identify GT‐specific genes using either mature leaf or young leaf transcriptomes. For each analysis, τ values were calculated across five organs (mature/young leaf, GT, flower, stem, root). The overlap between GT‐specific gene sets identified using mature versus young leaves was quantified.

Identification of conserved/shared and unique organ‐specific genes and TFs

Hierarchical orthogroups (HOGs) were inferred across cannabis, hop and tomato using OrthoFinder v2.5.4 (default parameters) (Emms & Kelly, 2015, 2019). The amino acid sequence of the longest transcript variant per gene (protein‐coding only) was used as input for each species. Shared/conserved genes and unique/species‐specific genes were extracted from the HOGs using custom R scripts. Shared genes are defined as genes that are present in HOGs that contain genes from two or more species, while unique genes are defined as genes that are present in HOGs that contain genes from a single species only. The unique genes also include genes that were not assigned to any HOG.
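The shared/unique definitions above translate into a simple classification rule, sketched here in Python for illustration. The custom R scripts used in this study are not reproduced; the input structures, function name and gene identifiers are hypothetical.

```python
def classify_genes(hogs, all_genes):
    """Label every gene as 'shared' or 'unique'.

    hogs: dict mapping HOG ID -> list of (species, gene) members.
    all_genes: dict mapping species -> set of all gene IDs used as input.
    A gene is 'shared' if its HOG contains genes from two or more species,
    and 'unique' if its HOG is single-species or it was assigned to no HOG.
    """
    status = {}
    for members in hogs.values():
        species_in_hog = {sp for sp, _ in members}
        label = "shared" if len(species_in_hog) >= 2 else "unique"
        for sp, gene in members:
            status[(sp, gene)] = label
    # Genes absent from every HOG are treated as unique.
    for sp, genes in all_genes.items():
        for gene in genes:
            status.setdefault((sp, gene), "unique")
    return status

# Hypothetical toy input.
hogs = {
    "HOG1": [("cannabis", "Cs1"), ("hop", "Hl1"), ("tomato", "Sl1")],
    "HOG2": [("cannabis", "Cs2"), ("cannabis", "Cs3")],
}
all_genes = {
    "cannabis": {"Cs1", "Cs2", "Cs3", "Cs4"},
    "hop": {"Hl1"},
    "tomato": {"Sl1"},
}
labels = classify_genes(hogs, all_genes)
print(labels[("cannabis", "Cs1")], labels[("cannabis", "Cs4")])
```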

Weighted gene co‐expression analysis (WGCNA)

Gene count data from the three species were each normalised using variance stabilising transformation (VST) in DESeq2 after removing genes with low expression (gene count ≤10 across all samples in each species). WGCNA was performed using the WGCNA R package (v1.75‐5). The top 75% most variable genes were selected for network construction based on variance across samples, yielding 11 365 genes (cannabis), 19 122 genes (tomato) and 15 228 genes (hop). Signed adjacency matrices were calculated using Pearson correlation raised to soft‐thresholding power β = 6, selected based on scale‐free topology (R² ≥0.85). The topological overlap matrix (TOM) dissimilarity measure (1‐TOM) was used for hierarchical clustering. Co‐expression modules were identified using dynamic tree cutting (deepSplit = 2, minModuleSize = 30), and highly correlated modules were merged, yielding 10 (cannabis), 14 (tomato) and 14 (hop) final modules. For the module‐organ association analysis, the module eigengene (ME) was compared across the five organs in each species to identify organ‐specific co‐expression patterns. One‐way analysis of variance (ANOVA) was performed for each module in each species to test whether the ME differed significantly among organs (ME~organ; P‐value <0.05). Mean ME values were calculated for each organ and visualised using row‐scaled heatmaps to highlight organ‐specific expression patterns.
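As an illustration of the signed adjacency step, the Python sketch below applies the signed‐network transformation a_ij = ((1 + cor(x_i, x_j))/2)^β with β = 6 to a toy expression matrix. The formula is the standard WGCNA definition of a signed adjacency. The sketch omits the TOM calculation, clustering and module merging, and the gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    r = cov / math.sqrt(vx * vy)
    # Guard against floating-point drift outside [-1, 1].
    return max(-1.0, min(1.0, r))

def signed_adjacency(profiles, beta=6):
    """Signed adjacency: ((1 + r) / 2) ** beta for every gene pair.

    profiles: dict mapping gene -> list of normalised expression values.
    Positive correlations map towards 1 and negative correlations towards 0.
    """
    genes = list(profiles)
    adj = {}
    for g1 in genes:
        for g2 in genes:
            r = pearson(profiles[g1], profiles[g2])
            adj[(g1, g2)] = ((1 + r) / 2) ** beta
    return adj

# Hypothetical VST-like expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with geneA
    "geneD": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with geneA
}
adj = signed_adjacency(profiles)
print(adj[("geneA", "geneB")], adj[("geneA", "geneC")], adj[("geneA", "geneD")])
```

In this formulation, uncorrelated genes receive an adjacency of 0.5^6 ≈ 0.016, so only strongly positively correlated pairs contribute appreciably to module detection.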

GT‐specific GRN reconstruction and visualisation

GT‐specific GRNs were built for cannabis, hop and tomato using SCION v3.0 (default parameters) (Clark et al., 2021). SCION requires a minimum of five biological replicates per organ and outputs predicted regulatory interactions with confidence weights, where higher weights indicate greater confidence in the inferred edge (Huynh‐Thu et al., 2010).

For each species, SCION required four inputs generated using the R utility script create_data_tables.R: (1) a target matrix of row‐normalised TPM values from GT samples; (2) a target list of GT‐specific genes and TFs; (3) a regulator list of GT‐specific TFs; and (4) a regulator matrix identical to the target matrix. Specific samples used to build the target matrices were: cannabis: three biological replicates from a high‐THC cultivar (this study; Table S49) and three from a high‐CBD cultivar (PRJNA884161); hop: three biological replicates from cv. Cascade plus publicly available GT samples (Table S50); tomato: nine biological replicates from cv. Black Krim (Table S51). Complete target and regulator lists for each species are provided in Tables S52–S54.

Species‐specific and conserved GT‐specific subnetworks were extracted from the GRNs generated above (Tables S21–S23) by filtering the ‘Regulator.conservation’ and ‘Target.conservation’ columns (referred to here simply as ‘conservation’). Species‐specific subnetworks contained TFs and targets lacking orthologs in the other two species: cannabis‐specific (conservation = ‘cannabis’; Table S30), hop‐specific (conservation = ‘hop’; Table S31) and tomato‐specific (conservation = ‘tomato’; Table S32). Conserved subnetworks shared between species were extracted from the cannabis GRN (Table S21): a cannabis–hop shared subnetwork (conservation = ‘cannabis.hop’; Table S33) and a conserved subnetwork involving the three species (conservation = ‘cannabis.hop.tomato’; Table S34). Network visualisation was performed using Cytoscape v3.9.1 (Shannon et al., 2003).
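The filtering step described above can be sketched in Python as follows. The sketch assumes that an edge is retained only when both the ‘Regulator.conservation’ and ‘Target.conservation’ columns carry the requested label; this reading, the function name and the example edges are assumptions made for illustration.

```python
def extract_subnetwork(edges, conservation):
    """Keep edges whose regulator and target share the given conservation label.

    edges: list of dicts, one per GRN edge, with the keys
    'Regulator.conservation' and 'Target.conservation'.
    """
    return [
        e for e in edges
        if e["Regulator.conservation"] == conservation
        and e["Target.conservation"] == conservation
    ]

# Hypothetical GRN edges.
edges = [
    {"Regulator": "TF1", "Target": "G1",
     "Regulator.conservation": "cannabis", "Target.conservation": "cannabis"},
    {"Regulator": "TF2", "Target": "G2",
     "Regulator.conservation": "cannabis.hop", "Target.conservation": "cannabis.hop"},
    {"Regulator": "TF3", "Target": "G3",
     "Regulator.conservation": "cannabis", "Target.conservation": "cannabis.hop.tomato"},
]
cannabis_specific = extract_subnetwork(edges, "cannabis")
cannabis_hop = extract_subnetwork(edges, "cannabis.hop")
print(len(cannabis_specific), len(cannabis_hop))
```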

Network motif score (NMS) analysis

To calculate Network Motif Scores (NMS) for the cannabis, hop and tomato GT GRNs separately, four different motifs that carry a major function in biological systems were used: three‐chain, feed‐forward loop, bi‐parallel and bi‐fan (Clark et al., 2019; Milo et al., 2002). The number of times every gene appeared in a certain motif was counted with the NetMatchStar app in Cytoscape (Rinnone et al., 2015; Shannon et al., 2003). The counts obtained were normalised to a scale from 0 to 1, then summed and multiplied by the number of occurrences to calculate NMS for each gene. The NMS calculations for the tomato GT GRN were performed on a network with trimmed edges (weight cut‐off of 0.5) because of limited computational resources.

GRN target motif enrichment analysis

The 2‐kb regions upstream of the transcriptional start site for all target genes in cannabis, hop and tomato GT GRNs were extracted using bedtools v2.30.0 for motif enrichment analysis (Quinlan & Hall, 2010). Then, the enriched binding motifs of the promoter regions were determined using XSTREME v5.5.2 in the MEME Suite website server with default parameters except that the option ‘right ends’ was selected for the category ‘Align sequences for site positional diagrams’ (Bailey et al., 2015). The built‐in motif library, ARABIDOPSIS (A. thaliana) DNA‐DAP motifs was used for the known motif information (Bailey et al., 2015; O'Malley et al., 2016). Enriched motifs were reported if they had an E‐value ≤0.05 by SEA.

TF family enrichment analysis

To determine whether specific TF families were overrepresented among GT‐specific TFs, we performed hypergeometric enrichment tests for each TF family in cannabis, hop and tomato.

The hypergeometric distribution test was implemented using the phyper function in R; this test calculates the probability of observing a given number of TFs from a particular family among GT‐specific TFs by chance, given the genome‐wide representation of the TF family. For each TF family, the test parameters were: the number of GT‐specific TFs in that family (q), the total number of TFs in that family genome‐wide (m), the total number of TFs not in that family (n) and the total number of GT‐specific TFs (k). The test was performed with lower.tail = FALSE to assess enrichment (overrepresentation). P‐values were transformed to −log10(P‐value) for visualisation, and TF families with −log10(P‐value) >1.3 (corresponding to P‐value <0.05) were considered significantly enriched. Results were visualised as bubble plots.
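The test described above can be reproduced outside R with the short Python sketch below. The function mirrors the behaviour of phyper(q, m, n, k, lower.tail = FALSE), which returns P(X > q). The TF counts in the example are hypothetical and are not taken from this study.

```python
from math import comb, log10

def hypergeom_upper_tail(q, m, n, k):
    """P(X > q) for a hypergeometric variable X.

    q: number of GT-specific TFs in the family
    m: number of TFs in the family genome-wide
    n: number of TFs not in the family
    k: total number of GT-specific TFs
    """
    total = comb(m + n, k)
    upper = min(m, k)
    favourable = sum(comb(m, x) * comb(n, k - x) for x in range(q + 1, upper + 1))
    return favourable / total

def is_enriched(p_value, threshold=1.3):
    """Apply the -log10(P) > 1.3 criterion (roughly P < 0.05)."""
    if p_value == 0:
        return True
    return -log10(p_value) > threshold

# Hypothetical family: 40 of 1500 TFs genome-wide, 8 of 60 GT-specific TFs.
p = hypergeom_upper_tail(q=8, m=40, n=1460, k=60)
print(p, is_enriched(p))
```

Because lower.tail = FALSE excludes the observed count itself, P(X ≥ q) is obtained by passing q − 1 instead of q; the sketch follows the parameterisation given in the text.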

To determine whether enrichment patterns were conserved or species‐specific, we performed pairwise comparisons between species using Fisher's exact tests. P‐values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR) correction (FDR <0.05).

DAP‐seq experiment

DAP‐seq genomic DNA library preparation

Genomic DNA (gDNA) was extracted from tomato and the same high‐THC cannabis chemotype used for RNA‐seq using the DNeasy plant kit (Qiagen, Hilden, Germany). The gDNA library was constructed using the NEBNext® Ultra™ II FS DNA library prep kit for Illumina (New England Biolabs, NEB, Ipswich, MA, USA) according to the manufacturer's instructions. Briefly, ~500 ng DNA was fragmented to an average of 200 bp and end‐repaired by incubating the samples in NEBNext Ultra II FS reaction buffer and NEBNext Ultra II FS enzyme mix at 37°C for 12 min. The size of the fragmented DNA was checked on an Agilent 4200 TapeStation system (Agilent Technologies, Waldbronn, Germany). End‐repaired DNA fragments were ligated with the NEBNext adaptor for Illumina in a 68.5 μl ligation reaction and were then treated with USER® enzyme in a subsequent reaction. The ligation mixture was cleaned up and size‐selected using the SPRIselect sample purification beads (Beckman Coulter, Brea, CA, USA). After two washes using 80% ethanol, DNA was eluted from the magnetic beads by adding 0.1× TE buffer, quantified using Qubit™ Flex (Thermo Fisher Scientific, Waltham, MA, USA) and used as a gDNA library for DAP‐seq.

TF protein expression

The coding sequences (CDS) of candidate TFs were cloned into the pH6HTN His6HaloTag T7 Vector using the XbaI and NotI restriction sites (SacI and SbfI restriction sites for SlWO) (Promega, Madison, WI, USA). The primer pairs used for amplifying the CDS are provided in Table S56. For protein expression, binding of HALO‐fusion protein, binding of DNA to protein and DNA recovery, we followed the protocols of Bartlett et al. (2017). Briefly, 1 μg HALO‐tagged TFs were expressed using an in vitro TNT coupled T7 wheat germ extract protein expression system (Promega) in a 50‐μl reaction volume incubated for 2 h at 30°C. Expression of Halo‐TF proteins was confirmed by Western blotting using anti‐HaloTag monoclonal antibody (Promega, G9211).

DAP‐seq

DAP‐seq experiments were performed in duplicate, and the empty pHalo vector expressing only the HaloTag was used as a negative control. About 40 μl of the expressed HALO‐TF protein was bound to 10 μl washed Magne HaloTag beads (Promega) resuspended in PBS+NP40 solution by incubation at room temperature for 1 h on a rotator. The HaloTag‐TF‐bound beads were gently washed three times in PBS+NP40 solution and finally resuspended in 40 μl of the same solution. For binding DNA to protein, ~40–70 ng gDNA library was added to the resuspended beads and the final reaction volume was brought to 80 μl with elution buffer (EB, Qiagen). The mixture was incubated on a rotator at room temperature for 1 h. After incubation, the beads were washed with PBS+NP40 solution five times. After the final wash, the beads were resuspended in 30 μl of EB, incubated in a thermocycler at 98°C for 10 min and immediately placed on ice for 5 min. DNA was recovered by briefly spinning the plate at room temperature for 5 sec at 3000 g, placing it on the magnetic rack and transferring 25 μl of the supernatant to a new PCR plate.

We followed the NEBNext Ultra II FS DNA library prep protocol for PCR enrichment, indexing and pooling of individual libraries. PCR enrichment and indexing of the adaptor‐ligated and TF‐bound DNA library were carried out using the Index (i7) and universal PCR (i5) primers and 8 PCR cycles. Equal volumes of each reaction were pooled, and the pooled library was sequenced on an Illumina NovaSeq X Plus 10B or 25B flow cell, using 2 × 150 bp paired‐end sequencing.

DAP‐seq data analysis

The quality of the raw data (averaging 50 million reads per replicate) was assessed using FastQC v0.11.8 (Andrews, 2010). Sequencing adapters were trimmed using Trim Galore! v0.6.7 (https://github.com/FelixKrueger/TrimGalore). The cleaned reads from cannabis and tomato were aligned to the cs10 and SL4.0 reference genomes, respectively, using BWA v0.7.18 (Li, 2013; Li & Durbin, 2009, 2010). The overall mapping rates were 80 ± 0.45% and 94 ± 0.47% for cannabis and tomato, respectively (Table S35). The mapped reads were sorted by genomic location using samtools v1.9 and filtered to retain only uniquely mapped reads (MAPQ >30) (Kim et al., 2019; Li et al., 2009). This yielded an average of approximately 25 million reads per biological replicate for cannabis and 45 million reads per replicate for tomato. Peaks were called on the MAPQ‐filtered reads using MACS2 (q‐value cut‐off of 0.05) against the pooled pHALO negative control samples (cannabis: pHalo_Cs; tomato: pHalo_Sl) for background subtraction (Zhang et al., 2008). Shared peaks between biological replicates were identified using bedtools v2.29.2 and a reciprocal overlap of at least 50% for the two replicates (Quinlan & Hall, 2010). The shared peaks were annotated for the nearest gene using annotatePeaks.pl from HOMER v5.1 (Heinz et al., 2010). Only peaks that were annotated and had a fold enrichment >4 when comparing the biological replicates against the pooled control samples (MACS2) were considered for subsequent analyses (high‐confidence peaks). GO enrichment analysis (Benjamini–Hochberg test; P‐adjusted <0.05) was performed on the genes assigned to high‐confidence peaks located in the 2‐kb promoter or first intron. The 100 bp upstream and downstream regions of the peak summit of each high‐confidence peak were extracted using bedtools. Motif enrichment analysis was performed on these regions using findMotifs.pl from HOMER using the in‐built motif database. 
The hypergeometric test (P‐value <0.05) was used to test for significant enrichment of DAP‐seq targets within two gene sets: (1) GRN‐predicted TF targets, and (2) genes co‐expressed with each TF in their respective WGCNA modules (darkmagenta for CsHD‐ZIP879 and CsSRS116; yellowgreen for SlWO).
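The reciprocal‐overlap criterion used above to define peaks shared between biological replicates can be sketched in Python as follows. The exact bedtools command is not given in the text; with bedtools, a criterion of this kind is commonly expressed using intersect with the ‐f 0.50 and ‐r options. The function name and coordinates below are hypothetical, and intervals are assumed to be half‐open as in BED files.

```python
def reciprocal_overlap(a, b, min_fraction=0.5):
    """True if peaks a and b overlap by at least min_fraction of each peak.

    a, b: (chrom, start, end) tuples with half-open coordinates (BED style).
    """
    if a[0] != b[0]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    len_a = a[2] - a[1]
    len_b = b[2] - b[1]
    return overlap >= min_fraction * len_a and overlap >= min_fraction * len_b

def shared_peaks(rep1, rep2, min_fraction=0.5):
    """Peaks from replicate 1 that have a reciprocal partner in replicate 2."""
    return [p for p in rep1
            if any(reciprocal_overlap(p, q, min_fraction) for q in rep2)]

# Hypothetical peaks from two replicates.
rep1 = [("chr1", 100, 200), ("chr1", 1000, 1100), ("chr2", 50, 150)]
rep2 = [("chr1", 150, 250), ("chr1", 1050, 1400), ("chr3", 50, 150)]
print(shared_peaks(rep1, rep2))
```

The pairwise scan is quadratic in the number of peaks and is intended only to make the criterion explicit; interval‐based tools such as bedtools are better suited to genome‐scale peak sets.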

Validation of DAP‐seq data using published ChIP‐seq data

ChIP‐seq peak data for histone modifications in cannabis GTs (H3K4me3: 18 913 peaks, H3K27me3: 10 284 peaks, H3K56ac: 13 590 peaks) were obtained from Conneely et al. (2024). For tomato, publicly available H3K4me3 ChIP‐seq data from seedling tissue (untreated samples and corresponding input controls; SRR18614503, SRR21274839, SRR21274841, SRR21274843) were processed as described by Liu et al. (2022), resulting in 19 931 peaks. Validation of the DAP‐seq binding sites using the ChIP‐seq data followed the approach of Jiao et al. (2024). The cs10 and SL4.0 genomes were each segmented into 2‐kb windows using bedtools makewindows, and windows overlapping with the DAP‐seq binding sites and ChIP‐seq peaks were quantified using bedtools intersect. Genes within these overlapping regions were identified using bedtools closest. The hypergeometric test (P‐value <0.05) was used to assess the significance of the overlap between DAP‐seq and ChIP‐seq.

Quantitative reverse transcription‐PCR (qRT‐PCR)

Total RNA was extracted from both pure trichomes and flower samples from which trichomes had been removed, using the Spectrum Plant Total RNA kit (Sigma/Merck). The resulting RNA samples were reverse transcribed into cDNA using the Tetro cDNA Synthesis Kit (Bioline, Meridian Biosciences, London, UK) following the kit instructions. qRT‐PCR was conducted in 10 μl reactions using SYBR Select Master mix (Applied Biosystems, Thermo Fisher Scientific, Vilnius, Lithuania) and the Applied Biosystems QuantStudio 12K Flex Real‐Time PCR Instrument (Applied Biosystems, Thermo Fisher Scientific, Singapore). The expression levels of candidate TFs (LOC133818021, WRKY; LOC133782390, C2H2; LOC133819588, MYB; LOC133822626, HB‐HD‐ZIP; LOC133817745, MYB; LOC133778335, GRAS; LOC133804487, MADS‐MIKC; LOC133818900, HMG) were normalised using the expression of the housekeeping (reference) gene (LOC133788905, a putative NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 8) identified specifically in this study (Table S56).

AUTHOR CONTRIBUTIONS

MT‐O, BH and MGL conceived and designed the study; MT‐O, OB, RJ and MW performed the experiments; BH, LY, MT‐O, OB and SG performed data analysis. JW, MSD and AB provided critical feedback on experimental design. MT‐O, BH and MGL interpreted the results and wrote the manuscript. All authors critically reviewed the interpretation of data and edited and approved the final manuscript.

CONFLICT OF INTEREST

None of the authors have a conflict of interest to disclose.

Supporting information

Table S1. Tau values assigned to cannabis genes.

Table S2. Tau values assigned to hop genes.

Table S3. Tau values assigned to tomato genes.

Table S4. Number of organ‐specific genes and TFs identified in cannabis, hop and tomato.

Table S5. Public RNA‐seq datasets used to validate organ‐specific transcriptome datasets generated in this study.

Table S6. Tau values calculated by comparing transcriptomes of cannabis trichome‐removed flower, stem, root, leaves and trichome.

Table S7. Cross‐validation of tomato GT‐specific genes (this study) using an independent trichome dataset from Balcke et al. (2017).

Table S8. List of genes differentially expressed between young and mature tomato leaves.

Table S9. Tau values calculated by comparing transcriptomes of tomato flower, stem, root, young leaves and trichomes.

Table S10. The total count of HOGs and number of genes assigned to those HOGs.

Table S11. The number of shared and species‐specific genes across organ types.

Table S12. The conservation level of organ‐specific genes across species.

Table S13. TF family enrichment in GT‐specific transcriptomes of cannabis, hop and tomato.

Table S14. Pairwise comparison of TF family enrichment patterns between species.

Table S15. TF families predicted in the hop genome.

Table S16. TF families predicted in the tomato genome.

Table S17. TF families predicted in the cannabis genome.

Table S18. List of co‐expressed genes identified in cannabis using WGCNA.

Table S19. List of co‐expressed genes identified in hop using WGCNA.

Table S20. List of co‐expressed genes identified in tomato using WGCNA.

Table S21. GT‐specific gene regulatory network for cannabis.

Table S22. GT‐specific gene regulatory network for hop.

Table S23. GT‐specific gene regulatory network for tomato.

Table S24. NMS of genes in the cannabis GT GRN.

Table S25. NMS of genes in the hop GT GRN.

Table S26. NMS of genes in the tomato GT GRN.

Table S27. Motifs enriched in the promoter of the cannabis GT GRN TF target genes.

Table S28. Motifs enriched in the promoter of the hop GT GRN TF target genes.

Table S29. Motifs enriched in the promoter of the tomato GT GRN TF target genes.

Table S30. Gene regulatory subnetwork of cannabis GT‐specific TFs and their targets.

Table S31. Gene regulatory subnetwork of hop GT‐specific TFs and their targets.

Table S32. Gene regulatory subnetwork of tomato GT‐specific TFs and their targets.

Table S33. Cannabis, hop and tomato shared GT‐specific gene regulatory subnetwork.

Table S34. Cannabis and hop shared GT‐specific gene regulatory subnetwork.

Table S35. BWA mapping statistics for DAP‐seq data.

Table S36. Number of DAP‐seq peaks called using MACS2.

Table S37. Peaks associated with nearby DAP‐seq CsHD‐ZIP879 target genes.

Table S38. Peaks associated with nearby DAP‐seq CsSRS116 target genes.

Table S39. Peaks associated with nearby DAP‐seq SlWO target genes.

Table S40. Distribution of CsHD‐ZIP879, CsSRS116 and SlWO binding sites across genomic features.

Table S41. Validation of CsHD‐ZIP879 DAP‐seq binding sites using published ChIP‐seq data for histone marks.

Table S42. Validation of CsSRS116 DAP‐seq binding sites using published ChIP‐seq data for histone marks.

Table S43. H3K4me3 ChIP‐seq results for tomato using MACS2.

Table S44. Validation of SlWO DAP‐seq binding sites using published ChIP‐seq data for histone marks (H3K4me3).

Table S45. Cannabis flower‐specific GRN.

Table S46. Cannabis stem‐specific GRN.

Table S47. Cannabis leaf‐specific GRN.

Table S48. Cannabis root‐specific GRN.

Table S49. Cannabis RNA‐seq data used in this study.

Table S50. Hop RNA‐seq data used in this study.

Table S51. Tomato RNA‐seq data used in this study.

Table S52. GT‐specific TFs and targets used to build the cannabis GT‐specific GRN.

Table S53. GT‐specific TFs and targets used to build the hop GT‐specific GRN.

Table S54. GT‐specific TFs and targets used to build the tomato GT‐specific GRN.

Table S55. DAP‐seq data used in this study.

Table S56. Primers used for amplifying candidate TFs CDS for DAP‐seq and qRT‐PCR.

TPJ-125-0-s001.xlsx (60.9MB, xlsx)

Figure S1. Visualisation of isolated trichomes.

Figure S2. PCA plots of RNA‐seq data showing grouping of samples across species.

Figure S3. eFP images showing expression profiles of candidate genes across tissues.

Figure S4. Spearman correlation plots: cannabis flower and trichome RNA‐seq datasets.

Figure S5. Spearman correlation plots: cannabis leaf, stem and root RNA‐seq datasets.

Figure S6. Spearman correlation plots: tomato flower, trichome and stem RNA‐seq datasets.

Figure S7. Spearman correlation plots: tomato root and leaf RNA‐seq datasets.

Figure S8. Spearman correlation plots: hop leaf and trichome RNA‐seq datasets.

Figure S9. Venn diagram showing the overlap between cannabis GT‐specific gene sets.

Figure S10. Organ‐specific expression patterns of tomato GT‐enriched genes.

Figure S11. Quantitative real‐time PCR (qRT‐PCR) validation of candidate hop TFs.

Figure S12. Dot plot of the top 10 enriched GO terms for cannabis organ‐specific genes.

Figure S13. Dot plot of the top 10 enriched GO terms for hop organ‐specific genes.

Figure S14. Dot plot of the top 10 enriched GO terms for tomato organ‐specific genes.

Figure S15. Top enriched GO terms for genes differentially expressed in mature tomato leaves compared to young tomato leaves.

Figure S16. The overlap between two sets of GT genes identified using the transcriptome of young and mature tomato leaves.

Figure S17. Heatmap of the expression pattern of photosynthesis‐related genes in tomato.

Figure S18. The conservation level of organ‐specific genes based on HOG analysis.

Figure S19. Motif enrichment analysis of TF target genes.

Figure S20. Species‐specific GT gene regulatory subnetworks.

Figure S21. Shared GT‐specific gene regulatory subnetworks.

Figure S22. DAP‐seq validation using ChIP‐seq data for histone marks.

Figure S23. Hisat2 mapping statistics for RNA‐seq data used to determine organ specificity of genes.

TPJ-125-0-s002.pdf (7.9MB, pdf)

ACKNOWLEDGEMENTS

We acknowledge Cann Group Limited for kindly providing the cannabis cultivar used in this study, and Dr Filippa Brugliera and Hannah Noorda from Cann Group Limited for assistance with cannabis growing. We thank Dr Myrna Deseo, Dr Tracy Stanley and Dr Veronica Borrett from the Australian Research Council (ARC) Research Hub for Medicinal Agriculture for compliance support and Asha Haslem from the La Trobe University Genomics Platform for technical assistance with RNA‐seq library preparation. We thank Dr Natalie Clark (Broad Institute) for help using SCION. We thank Dr Onkar Nath for his assistance with preliminary DAP‐seq data analysis. This research was funded by the ARC Research Hub for Medicinal Agriculture (IH180100006) and ARC Research Hub for Protected Cropping (IH240100024). Cann Group Limited is an industry partner organisation of IH180100006. AB and MSD were supported by La Trobe University through the La Trobe Institute for Sustainable Agriculture and Food (LISAF). Open access publishing facilitated by La Trobe University, as part of the Wiley ‐ La Trobe University agreement via the Council of Australasian University Librarians.

We dedicate this paper to the memory of our co‐author and colleague, Lingling Yin, who passed away during the final stages of this work.

DATA AVAILABILITY STATEMENT

The raw RNA‐seq and DAP‐seq data generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) under the following BioProject accessions: PRJNA884161, PRJNA884162, PRJNA1262533, PRJNA1267741, PRJNA1369742, PRJNA1369628, PRJNA988919 and PRJNA988905. Detailed accession numbers for specific sequences are provided in Tables S49–S51 and S55.

Processed gene‐level count and TPM (transcripts per million) data for all genes across cannabis, hop and tomato are available through Zenodo: https://doi.org/10.5281/zenodo.17667485.

Interactive visualisation of gene expression data are accessible through tissue‐specific eFP browsers for cannabis, hop and tomato at: https://expression.latrobe.edu.au/.

DAP‐seq genome browser tracks are available through JBrowse:

REFERENCES

  1. Allocco, D.J., Kohane, I.S. & Butte, A.J. (2004) Quantifying the relationship between co‐expression, co‐regulation and gene function. BMC Bioinformatics, 5, 18.
  2. Andrews, S. (2010) FastQC: a quality control tool for high throughput sequence data. Available from: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  3. Bailey, T.L., Johnson, J., Grant, C.E. & Noble, W.S. (2015) The MEME suite. Nucleic Acids Research, 43, W39–W49.
  4. Balcke, G.U., Bennewitz, S., Bergau, N., Athmer, B., Henning, A., Majovsky, P. et al. (2017) Multi‐omics of tomato glandular trichomes reveals distinct features of central carbon metabolism supporting high productivity of specialized metabolites. The Plant Cell, 29, 960–983.
  5. Barba‐Espín, G., Jurado‐Mañogil, C., Plskova, Z., Kerchev, P.I., Hernández, J.A. & Diaz‐Vivancos, P. (2025) Halophyte‐based crop managements induce biochemical, metabolomic and proteomic changes in tomato plants under saline conditions. Physiologia Plantarum, 177, e70060.
  6. Bartlett, A., O'Malley, R.C., Huang, S.‐s.C., Galli, M., Nery, J.R., Gallavotti, A. et al. (2017) Mapping genome‐wide transcription‐factor binding sites using DAP‐seq. Nature Protocols, 12, 1659–1672.
  7. Batsale, M., Alonso, M., Pascal, S., Thoraval, D., Haslam, R.P., Beaudoin, F. et al. (2023) Tackling functional redundancy of Arabidopsis fatty acid elongase complexes. Frontiers in Plant Science, 14, 1107333.
  8. Buchfink, B., Xie, C. & Huson, D.H. (2015) Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12, 59–60.
  9. Campos, O.P., Fortuna, G.C., de Oliveira Gomes, J.A., Neves, C.S. & Bonfim, F.P.G. (2023) Morphological characteristics, trichomes, and phytochemistry of inflorescences of ‘Humulus lupulus’ L: comparison of cropping systems and varieties. Australian Journal of Crop Science, 17, 263–274.
  10. Cao, Y., Li, K., Li, Y., Zhao, X. & Wang, L. (2020) MYB transcription factors as regulators of secondary metabolism in plants. Biology, 9(3), 61. Available from: 10.3390/biology9030061
  11. Carlson, M. & Pagès, H. (2022) AnnotationForge: tools for building SQLite‐based annotation data packages. R package version 1.40.0.
  12. Castro, C.B., Whittock, L.D., Whittock, S.P., Leggett, G. & Koutoulis, A. (2008) DNA sequence and expression variation of Hop (Humulus lupulus) valerophenone synthase (VPS), a key gene in bitter acid biosynthesis. Annals of Botany, 102(2), 265–273. Available from: 10.1093/aob/mcn089
  13. Chalvin, C., Drevensek, S., Dron, M., Bendahmane, A. & Boualem, A. (2020) Genetic control of glandular trichome development. Trends in Plant Science, 25, 477–487.
  14. Champagne, A. & Boutry, M. (2017) A comprehensive proteome map of glandular trichomes of hop (Humulus lupulus L.) female cones: identification of biosynthetic pathways of the major terpenoid‐related compounds and possible transport proteins. Proteomics, 17, 1600411.
  15. Chang, K.N., Zhong, S., Weirauch, M.T., Hon, G., Pelizzola, M., Li, H. et al. (2013) Temporal transcriptional response to ethylene gas drives growth hormone cross‐regulation in Arabidopsis. eLife, 2, e00675.
  16. Chen, K. & Rajewsky, N. (2007) The evolution of gene regulation by transcription factors and microRNAs. Nature Reviews Genetics, 8, 93–103.
  17. Chezem, W.R. & Clay, N.K. (2016) Regulation of plant secondary metabolism and associated specialized cell development by MYBs and bHLHs. Phytochemistry, 131, 26–43.
  18. Clark, N.M., Buckner, E., Fisher, A.P., Nelson, E.C., Nguyen, T.T., Simmons, A.R. et al. (2019) Stem‐cell‐ubiquitous genes spatiotemporally coordinate division through regulation of stem‐cell‐specific gene networks. Nature Communications, 10, 5574.
  19. Clark, N.M., Nolan, T.M., Wang, P., Song, G., Montes, C., Valentine, C.T. et al. (2021) Integrated omics networks reveal the temporal signaling events of brassinosteroid response in Arabidopsis. Nature Communications, 12, 1–13.
  20. Clark, S.M., Vaitheeswaran, V., Ambrose, S.J., Purves, R.W. & Page, J.E. (2013) Transcriptome analysis of bitter acid biosynthesis and precursor pathways in hop (Humulus lupulus). BMC Plant Biology, 13, 1–14.
  21. Conneely, L.J., Hurgobin, B., Ng, S., Tamiru‐Oli, M. & Lewsey, M.G. (2024) Characterization of the Cannabis sativa glandular trichome epigenome. BMC Plant Biology, 24, 1075.
  22. Davies, K.M., Andre, C.M., Kulshrestha, S., Zhou, Y., Schwinn, K.E., Albert, N.W. et al. (2024) The evolution of flavonoid biosynthesis. Philosophical Transactions of the Royal Society, B: Biological Sciences, 379, 20230361.
  23. Deal, R.B. & Henikoff, S. (2011) Histone variants and modifications in plant gene regulation. Current Opinion in Plant Biology, 14, 116–122.
  24. Emms, D.M. & Kelly, S. (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology, 16, 1–14.
  25. Emms, D.M. & Kelly, S. (2019) OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biology, 20, 1–14.
  26. Eriksen, R.L., Padgitt‐Cobb, L.K., Townsend, M.S. & Henning, J.A. (2021) Gene expression for secondary metabolite biosynthesis in hop (Humulus lupulus L.) leaf lupulin glands exposed to heat and low‐water stress. Scientific Reports, 11, 5138.
  27. Ewas, M., Gao, Y., Wang, S., Liu, X., Zhang, H., Nishawy, E.M. et al. (2016) Manipulation of SlMXl for enhanced carotenoids accumulation and drought resistance in tomato. Science Bulletin, 61, 1413–1418.
  28. Fahn, A. (2000) Structure and function of secretory cells. Advances in Botanical Research, 31, 37–75. 10.1016/s0065-2296(00)31006-0.
  29. Fang, D., Zhang, W., Ye, Z., Hu, F., Cheng, X. & Cao, J. (2023) The plant specific SHORT INTERNODES/STYLISH (SHI/STY) proteins: structure and functions. Plant Physiology and Biochemistry, 194, 685–695.
  30. Fasoli, M., Dal Santo, S., Zenoni, S., Tornielli, G.B., Farina, L., Zamboni, A. et al. (2012) The grapevine expression atlas reveals a deep transcriptome shift driving the entire plant into a maturation program. The Plant Cell, 24, 3489–3505.
  31. Filippis, I., Lopez‐Cobollo, R., Abbott, J., Butcher, S. & Bishop, G.J. (2013) Using a periclinal chimera to unravel layer‐specific gene expression in plants. The Plant Journal, 75, 1039–1049.
  32. Flores‐Sanchez, I.J. & Verpoorte, R. (2008) Secondary metabolism in cannabis. Phytochemistry Reviews, 7(3), 615–639. Available from: 10.1007/s11101-008-9094-4
  33. Gabaldón, T. & Koonin, E.V. (2013) Functional and evolutionary implications of gene orthology. Nature Reviews Genetics, 14, 360–366.
  34. Gagalova, K.K., Yan, Y., Wang, S., Matzat, T., Castellarin, S.D., Birol, I. et al. (2024) Leaf pigmentation in Cannabis sativa: characterization of anthocyanin biosynthesis in colorful cannabis varieties. Plant Direct, 8, e70016.
  35. Gagne, S.J. , Stout, J.M. , Liu, E. , Boubakir, Z. , Clark, S.M. & Page, J.E. (2012) Identification of olivetolic acid cyclase from Cannabis sativa reveals a unique catalytic route to plant polyketides. Proceedings of the National Academy of Sciences of the United States of America, 109, 12811–12816. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Ghosh, S. (2017) Triterpene structural diversification by plant cytochrome P450 enzymes. Frontiers in Plant Science, 8, 1886. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Goese, M. , Kammhuber, K. , Bacher, A. , Zenk, M.H. & Eisenreich, W. (1999) Biosynthesis of bitter acids in hops: a 13C‐NMR and 2H‐NMR study on the building blocks of humulone. European Journal of Biochemistry, 263, 447–454. [DOI] [PubMed] [Google Scholar]
  38. Gonçalves, J. , Rosado, T. , Soares, S. , Simão, A.Y. , Caramelo, D. , Luís, Â. et al. (2019) Cannabis and its secondary metabolites: their use as therapeutic drugs, toxicological aspects, and analytical determination. Medicines (Basel), 6, 31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Grassa, C.J. , Weiblen, G.D. , Wenger, J.P. , Dabney, C. , Poplawski, S.G. , Motley, S.T. et al. (2021) A new cannabis genome assembly associates elevated cannabidiol (CBD) with hemp introgressed into marijuana. New Phytologist, 230, 1665–1679. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Heinz, S. , Benner, C. , Spann, N. , Bertolino, E. , Lin, Y.C. , Laslo, P. et al. (2010) Simple combinations of lineage‐determining transcription factors prime cis‐regulatory elements required for macrophage and B cell identities. Molecular Cell, 38, 576–589. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Hosmani, P.S. , Flores‐Gonzalez, M. , van de Geest, H. , Maumus, F. , Bakker, L.V. , Schijlen, E. et al. (2019) An improved de novo assembly and annotation of the tomato reference genome using single‐molecule sequencing, Hi‐C proximity ligation and optical maps. bioRxiv, 767764.
  42. Hua, B. , Chang, J. , Wu, M. , Xu, Z. , Zhang, F. , Yang, M. et al. (2021) Mediation of JA signalling in glandular trichomes by the woolly/SlMYC1 regulatory module improves pest resistance in tomato. Plant Biotechnology Journal, 19, 375–393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Hua, B. , Chang, J. , Xu, Z. , Han, X. , Xu, M. , Yang, M. et al. (2021) HOMEODOMAIN PROTEIN8 mediates jasmonate‐triggered trichome elongation in tomato. New Phytologist, 230, 1063–1077. [DOI] [PubMed] [Google Scholar]
  44. Huang, L. & Schiefelbein, J. (2015) Conserved gene expression programs in developing roots from diverse plants. The Plant Cell, 27, 2119–2132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Huang, L.‐C. , Pu, S.‐Y. , Murashige, T. , Fu, S.‐F. , Kuo, T.‐T. , Huang, D.‐D. et al. (2003) Phase‐ and age‐related differences in protein tyrosine phosphorylation in Sequoia sempervirens. Biologia Plantarum, 47, 601–603. [Google Scholar]
  46. Huchelmann, A. , Boutry, M. & Hachez, C. (2017) Plant glandular trichomes: natural cell factories of high biotechnological interest. Plant Physiology, 175, 6–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Hurgobin, B. & Lewsey, M.G. (2022) Applications of cell‐ and tissue‐specific ‘omics to improve plant productivity. Emerging Topics in Life Sciences, 6, 163–173. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Huynh‐Thu, V.A. , Irrthum, A. , Wehenkel, L. & Geurts, P. (2010) Inferring regulatory networks from expression data using tree‐based methods. PLoS One, 5, e12776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Innes, P.A. & Vergara, D. (2023) Genomic description of critical cannabinoid biosynthesis genes. Botany, 101, 270–283. [Google Scholar]
  50. Jeong, H.J. & Jung, K.H. (2015) Rice tissue‐specific promoters and condition‐dependent promoters for effective translational application. Journal of Integrative Plant Biology, 57, 913–924. [DOI] [PubMed] [Google Scholar]
  51. Jiao, W. , Wang, M. , Guan, Y. , Guo, W. , Zhang, C. , Wei, Y. et al. (2024) Transcriptional regulatory network reveals key transcription factors for regulating agronomic traits in soybean. Genome Biology, 25, 1–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Julca, I. , Ferrari, C. , Flores‐Tornero, M. , Proost, S. , Lindner, A.‐C. , Hackenberg, D. et al. (2021) Comparative transcriptomic analysis reveals conserved programmes underpinning organogenesis and reproduction in land plants. Nature Plants, 7, 1143–1159. [DOI] [PubMed] [Google Scholar]
  53. Julca, I. , Tan, Q.W. & Mutwil, M. (2022) Toward kingdom‐wide analyses of gene expression. Trends in Plant Science, 28, 235–249. [DOI] [PubMed] [Google Scholar]
  54. Kamenetsky, R. , Faigenboim, A. , Shemesh Mayer, E. , Ben Michael, T. , Gershberg, C. , Kimhi, S. et al. (2015) Integrated transcriptome catalogue and organ‐specific profiling of gene expression in fertile garlic (Allium sativum L.). BMC Genomics, 16, 1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Kang, J.‐H. , McRoberts, J. , Shi, F. , Moreno, J.E. , Jones, A.D. & Howe, G.A. (2014) The flavonoid biosynthetic enzyme chalcone isomerase modulates terpenoid production in glandular trichomes of tomato. Plant Physiology, 164, 1161–1174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Kim, D. , Paggi, J.M. , Park, C. , Bennett, C. & Salzberg, S.L. (2019) Graph‐based genome alignment and genotyping with HISAT2 and HISAT‐genotype. Nature Biotechnology, 37, 907–915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Korpelainen, H. & Pietiläinen, M. (2021) Hop (Humulus lupulus L.): traditional and present use, and future potential. Economic Botany, 75, 302–322. [Google Scholar]
  58. Kovalchuk, I. , Pellino, M. , Rigault, P. , Van Velzen, R. , Ebersbach, J. , Ashnest, J. et al. (2020) The genomics of cannabis and its close relatives. Annual Review of Plant Biology, 71, 713–739. [DOI] [PubMed] [Google Scholar]
  59. Kryuchkova‐Mostacci, N. & Robinson‐Rechavi, M. (2017) A benchmark of gene expression tissue‐specificity metrics. Briefings in Bioinformatics, 18, 205–214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Langfelder, P. & Horvath, S. (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 9, 559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Lashbrooke, J. , Adato, A. , Lotan, O. , Alkan, N. , Tsimbalist, T. , Rechav, K. et al. (2015) The tomato MIXTA‐like transcription factor coordinates fruit epidermis conical cell development and cuticular lipid biosynthesis and assembly. Plant Physiology, 169, 2553–2571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Li, H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA‐MEM. arXiv [Preprint], 1303.3997.
  63. Li, H. & Durbin, R. (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25, 1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Li, H. & Durbin, R. (2010) Fast and accurate long‐read alignment with Burrows–Wheeler transform. Bioinformatics, 26, 589–595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Li, H. , Handsaker, B. , Wysoker, A. , Fennell, T. , Ruan, J. , Homer, N. et al. (2009) The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078–2079. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Li, M. , Fu, Y. , Liang, X. , Liu, X. , Huang, L. , Zhang, S. et al. (2025) SlHDZIV3 and SlHDZIV9 are two positive regulators in the formation of tomato trichomes. Plant, Cell & Environment, 48, 5789–5801. [DOI] [PubMed] [Google Scholar]
  67. Li, M. , Yao, T. , Lin, W. , Hinckley, W.E. , Galli, M. , Muchero, W. et al. (2023) Double DAP‐seq uncovered synergistic DNA binding of interacting bZIP transcription factors. Nature Communications, 14, 2600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Li, P. , Xia, E. , Fu, J. , Xu, Y. , Zhao, X. , Tong, W. et al. (2022) Diverse roles of MYB transcription factors in regulating secondary metabolite biosynthesis, shoot development, and stress responses in tea plants (Camellia sinensis). The Plant Journal, 110, 1144–1165. Available from: 10.1111/tpj.15729 [DOI] [PubMed] [Google Scholar]
  69. Li, Q. , Cao, C. , Zhang, C. , Zheng, S. , Wang, Z. , Wang, L. et al. (2015) The identification of Cucumis sativus Glabrous 1 (CsGL1) required for the formation of trichomes uncovers a novel function for the homeodomain‐leucine zipper I gene. Journal of Experimental Botany, 66, 2515–2526. [DOI] [PubMed] [Google Scholar]
  70. Li, X.‐C. , Zhu, J. , Yang, J. , Zhang, G.‐R. , Xing, W.‐F. , Zhang, S. et al. (2012) Glycerol‐3‐phosphate acyltransferase 6 (GPAT6) is important for tapetum development in Arabidopsis and plays multiple roles in plant fertility. Molecular Plant, 5, 131–142. [DOI] [PubMed] [Google Scholar]
  71. Lin, C.L. , García‐Caro, R.d.l.C. , Zhang, P. , Carlin, S. , Gottlieb, A. , Petersen, M.A. et al. (2021) Packing a punch: understanding how flavours are produced in lager fermentations. FEMS Yeast Research, 21, foab040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Liu, H. , Li, L. , Fu, X. , Li, Y. , Chen, T. , Qin, W. et al. (2023) AaMYB108 is the core factor integrating light and jasmonic acid signaling to regulate artemisinin biosynthesis in Artemisia annua. New Phytologist, 237, 2224–2237. [DOI] [PubMed] [Google Scholar]
  73. Liu, Z. , Li, Z. , Wu, S. , Yu, C. , Wang, X. , Wang, Y. et al. (2022) Coronatine enhances chilling tolerance of tomato plants by inducing chilling‐related epigenetic adaptations and transcriptional reprogramming. International Journal of Molecular Sciences, 23, 10049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Livingston, S.J. , Bae, E.J. , Unda, F. , Hahn, M.G. , Mansfield, S.D. , Page, J.E. et al. (2021) Cannabis glandular trichome cell walls undergo remodeling to store specialized metabolites. Plant and Cell Physiology, 62, 1944–1962. [DOI] [PubMed] [Google Scholar]
  75. Lowe, H. , Steele, B. , Bryant, J. , Toyang, N. & Ngwa, W. (2021) Non‐cannabinoid metabolites of Cannabis sativa L. with therapeutic potential. Plants, 10, 400. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Luo, X. , Reiter, M.A. , d'Espaux, L. , Wong, J. , Denby, C.M. , Lechner, A. et al. (2019) Complete biosynthesis of cannabinoids and their unnatural analogues in yeast. Nature, 567, 123–126. [DOI] [PubMed] [Google Scholar]
  77. Lv, J. , Deng, M. , Jiang, S. , Zhu, H. , Li, Z. , Wang, Z. et al. (2022) Mapping and functional characterization of the tomato spotted wilt virus resistance gene SlCHS3 in Solanum lycopersicum. Molecular Breeding, 42, 55. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Ma, G. , Zelman, A.K. , Apicella, P.V. & Berkowitz, G. (2022) Genome‐wide identification and expression analysis of homeodomain leucine zipper subfamily IV (HD‐ZIP IV) gene family in Cannabis sativa L. Plants, 11, 1307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Mandal, S. , Rezenom, Y.H. & McKnight, T.D. (2022) ASTF1, an AP2/ERF‐family transcription factor and ortholog of cultivated tomato LEAFLESS, is required for acylsugar biosynthesis. bioRxiv. 10.1101/2022.04.04.487036 [DOI]
  80. Marbach, D. , Costello, J.C. , Küffner, R. , Vega, N.M. , Prill, R.J. , Camacho, D.M. et al. (2012) Wisdom of crowds for robust gene network inference. Nature Methods, 9, 796–804. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. McDowell, E.T. , Kapteyn, J. , Schmidt, A. , Li, C. , Kang, J.‐H. , Descour, A. et al. (2011) Comparative functional genomic analysis of Solanum glandular trichome types. Plant Physiology, 155, 524–539. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Meng, Y. , Zhang, A. , Ma, Q. & Xing, L. (2022) Functional characterization of tomato ShROP7 in regulating resistance against Oidium neolycopersici. International Journal of Molecular Sciences, 23, 8557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Milo, R. , Shen‐Orr, S. , Itzkovitz, S. , Kashtan, N. , Chklovskii, D. & Alon, U. (2002) Network motifs: simple building blocks of complex networks. Science, 298, 824–827. [DOI] [PubMed] [Google Scholar]
  84. Mizuno, K. , Kato, M. , Irino, F. , Yoneyama, N. , Fujimura, T. & Ashihara, H. (2003) The first committed step reaction of caffeine biosynthesis: 7‐methylxanthosine synthase is closely homologous to caffeine synthases in coffee (Coffea arabica L.). FEBS Letters, 547(1–3), 56–60. Available from: 10.1016/s0014-5793(03)00670-7 [DOI] [PubMed] [Google Scholar]
  85. Morales, P. , González, M. , Salvatierra‐Martínez, R. , Araya, M. , Ostria‐Gallardo, E. & Stoll, A. (2022) New insights into Bacillus‐primed plant responses to a necrotrophic pathogen derived from the tomato‐Botrytis pathosystem. Microorganisms, 10, 1547. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Nakamura, M. , Katsumata, H. , Abe, M. , Yabe, N. , Komeda, Y. , Yamamoto, K.T. et al. (2006) Characterization of the class IV homeodomain‐leucine zipper gene family in Arabidopsis. Plant Physiology, 141, 1363–1375. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. O'Malley, R.C. , Huang, S.‐s.C. , Song, L. , Lewsey, M.G. , Bartlett, A. , Nery, J.R. et al. (2016) Cistrome and epicistrome features shape the regulatory DNA landscape. Cell, 165, 1280–1292. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Ó'Maoiléidigh, D.S. , Graciet, E. & Wellmer, F. (2014) Gene networks controlling Arabidopsis thaliana flower development. New Phytologist, 201, 16–30. [DOI] [PubMed] [Google Scholar]
  89. O'Neill, S.D. , Tong, Y. , Spörlein, B. , Forkmann, G. & Yoder, J.I. (1990) Molecular genetic analysis of chalcone synthase in Lycopersicon esculentum and an anthocyanin‐deficient mutant. Molecular and General Genetics MGG, 224, 279–288. [DOI] [PubMed] [Google Scholar]
  90. Ono, E. & Murata, J. (2023) Exploring the evolvability of plant specialized metabolism: uniqueness out of uniformity and uniqueness behind uniformity. Plant and Cell Physiology, 64, 1449–1465. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Padgitt‐Cobb, L.K. , Kingan, S.B. , Wells, J. , Elser, J. , Kronmiller, B. , Moore, D. et al. (2021) A draft phased assembly of the diploid Cascade hop (Humulus lupulus) genome. The Plant Genome, 14, e20072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Padgitt‐Cobb, L.K. , Pitra, N.J. , Matthews, P.D. , Henning, J.A. & Hendrix, D.A. (2023) An improved assembly of the “Cascade” hop (Humulus lupulus) genome uncovers signatures of molecular evolution and refines time of divergence estimates for the Cannabaceae family. Horticulture Research, 10, uhac281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Patro, R. , Duggal, G. , Love, M.I. , Irizarry, R.A. & Kingsford, C. (2017) Salmon provides fast and bias‐aware quantification of transcript expression. Nature Methods, 14, 417–419. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Pattanaik, S. , Patra, B. , Singh, S.K. & Yuan, L. (2014) An overview of the gene regulatory network controlling trichome development in the model plant, Arabidopsis. Frontiers in Plant Science, 5, 259. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Pattnaik, F. , Nanda, S. , Mohanty, S. , Dalai, A.K. , Kumar, V. , Ponnusamy, S.K. et al. (2022) Cannabis: chemistry, extraction and therapeutic applications. Chemosphere, 289, 133012. [DOI] [PubMed] [Google Scholar]
  96. Patzak, J. , Henychová, A. , Krofta, K. , Svoboda, P. & Malířová, I. (2021) The influence of hop latent viroid (HLVd) infection on gene expression and secondary metabolite contents in hop (Humulus lupulus L.) glandular trichomes. Plants, 10, 2297. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Patzak, J. , Krofta, K. , Henychová, A. & Nesvadba, V. (2015) Number and size of lupulin glands, glandular trichomes of hop (Humulus lupulus L.), play a key role in contents of bitter acids and polyphenols in hop cone. International Journal of Food Science & Technology, 50, 1864–1872. [Google Scholar]
  98. Paul, P. , Chaturvedi, P. , Selymesi, M. , Ghatak, A. , Mesihovic, A. , Scharf, K.‐D. et al. (2016) The membrane proteome of male gametophyte in Solanum lycopersicum. Journal of Proteomics, 131, 48–60. [DOI] [PubMed] [Google Scholar]
  99. Pertea, M. , Pertea, G.M. , Antonescu, C.M. , Chang, T.‐C. , Mendell, J.T. & Salzberg, S.L. (2015) StringTie enables improved reconstruction of a transcriptome from RNA‐seq reads. Nature Biotechnology, 33, 290–295. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Petit, J. , Bres, C. , Mauxion, J.‐P. , Tai, F.W.J. , Martin, L.B. , Fich, E.A. et al. (2016) The glycerol‐3‐phosphate acyltransferase GPAT6 from tomato plays a central role in fruit cutin biosynthesis. Plant Physiology, 171, 894–913. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Qin, W. , Xie, L. , Li, Y. , Liu, H. , Fu, X. , Chen, T. et al. (2021) An R2R3‐MYB transcription factor positively regulates the glandular secretory trichome initiation in Artemisia annua L. Frontiers in Plant Science, 12, 657156. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Quinlan, A.R. & Hall, I.M. (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26, 841–842. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Rinnone, F. , Micale, G. , Bonnici, V. , Bader, G.D. , Shasha, D. , Ferro, A. et al. (2015) NetMatchStar: an enhanced Cytoscape network querying app. F1000Research, 4, 479. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Saadat, N.P. , van Aalst, M. , Brand, A. , Ebenhöh, O. , Tissier, A. & Matuszyńska, A.B. (2023) Shifts in carbon partitioning by photosynthetic activity increase terpenoid synthesis in glandular trichomes. The Plant Journal, 115, 1716–1728. [DOI] [PubMed] [Google Scholar]
  105. Schilmiller, A.L. , Schauvinhold, I. , Larson, M. , Xu, R. , Charbonneau, A.L. , Schmidt, A. et al. (2009) Monoterpenes in the glandular trichomes of tomato are synthesized from a neryl diphosphate precursor rather than geranyl diphosphate. Proceedings of the National Academy of Sciences of the United States of America, 106, 10865–10870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Schrick, K. , Ahmad, B. & Nguyen, H.V. (2023) HD‐Zip IV transcription factors: drivers of epidermal cell fate integrate metabolic signals. Current Opinion in Plant Biology, 75, 102417. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Schuurink, R. & Tissier, A. (2020) Glandular trichomes: micro‐organs with model status? New Phytologist, 225, 2251–2266. [DOI] [PubMed] [Google Scholar]
  108. Shannon, P. , Markiel, A. , Ozier, O. , Baliga, N.S. , Wang, J.T. , Ramage, D. et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Research, 13, 2498–2504. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Shi, P. , Fu, X. , Shen, Q. , Liu, M. , Pan, Q. , Tang, Y. et al. (2018) The roles of AaMIXTA1 in regulating the initiation of glandular trichomes and cuticle biosynthesis in Artemisia annua. New Phytologist, 217, 261–276. [DOI] [PubMed] [Google Scholar]
  110. Shiu, S.‐H. , Shih, M.‐C. & Li, W.‐H. (2005) Transcription factor families have much higher expansion rates in plants than in animals. Plant Physiology, 139, 18–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Song, L. , Huang, S.‐S.C. , Wise, A. , Castanon, R. , Nery, J.R. , Chen, H. et al. (2016) A transcription factor hierarchy defines an environmental stress response network. Science, 354, aag1550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Spyropoulou, E.A. , Haring, M.A. & Schuurink, R.C. (2014) Expression of terpenoids 1, a glandular trichome‐specific transcription factor from tomato that activates the terpene synthase 5 promoter. Plant Molecular Biology, 84, 345–357. [DOI] [PubMed] [Google Scholar]
  113. Stout, J.M. , Boubakir, Z. , Ambrose, S.J. , Purves, R.W. & Page, J.E. (2012) The hexanoyl‐CoA precursor for cannabinoid biosynthesis is formed by an acyl‐activating enzyme in Cannabis sativa trichomes. The Plant Journal, 71, 353–365. [DOI] [PubMed] [Google Scholar]
  114. Sugimoto, K. , Zager, J.J. , Aubin, B.S. , Lange, B.M. & Howe, G.A. (2022) Flavonoid deficiency disrupts redox homeostasis and terpenoid biosynthesis in glandular trichomes of tomato. Plant Physiology, 188, 1450–1468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Sugiyama, R. , Oda, H. & Kurosaki, F. (2006) Two distinct phases of glandular trichome development in hop (Humulus lupulus L.). Plant Biotechnology, 23, 493–496. [Google Scholar]
  116. Sun, B. , Zhu, Z. , Chen, C. , Chen, G. , Cao, B. , Chen, C. et al. (2019) Jasmonate‐inducible R2R3‐MYB transcription factor regulates capsaicinoid biosynthesis and stamen development in Capsicum. Journal of Agricultural and Food Chemistry, 67, 10891–10903. [DOI] [PubMed] [Google Scholar]
  117. R Core Team. (2020) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. [Google Scholar]
  118. Thimm, O. , Bläsing, O. , Gibon, Y. , Nagel, A. , Meyer, S. , Krüger, P. et al. (2004) MAPMAN: a user‐driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. The Plant Journal, 37, 914–939. [DOI] [PubMed] [Google Scholar]
  119. Tohge, T. , de Souza, L.P. & Fernie, A.R. (2017) Current understanding of the pathways of flavonoid biosynthesis in model and crop plants. Journal of Experimental Botany, 68, 4013–4028. [DOI] [PubMed] [Google Scholar]
  120. Tohge, T. , Scossa, F. , Wendenburg, R. , Frasse, P. , Balbo, I. , Watanabe, M. et al. (2020) Exploiting natural variation in tomato to define pathway structure and metabolic regulation of fruit polyphenolics in the lycopersicum complex. Molecular Plant, 13, 1027–1046. [DOI] [PubMed] [Google Scholar]
  121. Törönen, P. & Holm, L. (2022) PANNZER—a practical tool for protein function prediction. Protein Science, 31, 118–128. [DOI] [PMC free article] [PubMed] [Google Scholar]
  122. United Nations. (1961) Single convention on narcotic drugs. New York, 30 March 1961. Available from: https://www.unodc.org/pdf/convention_1961_en.pdf
  123. Van den Broeck, L. , Gordon, M. , Inzé, D. , Williams, C. & Sozzani, R. (2020) Gene regulatory network inference: connecting plant biology and mathematical modeling. Frontiers in Genetics, 11, 457. [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. van Velzen, R. & Schranz, M.E. (2021) Origin and evolution of the cannabinoid oxidocyclase gene family. Genome Biology and Evolution, 13, evab130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Varala, K. , Marshall‐Colón, A. , Cirrone, J. , Brooks, M.D. , Pasquino, A.V. , Léran, S. et al. (2018) Temporal transcriptional logic of dynamic regulatory networks underlying nitrogen signaling and use in plants. Proceedings of the National Academy of Sciences of the United States of America, 115, 6494–6499. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Vincent, D. , Rochfort, S. & Spangenberg, G. (2019) Optimisation of protein extraction from medicinal cannabis mature buds for bottom‐up proteomics. Molecules, 24, 659. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Vogt, T. (2010) Phenylpropanoid biosynthesis. Molecular Plant, 3, 2–20. [DOI] [PubMed] [Google Scholar]
  128. Wagner, G. , Wang, E. & Shepherd, R. (2004) New approaches for studying and exploiting an old protuberance, the plant trichome. Annals of Botany, 93, 3–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Walley, J.W. , Sartor, R.C. , Shen, Z. , Schmitz, R.J. , Wu, K.J. , Urich, M.A. et al. (2016) Integration of omic networks in a developmental atlas of maize. Science, 353, 814–818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. Wang, C. , Dai, S. , Zhang, Z.L. , Lao, W. , Wang, R. , Meng, X. et al. (2021) Ethylene and salicylic acid synergistically accelerate leaf senescence in Arabidopsis. Journal of Integrative Plant Biology, 63, 828–833. [DOI] [PubMed] [Google Scholar]
  131. Wang, J. , Li, G. , Li, C. , Zhang, C. , Cui, L. , Ai, G. et al. (2021) NF‐Y plays essential roles in flavonoid biosynthesis by modulating histone modifications in tomato. New Phytologist, 229, 3237–3252. [DOI] [PubMed] [Google Scholar]
  132. Wang, J. , Yuan, S. , Zhao, Y. , Shu, X. , Liu, Z. , Wang, T. et al. (2025) Wo interacts with SlTCP25 to regulate type I trichome branching in tomato. Horticulture Research, 12, uhaf032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  133. Wang, Y. , Gao, R. , Gu, T. , Li, X. , Wang, M. , Wang, A. et al. (2025) Metabolomics and transcriptomics reveal the role of the terpene biosynthetic pathway in the mechanism of insect resistance in Solanum habrochaites. Journal of Agricultural and Food Chemistry, 73, 6253–6269. [DOI] [PubMed] [Google Scholar]
  134. Weirauch, M.T. , Yang, A. , Albu, M. , Cote, A.G. , Montenegro‐Montero, A. , Drewe, P. et al. (2014) Determination and inference of eukaryotic transcription factor sequence specificity. Cell, 158, 1431–1443. [DOI] [PMC free article] [PubMed] [Google Scholar]
  135. Wickham, H. (2016) Data analysis. ggplot2: elegant graphics for data analysis. New York: Springer‐Verlag, pp. 189–201. [Google Scholar]
  136. Wu, M. , Chang, J. , Han, X. , Shen, J. , Yang, L. , Hu, S. et al. (2023) A HD‐ZIP transcription factor specifies fates of multicellular trichomes via dosage‐dependent mechanisms in tomato. Developmental Cell, 58, 278–288. e275. [DOI] [PubMed] [Google Scholar]
  137. Xu, H. , Teng, H. , Zhang, B. , Liu, W. , Sui, Y. , Yan, X. et al. (2024) NtHD9 modulates plant salt tolerance by regulating the formation of glandular trichome heads in Nicotiana tabacum. Plant Physiology and Biochemistry, 212, 108765. [DOI] [PubMed] [Google Scholar]
  138. Xu, H. , Zhang, F. , Liu, B. , Huhman, D.V. , Sumner, L.W. , Dixon, R.A. et al. (2013) Characterization of the formation of branched short‐chain fatty acid: CoAs for bitter acid biosynthesis in hop glandular trichomes. Molecular Plant, 6, 1301–1317. [DOI] [PubMed] [Google Scholar]
  139. Xu, J. , van Herwijnen, Z.O. , Dräger, D.B. , Sui, C. , Haring, M.A. & Schuurink, R.C. (2018) SlMYC1 regulates type VI glandular trichome formation and terpene biosynthesis in tomato glandular cells. The Plant Cell, 30, 2988–3005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  140. Xu, Y. , Zhang, J. , Tang, Q. , Dai, Z. , Deng, C. , Chen, Y. et al. (2024) Integrated metabolomic and transcriptomic analysis revealed the regulation of yields, cannabinoid, and terpene biosynthesis in Cannabis sativa L. under different photoperiods. South African Journal of Botany, 174, 735–746. [Google Scholar]
  141. Yang, C. , Marillonnet, S. & Tissier, A. (2021) The scarecrow‐like transcription factor SlSCL3 regulates volatile terpene biosynthesis and glandular trichome size in tomato (Solanum lycopersicum). The Plant Journal, 107, 1102–1118. [DOI] [PubMed] [Google Scholar]
  142. Yerger, E.H. , Grazzini, R.A. , Hesk, D. , Cox‐Foster, D.L. , Craig, R. & Mumma, R.O. (1992) A rapid method for isolating glandular trichomes. Plant Physiology, 99, 1–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  143. Yin, W. , Mendoza, L. , Monzon‐Sandoval, J. , Urrutia, A.O. & Gutierrez, H. (2021) Emergence of co‐expression in gene regulatory networks. PLoS One, 16, e0247671. [DOI] [PMC free article] [PubMed] [Google Scholar]
  144. Yu, G. , Wang, L.‐G. , Han, Y. & He, Q.‐Y. (2012) clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology, 16, 284–287. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Yuan, M. , Sheng, Y. , Bao, J. , Wu, W. , Nie, G. , Wang, L. et al. (2025) AaMYC3 bridges the regulation of glandular trichome density and artemisinin biosynthesis in Artemisia annua. Plant Biotechnology Journal, 23, 315–332. [DOI] [PMC free article] [PubMed] [Google Scholar]
  146. Zander, M. , Lewsey, M.G. , Clark, N.M. , Yin, L. , Bartlett, A. , Saldierna Guzmán, J.P. et al. (2020) Integrated multi‐omics framework of the plant response to jasmonic acid. Nature Plants, 6, 290–302. [DOI] [PMC free article] [PubMed] [Google Scholar]
  147. Zeng, Z. , Zhang, W. , Marand, A.P. , Zhu, B. , Buell, C.R. & Jiang, J. (2019) Cold stress induces enhanced chromatin accessibility and bivalent histone modifications H3K4me3 and H3K27me3 of active genes in potato. Genome Biology, 20, 123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  148. Zhang, Y. , Li, Z. , Liu, J. , Zhang, Y.e. , Ye, L. , Peng, Y. et al. (2022) Transposable elements orchestrate subgenome‐convergent and ‐divergent transcription in common wheat. Nature Communications, 13, 6940. [DOI] [PMC free article] [PubMed] [Google Scholar]
  149. Zhang, Y. , Liu, T. , Meyer, C.A. , Eeckhoute, J. , Johnson, D.S. , Bernstein, B.E. et al. (2008) Model‐based analysis of ChIP‐seq (MACS). Genome Biology, 9, 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  150. Zhao, Q. , Li, M. , Zhang, M. & Tan, H. (2024) Glandular trichomes: the factory of artemisinin biosynthesis. Medicinal Plant Biology, 3, e019. [Google Scholar]
  151. Zheng, Y. , Jiao, C. , Sun, H. , Rosli, H.G. , Pombo, M.A. , Zhang, P. et al. (2016) iTAK: a program for genome‐wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Molecular Plant, 9, 1667–1670. [DOI] [PubMed] [Google Scholar]
  152. Zocca, P. , van Doore, E. , Roovers, A.J. , Glas, J.J. , Uittenbogaard, M. , Verlaan, M.G. et al. (2025) Glandless, a tomato HD‐ZIP transcription factor, is important for the gland formation of type VI trichomes. The Plant Journal, 123, e70308. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Table S1. Tau values assigned to cannabis genes.

Table S2. Tau values assigned to hop genes.

Table S3. Tau values assigned to tomato genes.

Table S4. Number of organ‐specific genes and TFs identified in cannabis, hop and tomato.

Table S5. Public RNA‐seq datasets used to validate organ‐specific transcriptome datasets generated in this study.

Table S6. Tau values calculated by comparing transcriptomes of cannabis trichome‐removed flower, stem, root, leaves and trichome.

Table S7. Cross‐validation of tomato GT‐specific genes (this study) using an independent trichome dataset from Balcke et al. (2017).

Table S8. List of genes differentially expressed between young and mature tomato leaves.

Table S9. Tau values calculated by comparing transcriptomes of tomato flower, stem, root, young leaves and trichomes.

Table S10. The total count of HOGs and number of genes assigned to those HOGs.

Table S11. The number of shared and species‐specific genes across organ types.

Table S12. The conservation level of organ‐specific genes across species.

Table S13. TF family enrichment in GT‐specific transcriptomes of cannabis, hop and tomato.

Table S14. Pairwise comparison of TF family enrichment patterns between species.

Table S15. TF families predicted in the hop genome.

Table S16. TF families predicted in the tomato genome.

Table S17. TF families predicted in the cannabis genome.

Table S18. List of co‐expressed genes identified in cannabis using WGCNA.

Table S19. List of co‐expressed genes identified in hop using WGCNA.

Table S20. List of co‐expressed genes identified in tomato using WGCNA.

Table S21. GT‐specific gene regulatory network for cannabis.

Table S22. GT‐specific gene regulatory network for hop.

Table S23. GT‐specific gene regulatory network for tomato.

Table S24. NMS of genes in the cannabis GT GRN.

Table S25. NMS of genes in the hop GT GRN.

Table S26. NMS of genes in the tomato GT GRN.

Table S27. Motifs enriched in the promoter of the cannabis GT GRN TF target genes.

Table S28. Motifs enriched in the promoter of the hop GT GRN TF target genes.

Table S29. Motifs enriched in the promoter of the tomato GT GRN TF target genes.

Table S30. Gene regulatory subnetwork of cannabis GT‐specific TFs and their targets.

Table S31. Gene regulatory subnetwork of hop GT‐specific TFs and their targets.

Table S32. Gene regulatory subnetwork of tomato GT‐specific TFs and their targets.

Table S33. Cannabis, hop and tomato shared GT‐specific gene regulatory subnetwork.

Table S34. Cannabis and hop shared GT‐specific gene regulatory subnetwork.

Table S35. BWA mapping statistics for DAP‐seq data.

Table S36. Number of DAP‐seq peaks called using MACS2.

Table S37. Peaks associated with nearby DAP‐seq CsHD‐Z879 target genes.

Table S38. Peaks associated with nearby DAP‐seq CsSRS116 target genes.

Table S39. Peaks associated with nearby DAP‐seq SlWO target genes.

Table S40. Distribution of CsHD‐Z879, CsSRS116 and SlWO binding sites across genomic features.

Table S41. Validation of CsHD‐ZIP879 DAP‐seq binding sites using published ChIP‐seq data for histone marks.

Table S42. Validation of CsSRS116 DAP‐seq binding sites using published ChIP‐seq data for histone marks.

Table S43. H3K4me3 ChIP‐seq results for tomato using MACS2.

Table S44. Validation of SlWO DAP‐seq binding sites using published ChIP‐seq data for histone marks (H3K4me3).

Table S45. Cannabis flower‐specific GRN.

Table S46. Cannabis stem‐specific GRN.

Table S47. Cannabis leaf‐specific GRN.

Table S48. Cannabis root‐specific GRN.

Table S49. Cannabis RNA‐seq data used in this study.

Table S50. Hop RNA‐seq data used in this study.

Table S51. Tomato RNA‐seq data used in this study.

Table S52. GT‐specific TFs and targets used to build the cannabis GT‐specific GRN.

Table S53. GT‐specific TFs and targets used to build the hop GT‐specific GRN.

Table S54. GT‐specific TFs and targets used to build the tomato GT‐specific GRN.

Table S55. DAP‐seq data used in this study.

Table S56. Primers used for amplifying candidate TFs CDS for DAP‐seq and qRT‐PCR.

TPJ-125-0-s001.xlsx (60.9MB, xlsx)

Figure S1. Visualisation of isolated trichomes.

Figure S2. PCA plots of RNA‐seq data showing grouping of samples across species.

Figure S3. eFP images showing expression profiles of candidate genes across tissues.

Figure S4. Spearman correlation plots: cannabis flower and trichome RNA‐seq datasets.

Figure S5. Spearman correlation plots: cannabis leaf, stem and root RNA‐seq datasets.

Figure S6. Spearman correlation plots: tomato flower, trichome and stem RNA‐seq datasets.

Figure S7. Spearman correlation plots: tomato root and leaf RNA‐seq datasets.

Figure S8. Spearman correlation plots: hop leaf and trichome RNA‐seq datasets.

Figure S9. Venn diagram showing the overlap between cannabis GT‐specific gene sets.

Figure S10. Organ‐specific expression patterns of tomato GT‐enriched genes.

Figure S11. Quantitative real‐time PCR (qRT‐PCR) validation of candidate hop TFs.

Figure S12. Dot plot of the top 10 enriched GO terms for cannabis organ‐specific genes.

Figure S13. Dot plot of the top 10 enriched GO terms for hop organ‐specific genes.

Figure S14. Dot plot of the top 10 enriched GO terms for tomato organ‐specific genes.

Figure S15. Top enriched GO terms for genes differentially expressed in mature tomato leaves compared to young tomato leaves.

Figure S16. The overlap between two sets of GT genes identified using the transcriptome of young and mature tomato leaves.

Figure S17. Heatmap of the expression pattern of photosynthesis‐related genes in tomato.

Figure S18. The conservation level of organ‐specific genes based on HOG analysis.

Figure S19. Motif enrichment analysis of TF target genes.

Figure S20. Species‐specific GT gene regulatory subnetworks.

Figure S21. Shared GT‐specific gene regulatory subnetworks.

Figure S22. DAP‐seq validation using ChIP‐seq data for histone marks.

Figure S23. Hisat2 mapping statistics for RNA‐seq data used to determine organ specificity of genes.

TPJ-125-0-s002.pdf (7.9MB, pdf)

Data Availability Statement

The raw RNA‐seq and DAP‐seq data generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) under the following BioProject accessions: PRJNA884161, PRJNA884162, PRJNA1262533, PRJNA1267741, PRJNA1369742, PRJNA1369628, PRJNA988919 and PRJNA988905. Detailed accession numbers for specific sequences are provided in Tables S49–S51 and S55.

Processed gene‐level count and TPM (transcripts per million) data for all genes across cannabis, hop and tomato are available through Zenodo: https://doi.org/10.5281/zenodo.17667485.

Interactive visualisations of gene expression data are accessible through tissue‐specific eFP browsers for cannabis, hop and tomato at: https://expression.latrobe.edu.au/.

DAP‐seq genome browser tracks are available through JBrowse:


Articles from The Plant Journal are provided here courtesy of Wiley
