2026 Jan 2;12(1):140–151. doi: 10.1038/s41477-025-02161-z

A novel cis-element enabled bacterial uptake by plant cells

Chloé Cathebras 1,#, Xiaoyun Gong 1,#, Rosa Elena Andrade 1, Ksenia Vondenhoff 1, Jean Keller 2, Pierre-Marc Delaux 2, Makoto Hayashi 3, Maximilian Griesmann 1, Martin Parniske 1
PMCID: PMC12830364  PMID: 41482520

Abstract

The root nodule symbiosis of plants with nitrogen-fixing bacteria is phylogenetically restricted to a single clade of flowering plants, which calls for as yet unidentified trait acquisitions and genetic changes in the last common ancestor. Here we discovered—within the promoter of the transcription factor gene Nodule Inception (NIN)—a cis-regulatory element (PACE), exclusively present in members of this clade. PACE was essential for restoring infection threads in nin mutants of the legume Lotus japonicus. PACE sequence variants from root nodule symbiosis-competent species appeared functionally equivalent. Evolutionary loss or mutation of PACE is associated with loss of this symbiosis. During the early stages of nodule development, PACE dictates gene expression in a spatially restricted domain containing cortical cells carrying infection threads. Consistent with its expression domain, PACE-driven NIN expression restored the formation of cortical infection threads, also when engineered into the NIN promoter of tomato. Our data pinpoint PACE as a key evolutionary invention that connected NIN to a pre-existing symbiosis signal transduction cascade that governs the intracellular accommodation of arbuscular mycorrhiza fungi and is conserved throughout land plants. This connection enabled bacterial uptake into plant cells via intracellular support structures such as infection threads, a unique and unifying feature of this symbiosis.

Subject terms: Rhizobial symbiosis, Phylogenetics


A key step in the evolution of the nitrogen-fixing root nodule symbiosis, occurring 100 million years ago, subjected the control of Nodule Inception (NIN) gene expression to a protein complex that regulated transcription much earlier in the arbuscular mycorrhiza symbiosis.

Main

Nitrogen is essential for plant growth and development1. A wide phylogenetic variety of land plants ranging from mosses and gymnosperms to angiosperms have evolved symbioses with nitrogen-fixing bacteria that convert atmospheric nitrogen into ammonium2. For example, the fern Azolla maintains colonies of nitrogen-fixing cyanobacteria in specialized apoplastic cavities, outside the plant cell wall enclosure3. A major biological breakthrough was the evolution of the nitrogen-fixing root nodule symbiosis (RNS) characterized by the intracellular accommodation of bacteria in lateral organs (‘nodules’) formed on roots4–6. The occurrence of the RNS is restricted to a monophyletic clade, encompassing four angiosperm orders: the Fabales, Fagales, Rosales and Cucurbitales (FaFaCuRo)7. Because of this phylogenetic restriction and scattered occurrence of RNS within the FaFaCuRo, Soltis and colleagues7 postulated that the last common ancestor of the FaFaCuRo clade acquired a genetic change, a ‘predisposition’, which enabled members of this clade to subsequently evolve RNS multiple times independently7. The intracellular accommodation of bacteria and root nodule development are two genetically separable and, to this extent, independent features of RNS8,9. It is therefore genetically possible that they evolved sequentially and not at the same time. The phylogenetic diversity of bacterial symbionts plus the variation of nodule anatomy and development across the RNS-competent FaFaCuRo species10,11 together with the gap of 30 million years between the last common ancestor and the oldest fossil root nodules in this clade12 further fuelled the hypothesis that nodule organogenesis evolved several times independently and was not a feature of the last common ancestor13,14. The recent discovery of multiple losses of RNS within the FaFaCuRo clade15,16 has initiated a discussion about whether this genetic change in the common ancestor was perhaps sufficient for the formation of RNS17. Nonetheless, the precise nature of this key event in the evolution of nodulation has remained a mystery for more than two decades13.

We asked which evolutionary acquisitions by the last common ancestor, in the form of novel traits and the underlying genetic causes, enabled the evolution of the RNS. From a phylogenetic perspective, such acquisitions should be: (1) exclusively present in the FaFaCuRo clade and absent outside of this clade and (2) conserved throughout the FaFaCuRo clade or at least maintained in RNS-competent (hereafter called ‘nodulating’) species. The uptake of bacteria into living plant cells is, with one exception (Gunnera), phylogenetically restricted to the FaFaCuRo clade4. The uptake of bacteria requires the localized lysis of the plant cell wall, which threatens cell integrity because of the turgor pressure imposed by the protoplast6. A systematic comparison of features associated with the RNS across the entire FaFaCuRo clade pinpoints a single unique and shared trait—the uptake of bacteria into living plant cells with intracellular physical support structures—that fulfils both above-mentioned criteria to be acquired by the common ancestor6. These structures come in a diversity of shapes (infection threads (ITs) and infection pegs) and in at least two different cell types (epidermal and cortical) but are all characterized by the apposition of matrix material, which is thought to maintain cell integrity during the localized lysis of the plant cell wall. Although this matrix material is a common feature of all analysed successful bacteria uptake events in FaFaCuRo species, only one type, cortical ITs, can be found in almost all nodulating species6. Cortical IT formation is an evolutionary breakthrough because it allowed clonal selection of bacteria18, specific control of nutrient exchange and increased nitrogen fixation efficiency19. By contrast, in Gunnera, cell integrity is maintained by physical closure of a multicellular cavity by extracellular matrix material20. 
This difference, together with the phylogenetic distance of Gunnera from the FaFaCuRo clade, suggests an independent origin of bacterial uptake in this genus6. To search for gene gains specific for the FaFaCuRo clade, a genome-wide comparative phylogenomic analysis was performed; however, not a single gene following the aforementioned evolutionary pattern was identified15.
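The two phylogenetic criteria stated above can be written as a simple presence/absence filter. The following is a minimal sketch only: the `Taxon` record, the function name and the toy table are hypothetical illustrations and do not reproduce the authors' analysis or data. It implements the weaker form of criterion (2), that is, maintenance in nodulating species rather than conservation throughout the clade.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxon:
    name: str
    in_clade: bool     # member of the FaFaCuRo clade?
    nodulating: bool   # RNS-competent?
    has_feature: bool  # candidate trait or sequence element present?


def fits_predisposition_pattern(taxa):
    """Check criteria (1) and (2) for one candidate feature."""
    # (1) exclusive: the feature is never present outside the clade
    exclusive = all(not t.has_feature for t in taxa if not t.in_clade)
    # (2) maintained: the feature is present in every nodulating clade member
    maintained = all(t.has_feature for t in taxa if t.in_clade and t.nodulating)
    return exclusive and maintained


# Hypothetical toy table (not real species data)
toy = [
    Taxon("clade_nodulator_A", True, True, True),
    Taxon("clade_nodulator_B", True, True, True),
    Taxon("clade_non_nodulator", True, False, False),
    Taxon("outgroup", False, False, False),
]
print(fits_predisposition_pattern(toy))
```

A feature that also occurs in an outgroup taxon fails criterion (1), and a feature missing from any nodulating clade member fails criterion (2).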

Here, we tested the hypothesis that the ‘predisposition’ event involved gain of novel cis-regulatory elements. Changes in gene regulation can be important drivers of functional and morphological evolution21,22. Emergence or loss of even a single cis-regulatory element can lead to dramatic phenotypic consequences, for example, novel organ formation21,22. Phylogeny has dated the common ancestor of the FaFaCuRo clade to approximately 104 million years ago (Ma)23,24. A long-standing hypothesis states that the evolution of RNS involved co-opting genes from the arbuscular mycorrhiza (AM) symbiosis4,5, which can be traced back to the earliest land plant fossils 410 Ma (refs. 25,26). This hypothesis is underpinned by similarities in intracellular accommodation structures6 and the common requirement of both symbioses for a set of so-called common symbiosis genes5 that are conserved across land plant species able to form AM and encode symbiotic signal transduction and intracellular restructuring machineries27–31.

Results

Discovery of PACE

The transcription factor-encoding Nodule Inception (NIN) gene32,33 is positioned at the top of an RNS-specific transcriptional regulatory cascade and is indispensable for RNS32,34,35. The promoter of NIN is a potential physical target for such a co-option event, because it defines the molecular interface between common symbiotic signal transduction and the specific transcriptional networks underlying RNS development35. We therefore compared the NIN-promoter sequences of 37 angiosperm species including 27 FaFaCuRo members and identified only one motif fulfilling the aforementioned criteria, which we called Predisposition Associated cis-regulatory Element (PACE) (Fig. 1, Extended Data Fig. 1a–d and Supplementary Table 1). The phylogenetic distribution of PACE was further investigated in an expanded search comprising 163 plant species in the promoter of NIN and the entire NIN-like protein (NLP) gene family, including NLP1 from which NIN diverged at the base of the eudicots35 (Extended Data Fig. 1e, Supplementary Fig. 1 and Supplementary Table 2). PACE was found in all nodulating FaFaCuRo members and four non-nodulating species that have lost RNS but maintained NIN (Extended Data Fig. 1e and Supplementary Table 3). Importantly, PACE was absent from all the NLP promoters analysed (Supplementary Fig. 1). Thus, within the NIN-like gene family, the phylogenetic distribution of PACE is NIN- and FaFaCuRo-clade specific and is consistent with a model in which PACE was acquired by the NIN promoter of the last common FaFaCuRo ancestor. Intriguingly, the 29-nucleotide-long PACE encompassed and extended beyond the previously identified binding site of the transcription factor Cyclops, which is encoded by a common symbiosis gene required for the development of both AM and RNS27,34 (Fig. 1 and Extended Data Fig. 1).

Fig. 1. Acquisition of PACE was a key step in the evolution of RNS.


Left: schematic illustration of the phylogenetic relationships between species inside (light-red shade) and outside (light-grey shade) the FaFaCuRo clade and the presence (+) and absence (−) pattern of RNS, NIN and PACE (see Extended Data Fig. 1 and Supplementary Fig. 1 for additional data support). Centre: PACE sequence alignment of the displayed species, in which grey shadings indicate more than 50% sequence identity. On top of the alignment, the PACE consensus sequence is depicted as a position weight matrix calculated from the displayed RNS-competent species. Right: graphical illustration of how PACE connected NIN to symbiotic transcriptional regulation by CCaMK–Cyclops, enabling IT development in the root cortex. This acquisition coincided with the predisposition event. X and Y represent hypothetical proteins binding to sequences flanking the Cyclops binding site.

Extended Data Fig. 1. Discovery of PACE by MEME analyses.


(A–D) Consensus sequence of the Position Weight Matrix identified by MEME analyses using the regions upstream of the translational start site (ATG), thus representing the promoters and 5′ UTRs of NIN and NLP genes from 37 angiosperm species (see Methods): (A) in a discriminative search for a motif that is present in 3 kb upstream regions of the NIN genes from nodulating FaFaCuRo species, but absent in 3 kb upstream regions of the NIN genes from species outside of the FaFaCuRo clade and absent in the 3 kb upstream regions of NLP genes; (B) in an independent, non-discriminative search in the 3 kb upstream regions of the NIN genes from nodulating species revealing that the most conserved nucleotides span a larger region than (A); (C) the resulting most conserved 29 nucleotides derived from upstream regions of NIN genes of nodulating FaFaCuRo species; (D) the resulting most conserved 29 nucleotides (PACE) derived from the upstream region of one representative NIN gene per species of nodulating FaFaCuRo species. The Cyclops binding site (CYC-box34) is highlighted in grey. (E) Motif analyses by FIMO using the upstream regions of the NIN and NLP genes from an expanded list of 163 species (see Methods). Left: pruned tree from the whole NLP tree (demarcated in the black rectangle in Supplementary Fig. 1) corresponding to the NIN orthologs. Right: three versions of the consensus sequence (left to right: (A), (B) and (D)) were used to retrieve motifs from the 5 kb upstream region of NIN genes via FIMO search, the output of which is displayed underneath the consensus sequences. Note that these identified motifs can originate from the + or – strand. The sequences displayed are those with the lowest q-value identified by FIMO and this resulted in some cases in overlapping sequences that originated from opposite strands (see Supplementary Table 2). Note that certain nodulating species have multiple copies of the NIN gene in their genome, for example Phaseolus vulgaris and Glycine max. We identified PACE in at least one of the NIN promoters in each of these species. Sequence names of nodulating species are coloured in blue. Blank lines represent the absence of significant motifs.
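The note that motifs can originate from the + or – strand can be illustrated with a strand-aware position weight matrix scan. This is a minimal sketch, not FIMO's algorithm: it ranks windows by a plain log-odds score instead of a q-value, assumes a uniform background and an arbitrary pseudocount, and uses a made-up four-column toy matrix rather than the PACE matrix. All names are hypothetical.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def log_odds(pwm, background=0.25, pseudo=0.01):
    """Turn per-column base probabilities into log2-odds scores.

    The uniform background and the pseudocount are assumptions of this sketch.
    """
    return [
        {b: math.log2((col.get(b, 0.0) + pseudo) / (1 + 4 * pseudo) / background)
         for b in "ACGT"}
        for col in pwm
    ]


def best_hit(seq, pwm):
    """Return (score, start, strand) of the best window on either strand.

    `seq` must be upper-case A/C/G/T. `start` is reported in
    forward-strand coordinates (0-based).
    """
    scores = log_odds(pwm)
    width = len(scores)
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            score = sum(scores[j][s[i + j]] for j in range(width))
            start = i if strand == "+" else len(seq) - width - i
            if best is None or score > best[0]:
                best = (score, start, strand)
    return best


# Hypothetical toy motif "ACGG" (not the PACE matrix)
toy_pwm = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}, {"G": 1.0}]
print(best_hit("TTTACGGTTT", toy_pwm))
print(best_hit("TTTCCGTTTT", toy_pwm))
```

In the second call the motif is found only on the reverse strand, because `CCGT` is the reverse complement of `ACGG`.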

Given this clade-specific distribution of PACE, we searched for conserved motifs in the promoter sequences of two genes encoding transcriptional regulators, ERF Required for Nodulation 1 (ERN1)36 (Supplementary Fig. 2) and Reduced Arbuscular Mycorrhiza 1 (RAM1)37 (Supplementary Fig. 3) that are also known Cyclops targets. We identified motifs within the promoters of both, ERN1 and RAM1, encompassing the previously identified Cyclops binding sites36,37. In sharp contrast to PACE, their presence extended beyond the FaFaCuRo clade (Supplementary Figs. 2 and 3).

We tested the functional relevance of these distinct phylogenetic distribution patterns in transcriptional activation assays in Nicotiana benthamiana leaf cells. Transactivation by Cyclops was restricted to NIN promoters from FaFaCuRo species but extended to non-FaFaCuRo species for RAM1 promoters (Extended Data Fig. 2 and Supplementary Fig. 4). Importantly, PACE was necessary and sufficient for the activation of the NIN promoter by Cyclops (Extended Data Fig. 3). Together with the exclusive occurrence of PACE in the NIN promoter of the FaFaCuRo clade, these results are in line with the hypothesis that the mechanistic link between Cyclops and the NIN promoter was established in the last common ancestor of this clade (Fig. 1).

Extended Data Fig. 2. Transcriptional activation of NIN promoter:Firefly luciferase reporter gene by CCaMK1–314/Cyclops is restricted to NIN promoters from species of the FaFaCuRo clade.


Nicotiana benthamiana leaf cells were transformed with T-DNAs carrying a Firefly luciferase reporter gene driven by either of the indicated promoters in tandem with the AtACT2pro:Renilla luciferase reporter fusion that provides a quantitative internal standard. (A) List of species within the FaFaCuRo clade (light red shade) and outside (light grey shade) and abbreviations. (B) Reporter gene activation by L. japonicus CCaMK1–314/Cyclops via NIN promoters (NINpro) originating from listed species. (C) Comparison of the transactivation potential of Cyclops versions from L. japonicus and S. lycopersicum. Note that the expression of the Firefly luciferase reporter gene driven by LjNINpro or by the RAM1 promoters from L. japonicus and S. lycopersicum (LjRAM1pro and SlRAM1pro, respectively) was induced in the presence of CCaMK1–314/Cyclops regardless of the origin of Cyclops. In contrast, the transactivation failed with the SlNIN promoter (panel (A)). Boxplots display the ratio of the Firefly/Renilla luciferase signals. Each dot represents one N. benthamiana leaf disc. Thick black lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. The applied statistical method was ANOVA with post hoc Tukey: (B), F(14,214) = 71.07, p < 2 × 10^−16; (C), plots from left to right: F(5,18) = 20.58, p = 7.14 × 10^−7; F(5,18) = 25.38, p = 1.45 × 10^−7 and F(5,18) = 40.49, p = 3.55 × 10^−9, respectively. Different small letters indicate significant differences. Data displayed are from one experiment. Each combination of constructs was tested two times independently with similar outcomes.
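The normalisation and boxplot conventions described in this legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the example readings are made up, and the quartile convention (Python's "inclusive" method) is an assumption because the legend does not specify one.

```python
import statistics


def normalised_ratios(firefly, renilla):
    """Firefly signal divided by the Renilla internal standard, per leaf disc."""
    return [f / r for f, r in zip(firefly, renilla)]


def box_stats(values):
    """Median, quartiles, whiskers and outliers using the 1.5 x IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # whiskers: lowest and highest data point within 1.5 x IQR
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # points beyond the fences (drawn as white circles in the figure)
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Made-up readings for five leaf discs (arbitrary units)
ratios = normalised_ratios([10.0, 20.0, 30.0, 40.0, 1000.0],
                           [10.0, 10.0, 10.0, 10.0, 10.0])
print(box_stats(ratios))
```

Dividing by the Renilla signal corrects each disc for differences in transformation efficiency before the promoters are compared.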

Extended Data Fig. 3. PACE sequence variants from species across the FaFaCuRo clade were able to functionally replace L. japonicus PACE in a LjNINpro:GUS reporter fusion.


N. benthamiana leaf cells were transformed with T-DNAs carrying a GUS reporter gene driven by either of the indicated promoters: (A) the L. japonicus NIN promoter (NINpro), the LjNIN promoter with PACE mutated or deleted (NINpro::mPACE and NINpro::∆PACE, respectively), or PACE sequence variants from the nodulating FaFaCuRo species fused to the LjNIN minimal promoter (NINminpro); (B) chimeric promoters where LjPACE in the LjNIN promoter was replaced with either one of the PACE variants from species tested in (A) or from non-nodulating FaFaCuRo species including the Juglans regia PACE-like motif (JrPACE-like); (C) the S. lycopersicum NIN promoter (SlNINpro), the SlNIN promoter with LjPACE (SlNINpro::PACE) or mPACE (SlNINpro::mPACE) inserted. For species abbreviations see Extended Data Fig. 2a. Note in (A) that the deletion or mutation of PACE in the LjNIN promoter resulted in a drastic reduction in reporter gene expression and in (C) that insertion of LjPACE but not mPACE into the S. lycopersicum promoter conferred transactivation by CCaMK1−314/Cyclops. GUS activities are displayed as individual dots in box plots. Each dot represents one N. benthamiana leaf disc. Thick black lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. The applied statistical method was ANOVA with post hoc Tukey: (A) F(20,144) = 51.38, p < 2 × 10^−16; (B) F(18,166) = 149.1, p < 2 × 10^−16; (C) F(7,62) = 30.5, p = 7.02 × 10^−7. Different small letters indicate significant difference. n.d., not determined. Data displayed are from one experiment. Each combination of constructs was tested two times independently with similar outcomes.

PACE drives the expression of NIN during IT development in the cortex

NIN is indispensable for IT development32,33 and its precise spatiotemporal expression is essential for this process33,38–40. Because cis-regulatory elements are master determinants of gene expression patterns41, we investigated the effect of PACE on the expression of NIN in physical relation to the bacterial uptake and accommodation stages during nodule development. We used the model legume Lotus japonicus in combination with its compatible nitrogen-fixing bacterium Mesorhizobium loti as experimental system. The process by which L. japonicus promotes the intracellular colonization by and accommodation of M. loti can be subdivided into successive stages: (1) entrapment of bacteria in a pocket formed by a curled root hair42, (2) uptake of bacteria into a developing IT within that root hair42, (3) IT progression into and through the outer cortical cell layers43, (4) IT branching and extension within the nodule primordium44 and (5) release of bacteria from ITs into plant membrane-enclosed organelle-like structures called symbiosomes44 leading to (6) mature nodules characterized by infected cells densely packed with symbiosomes and the pink colour of leghemoglobin45.

To determine the PACE-mediated spatiotemporal expression domain, we introduced a GUS reporter gene driven by PACE fused to a region comprising the NIN minimal promoter and the 5′ untranslated region (UTR)34 (PACE:NINminpro:GUS) into L. japonicus wild-type roots. The roots were subsequently inoculated with M. loti MAFF 303099 expressing DsRed (M. loti DsRed) facilitating detection of the bacteria through their fluorescence signal in root hairs and nodules. The NIN minimal promoter did not mediate reporter gene expression at any stage of bacterial infection (Extended Data Fig. 4e). Intriguingly, the earliest detectable GUS activity mediated by PACE:NINminpro:GUS was clearly restricted to a zone in the nodule primordia (Extended Data Fig. 4d(I,II)) that roughly correlated with the site of bacterial infection (indicated by a local accumulation of DsRed signal) and later expanded to the entire central tissue of the nodule (Extended Data Fig. 4d(III)). PACE-driven reporter expression was detected neither in root hairs harbouring ITs (Extended Data Fig. 4g) nor in nodules in which cells from the central tissue were filled with symbiosomes (Extended Data Fig. 4d(IV)). Importantly, PACE-mediated expression was distinct from that mediated by the LjNIN 3 kb promoter (NINpro) or the NINpro with PACE mutated or deleted (NINpro::mPACE and NINpro::∆PACE, respectively) that conferred reporter expression across the central tissue of the nodule (Extended Data Fig. 4a–c(II–IV)). We concluded on the basis of these observations that the PACE-mediated expression domain is temporally and spatially restricted and possibly accompanies the development of bacterial accommodation structures in the nodule.

Extended Data Fig. 4. Spatio-temporal GUS expression driven by PACE and the NIN promoter in L. japonicus roots during the bacterial infection process.


L. japonicus wild-type hairy roots were transformed with T-DNAs carrying a Ubq10pro:NLS-GFP transformation marker together with a GUS reporter gene driven by either of the indicated promoters: (A) the 3 kb LjNIN promoter (NINpro); the LjNIN promoter with PACE (B) mutated (LjNINpro::mPACE) or (C) deleted (NINpro::ΔPACE); (D) PACE fused to the LjNIN minimal promoter (PACE:NINminpro) or (E) the LjNIN minimal promoter (NINminpro). The progression of bacterial infection was determined by the DsRed signal 10–14 days post inoculation (dpi) with M. loti DsRed. Nodules undergoing different stages of infection (panels I to IV) were stained with X-Gluc to reveal the GUS expression pattern. Note the overlapping bacterial invasion zone and PACE:NINminpro:GUS expression in early infection stages (red and blue arrowheads in (D)) as well as the differences between PACE:NINminpro:GUS and the much broader NINpro:GUS expression at that stage (red and blue arrows in (A)). Red arrow and arrowheads: M. loti DsRed. Blue arrow and arrowheads: GUS activity in root hairs bearing ITs and nodule primordia, respectively. The NINminpro:GUS fusion only rarely gave a detectable signal, and if so, in the vasculature (yellow arrowhead in (E)). Only pictures taken under white light illumination (WLI) are displayed for nodules in panel IV to reveal the pink colour of leghemoglobin, characteristic for mature and fully infected nodules. Note that PACE:NINminpro:GUS expression was absent at this stage, whereas the NINpro:GUS resulted in strong blue staining in the nodule regardless of the presence of PACE (compare panel IV in (D) and (A–C)). (F) Quantification of transgenic root systems exhibiting GUS expression in different cell types and tissues exemplarily displayed in (A–E). (G) PACE drove GUS reporter gene expression in the central tissue of primordia and nodules, but was not sufficient for expression in root hairs. Transgenic roots carrying the same promoter:GUS fusions as in (A), (D) and (E) were inoculated with M. loti lacZ and dual-stained with X-Gluc and Magenta-Gal. Purple: M. loti lacZ. Blue: GUS activity. Note the co-existence of blue and purple staining in root hairs on roots transformed with NINpro:GUS, but not on those transformed with PACE:NINminpro:GUS. Data displayed in (F) are combined from three independent experiments. Bars, 250 μm.

To further resolve this relationship between PACE-driven gene expression and bacterial accommodation at the cellular level, we compared—simultaneously in the same tissue—the progression of bacterial infection with the expression pattern mediated by PACE fused to the NIN minimal promoter (PACE:NINminpro) and by a NIN promoter with mutated PACE (NINpro::mPACE). A red and a yellow fluorescent protein (mCherry and YFP, respectively) targeted to the nucleus by fusion to a nuclear localization signal (NLS) were used as reporters. The resulting promoter:reporter fusions (PACE:NINminpro:NLS-mCherry and NINpro::mPACE:NLS-YFP) were placed in tandem on the same transfer-DNA (T-DNA) allowing a nucleus-by-nucleus comparison of their relative expression. This T-DNA construct was introduced into L. japonicus wild-type roots that were subsequently inoculated with M. loti R7A expressing the cyan fluorescent protein (CFP; Fig. 2) or with M. loti MAFF 303099 expressing the green fluorescent protein (GFP; Supplementary Fig. 5) to facilitate detection.

Fig. 2. PACE drives the expression of NIN during IT development in the cortex.


a,b, Sections of representative L. japonicus nodule primordia formed upon inoculation with M. loti R7A expressing CFP (blue) imaged by confocal laser-scanning microscopy; a comparison of the expression domains determined by PACE (PACE:NINminpro:NLS-mCherry; red) and a NIN promoter carrying a mutated PACE (NINpro::mPACE:NLS-YFP; green) (a) or PACE (red) and the intact NIN promoter (NINpro:NLS-YFP; green) (b). The dashed lines demarcate a group of cortical cells in the PACE core territory. The arrowheads indicate ITs. Numbers correspond to nodule primordia showing the presented expression pattern/total number of nodule primordia sectioned and inspected. The data are from four independent experiments (see Supplementary Fig. 5 for the first stages of bacterial invasion (stages 2 to 3)). Scale bars, 20 µm. c, Graphical interpretation of the expression patterns presented in a and b. Yellow, overlapping region.

During the first stages of bacterial invasion (stages 2 to 3), PACE-mediated mCherry was expressed specifically in cortical cells carrying ITs and in directly adjacent cells (Supplementary Fig. 5a). By contrast, the NINpro::mPACE-driven YFP signal was not detected in those cells (Supplementary Fig. 5b). In sections of developing nodules, in which infection had progressed to stage 3 or 4, PACE-mediated mCherry was expressed specifically in a zone (hereafter called the ‘IT zone’) comprising cortical cells and primordium cells that carried ITs and some, but not all, directly adjacent cells46 (25 out of 29 nodules inspected; Fig. 2a). Intriguingly, the expression domains marked by mCherry and YFP fluorescence were distinct from each other: whereas the PACE-driven mCherry signal consistently marked the IT zone, the NINpro::mPACE-driven YFP signal was observed in primordium cells surrounding this zone (16 out of 18 nodules inspected; Fig. 2a,c). The thin (approximately 1–2-cell-thick) border between the two domains was characterized by nuclei emitting both YFP and mCherry signals (Fig. 2a). In cells marked in this way, ITs were typically not detected. The expression pattern mediated by the NIN promoter (containing PACE) was congruent with the sum of both promoter fragments (8 out of 8 nodules inspected; Fig. 2b,c).

On the basis of these clearly distinct and complementary reporter expression domains governed by PACE versus the remaining promoter, we concluded that (1) PACE directs NIN expression to a specific IT zone and that (2) the NIN promoter comprises cis-regulatory elements that drive expression outside the PACE territory, that is, in root hairs (together with PACE), non-infected cortical and primordium cells and nodule cells filled with symbiosomes. These additional cis-regulatory elements might be addressed by other transcription factors that have been reported to bind to this promoter47–49. These transcription factors might be counteracted by, for example, repression in the IT zone.

Mutational dissection of PACE reveals a quantitative effect of sequences flanking the CYC-box on IT development

To test the relevance and specific role of PACE in nodule and IT development, we performed complementation experiments using plants homozygous for the nin-2 or nin-15 mutant alleles32. The nin-2 mutant allele harbours a frameshift mutation of the NIN gene, leading to a NIN loss-of-function phenotype, which is absence of both IT formation and nodule organogenesis32, whereas the nin-15 mutant allele carries a Lotus Retrotransposon 1 insertion within the NIN promoter 143 bp 3′ of PACE (Extended Data Fig. 5). We examined the restoration of bacterial infection 21 days post inoculation (dpi) with M. loti DsRed by quantifying the number of root hairs harbouring ITs and the number of infected nodules (Fig. 3 and Supplementary Table 4).

Extended Data Fig. 5. L. japonicus nin-15 mutant phenotype.


(A) A representative picture of L. japonicus wild-type (WT, left) and nin-15 (right) plants 21 dpi with M. loti DsRed. (B) Position of the Lotus Retrotransposon 1 (LORE1) insertion within the NIN promoter in the nin-15 mutant. (C) Representative pictures of nin-15 root hairs and nodule sections 21 dpi with M. loti DsRed. Forty-nine plants with a total number of 436 nodules were analysed: only four plants bore one or two IT(s) within root hairs and seven plants bore one or two infected nodule(s). Deformed or curled root hairs in the presence of M. loti DsRed were abundant but infection threads were rarely found. Arrowheads: uninfected nodules. Unlabelled bars, 100 μm. (D–E) Phenotype of nin-15 in the presence of a symbiosis-independent nitrogen source (15 mM KNO3) for 28 days. (D) Pictures documenting the healthy status of L. japonicus WT and nin-15 plants (compare (D) and (A)) and (E) quantitative assessment of parameters displayed in boxplots. Thick black lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. Each dot represents one plant. n: number of plants analysed. Lateral root density: number of lateral roots/primary root length (cm). The applied statistical method was two-tailed Welch’s t-test. (F) Segregation analysis of nin-15 assessed by quantifying the number of infected nodules. Each dot in the boxplots represents one plant. n: number of plants analysed. The applied statistical method was ANOVA with post hoc Tukey: F(3,120) = 84.1, p = 2 × 10^−16. Different small letters indicate significant difference. (G) Representative pictures of nin-15 plants with hairy roots transformed with the NIN gene driven by the L. japonicus NIN minimal promoter (NINminpro) or the 3 kb NIN promoter (NINpro) 24 dpi with M. loti DsRed. Data are from a single experiment. WLI: white light illumination.
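The lateral root density defined in this legend and the Welch's t statistic used to compare genotypes can be sketched as below. This is a minimal illustration with made-up numbers and hypothetical function names. It returns only the t statistic and the Welch-Satterthwaite degrees of freedom; obtaining a two-tailed p-value would additionally require a t-distribution function, which this sketch leaves out.

```python
import math
import statistics


def lateral_root_density(n_lateral_roots, primary_root_length_cm):
    """Number of lateral roots per cm of primary root."""
    return n_lateral_roots / primary_root_length_cm


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # per-group variance of the mean (sample variance / group size)
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


# Made-up densities for two groups of three plants each
group_1 = [lateral_root_density(n, 10.0) for n in (10, 20, 30)]
group_2 = [lateral_root_density(n, 10.0) for n in (20, 30, 40)]
print(welch_t(group_1, group_2))
```

Unlike Student's t-test, the Welch variant does not assume equal variances in the two groups, which is why the degrees of freedom are estimated from the data.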

Fig. 3. PACE is necessary for bacterial infection and functionally conserved across the FaFaCuRo clade.


a–d, Microscopy images of representative nodule sections or root hairs harbouring an IT or an infection pocket from nin-2 (a) or nin-15 roots (b–d) transformed with the LjNIN gene driven by the indicated promoters (a,b) and the L. japonicus NIN promoter in which LjPACE was replaced by PACE from nodulating (c) and non-nodulating (d) FaFaCuRo species, or with a PACE-like sequence identified in the JrNLP1b promoter. The percentage values of transgenic root systems carrying infected nodules or root hair ITs are indicated. At least five nodules from independent transgenic root systems were sectioned per construct. The percentage of root hair ITs among the total infection events per root pieces and the number of infected nodules per transgenic root system are displayed in Extended Data Figs. 6–9. Scale bars, 100 µm. BF, brightfield; Avg., average number of infected nodules on plants carrying infected nodules; n.a., not applicable.

Nodule development in the legume Medicago truncatula is dependent on NIN expression mediated by a regulatory region containing several putative cytokinin responsive elements (CE)40. In L. japonicus, a similar CE region is positioned 45 kb upstream of the NIN transcriptional start site40. To enable transgenic complementation experiments, we synthetically fused a 1 kb or 5 kb region encompassing this distant CE to the 5′ end of a 3-kb NIN promoter. The NIN gene driven by these promoters (CE1kb:NINpro:NIN and CE5kb:NINpro:NIN) restored the formation of root hair ITs on 78% and 95% and infected nodules on 40% and 88% of nin-2 transgenic root systems, respectively (Fig. 3a, Extended Data Figs. 6–8 and Supplementary Figs. 6 and 7). Importantly, this complementation success relied on the presence of PACE. nin-2 roots transformed with the same fusion design but carrying a mutation of PACE (CE1kb:NINpro::mPACE:NIN and CE5kb:NINpro::mPACE:NIN) did not restore root hair ITs; however, nodule formation was not impaired when using the cytokinin element-containing region of 5 kb (CE5kb:NINpro::mPACE:NIN). We concluded that PACE is indispensable for bacterial infection but not for nodule development.

Extended Data Fig. 6. The CYC-box and flanking sequences of PACE are required for the complete restoration of the bacterial infection process in the L. japonicus nin-2 mutant.


nin-2 roots were transformed with T-DNAs carrying a Ubq10pro:NLS-GFP transformation marker in tandem with the LjNIN gene driven by either of the following promoter versions: the cytokinin element-containing region of 1 kb (CE1kb) fused to the 3 kb or 9 kb LjNIN promoter (CE1kb:NINpro or CE1kb:NIN9kbpro, respectively); CE1kb:NINpro or CE1kb:NIN9kbpro with PACE mutated (CE1kb:NINpro::mPACE or CE1kb:NIN9kbpro::mPACE, respectively); CE1kb:NINpro carrying a mutated Cyclops binding site (CYC-box) (CE1kb:NINpro::mbox); CE1kb:NINpro carrying mutated sequences flanking the CYC-box in PACE (CE1kb:NINpro::mflanking); CE1kb fused to the LjNIN minimal promoter (CE1kb:NINminpro); CE1kb fused to PACE and to NINminpro (CE1kb:PACE:NINminpro); NINpro, PACE:NINminpro or NINminpro. (A) Representative overview pictures of transgenic root systems. Roots were analysed 21 dpi with M. loti DsRed. White asterisks and arrowheads: infected and non-infected nodules, respectively. Bars, 2 mm. (B – C) Boxplots displaying the number of root hair ITs or infected nodules and the percentage of root hair ITs among total infection events (sum of bacterial entrapments and ITs). Each dot represents one transgenic nin-2 root system or root piece. L. japonicus WT roots transformed with NINpro:NIN or CE1kb:NINpro:NIN were included as controls. Note the loss of restoration of nodules and IT formation associated with the mutation of PACE or only the CYC-box in PACE; and the reduction of same when sequences flanking the CYC-box in PACE were mutated. n: number of transgenic root systems or root pieces analysed. Thick black lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. Numbers above the boxplots: the value of individual data points outside of the plotting area. Data are from a single experiment. 
n.d.: not determined. WLI: white light illumination.

Extended Data Fig. 8. The CYC-box and flanking sequences of PACE are required for the complete restoration of the bacterial infection process but are dispensable for the nodule organogenesis process in the L. japonicus nin-2 mutant.


Pictures of nodule sections or roots from L. japonicus nin-2 roots 21 dpi with M. loti DsRed from the same experiments depicted in Extended Data Fig. 6 (A) and Extended Data Fig. 7 (B). Nodule sections from L. japonicus WT roots transformed with NINpro:NIN and CE5kb:NINpro:NIN were included for comparison. Note that when the cytokinin element-containing region of 1 kb was fused to NINpro, nodule organogenesis was abolished by mutation of PACE or only the CYC-box in PACE, whereas these mutations did not abolish organogenesis when the cytokinin element-containing region of 5 kb was fused to NINpro. At least five nodules from independent transgenic root systems were sectioned per construct. Bars, 100 μm.

The 29-bp-long PACE sequence revealed by MEME encompasses and extends beyond the previously identified Cyclops binding site (CYC-box34, ‘box’; Extended Data Fig. 1). Its degree of conservation may be interpreted as a trace of an ancestral PACE version present in the last common ancestor of the FaFaCuRo clade. Within PACE, the CYC-box is surrounded by less conserved flanking sequences. To dissect the specific contributions of the CYC-box and PACE sequences flanking the CYC-box (‘flanking’) to PACE function, we mutated the box and the flanking sequences independently (CE:NINpro::mbox:NIN and CE:NINpro::mflanking:NIN, respectively). Mutation of the CYC-box abolished root hair ITs. Interestingly, mutation of the flanking sequences led to a 50% reduction of the number of transgenic root systems carrying infected nodules, whereas the formation of root hair ITs was not impaired (Extended Data Figs. 6–8 and Supplementary Figs. 6 and 7). This mutational dissection revealed two separable functions of PACE: whereas the PACE–Cyclops connection is essential for IT development, the flanking sequences significantly promote bacterial infection during nodule development and possibly act as binding sites for additional, yet undefined, transcription factors (conceptually labelled X and Y in Fig. 1). Our data suggest that PACE comprises synergistic binding sites for both Cyclops and cooperating transcription factors. We conclude that the high level of conservation of the CYC-box is a consequence of the indispensable nature of this cis-element for the progression of the IT through the cortex. The higher level of diversification of sequences flanking the CYC-box might be a consequence of changes in transcription factor occupancy over evolutionary time scales. Considering this scenario, it is possible that such flanking sequence-occupying transcription factors are not conserved throughout the entire FaFaCuRo clade.

PACE-mediated NIN expression defined an infection zone in the nodule cortex (Fig. 2). To genetically separate the initiation of nodule development from IT formation and thereby enable a focused analysis of the role of PACE in cortical IT formation, we utilized the nin-15 mutant, which is impaired in IT formation but retains the capacity to form nodules. Most of these nodules were uninfected (92% and 86% of plants carried no root hair ITs and no infected nodules, respectively), and cortical cells filled with symbiosomes were never observed (Extended Data Fig. 5). This mutant therefore provided an ideal background to study the role of PACE in cortical IT formation, circumventing the negative epistatic effect of the inability of nin loss-of-function mutants to initiate cell divisions32,38–40,50 (Figs. 3b–d and 4, Extended Data Figs. 9 and 10 and Supplementary Table 4).

Fig. 4. PACE enables IT formation in the cortex.


a,b, Representative pictures of nin-15 root hairs, root and nodule sections (see Extended Data Fig. 10 for overview pictures) transformed with the L. japonicus NIN gene driven by NINminpro or PACE:NINminpro (a) or SlNINpro and SlNINpro with LjPACE or mPACE inserted (b). The percentage values of transgenic root systems carrying root hair ITs or infected nodules are indicated. Ratios indicate the number of nodules showing the presented pattern/total number of nodules sectioned and inspected. The data in a are from two independent experiments (see Extended Data Fig. 10 for second replicate), and the data in b are from a single experiment. c, Box plots displaying the percentage of root hair ITs and infected nodules per transgenic root system. The thick white lines represent the median; the box represents the IQR; whiskers represent the lowest and highest data point within 1.5× IQR; black-filled circles represent the data points inside 1.5× IQR; white-filled circles represent the data points outside 1.5× IQR of the upper quartile. The statistical method applied was the two-tailed Fisher’s exact test. The data in c are from a and b. Unlabelled scale bars, 100 µm. n, number of transgenic root systems or root pieces analysed. BF, brightfield.

Extended Data Fig. 9. PACEs from FaFaCuRo species are functionally equivalent in restoring bacterial infection in the L. japonicus nin-15 mutant.


L. japonicus nin-15 roots were transformed with T-DNAs carrying a Ubq10pro:NLS-GFP transformation marker in tandem with the LjNIN gene driven by either of the following promoters: (A) the 3 kb LjNIN promoter (NINpro), the LjNIN minimal promoter (NINminpro), the 3 kb LjNIN promoter with PACE deleted (NINpro::∆PACE) or mutated (NINpro::mPACE); (B) the 3 kb LjNIN promoter with LjPACE replaced with either of the PACE sequence variants from nodulating or non-nodulating FaFaCuRo species and analysed 21 dpi with M. loti DsRed. (A – B) Representative overview pictures of nin-15 transgenic roots systems. Sections of representative nodules are displayed in Fig. 3. Note the drastic reduction of restoration of infection in nodules and root hairs associated with the mutation or deletion of PACE as well as the replacement of PACE with JrPACE-like in the context of the LjNIN promoter. White asterisks and arrowheads: infected and non-infected nodules, respectively. (C – E) Boxplots displaying (C) the percentage of root hair ITs among total infection events (sum of bacterial entrapments and ITs) and (D – E) the number of infected nodules from two independent experiments. Each dot represents one nin-15 transgenic root piece (C) or root system (D – E). (C) displays merged data from experiments in (D – E) as the percentage represents a normalised value calculated for each root piece (see Supplementary Table 4). Thick black lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. n: number of transgenic root systems or root pieces analysed. For species abbreviations see Extended Data Fig. 2a. The applied statistical method was ANOVA with post hoc Tukey: (C) F9,313 = 106.7, p < 2×10−16; (D) F6,346 = 82.89, p < 2×10−16; (E) F4,135 = 20.18, p = 4.76×10−13. 
Different small letters indicate significant differences. Data are from a single experiment. Bars, 2 mm. WLI: white light illumination.

Extended Data Fig. 10. PACE alone or in the context of the S. lycopersicum NIN promoter (a species outside of the FaFaCuRo clade) enables IT formation in the cortex.


(A – D) Representative pictures of sections of nodules formed on L. japonicus nin-15 roots transformed with T-DNAs carrying a Ubq10pro:NLS-GFP transformation marker together with the LjNIN gene driven by either of the following promoters: (A – B) the L. japonicus NIN minimal promoter (NINminpro) or PACE fused to NINminpro (PACE:NINminpro); (C – D) the 3 kb S. lycopersicum NIN promoter (SlNINpro), the 3 kb SlNIN promoter with mutated PACE (SlNINpro::mPACE) or with L. japonicus PACE inserted (SlNINpro::PACE), 21 dpi with M. loti DsRed (from the same experiments depicted in Fig. 4). Black rectangles in (A) demarcate the enlarged area displayed in Figs. 4a and 4b to focus on the initial infection structures. Note the absence of cells filled with symbiosomes in nodules transformed with the LjNIN gene driven by PACE:NINminpro or NINminpro. By contrast, infected cells were often filled with symbiosomes in the SlNINpro::PACE:NIN-transformed nodules, like those resulted by NINpro:NIN (see (C) and compare the two sections in (D)). (E – F) Boxplots displaying the percentage of root hair ITs among total infection events (sum of bacterial entrapments and ITs) or the percentage of infected nodules among total number of nodules (E) 21 dpi and (F) 35 dpi with M. loti DsRed, respectively. Each dot represents one nin-15 transgenic root system or root piece. (E) displays results from an independent repetition from the experiment depicted in Fig. 4. n: number of transgenic root systems or root pieces analysed. Thick white (E) and black (F) lines, median; box, interquartile range; whiskers, lowest and highest data point within 1.5 interquartile range (IQR); black filled circles, data points inside 1.5 IQR; white filled circles, data points outside 1.5 IQR of the upper/lower quartile. Data displayed are from a single experiment. Numbers above the boxplots: the value of individual data points outside of the plotting area. The applied statistical method was two-tailed Fisher’s exact test. 
Bars, (A and C) 100 μm; (B and D) 50 μm.

Transformation with the L. japonicus NIN gene driven by the NIN minimal promoter (NINminpro:NIN) did not alter the symbiotic phenotype of nin-15 roots (Fig. 3b). By contrast, the NIN gene driven by the NIN promoter (NINpro:NIN) led to restoration of the complete infection process in nin-15 roots from root hair ITs to symbiosome formation (100% and 92% of transgenic root systems carried root hair ITs and infected nodules, respectively; Fig. 3b). Similar to observations in complementation experiments of nin-2, mutation or deletion of PACE (NINpro::mPACE:NIN and NINpro::∆PACE:NIN, respectively) drastically reduced the restoration of bacterial infection in root hairs and nodules in nin-15 (Fig. 3b, Extended Data Figs. 6–9 and Supplementary Figs. 6 and 7).

PACEs from different nodulating FaFaCuRo species are functionally equivalent

PACE was detected by MEME searches as a conserved motif within NIN promoters of the FaFaCuRo clade. However, the individual PACE sequences from different species differed from each other, mostly in the sequences flanking the CYC-box (Fig. 1 and Extended Data Fig. 1). We therefore tested whether and to what extent this sequence variation of PACE would affect its function. The replacement of PACE within the L. japonicus (Fabales) 3-kb NIN promoter with PACE sequence variants (NINpro::Species abbreviation PACE:NIN) originating from Casuarina glauca (Fagales), Datisca glomerata (Cucurbitales) or Dryas drummondii (Rosales) restored the complete infection process in nin-15 to a similar level as NINpro:NIN, demonstrating the functional conservation of PACE from nodulating species across the entire FaFaCuRo clade (Fig. 3c and Extended Data Fig. 9). Similarly, the PACE versions from two non-nodulating Rosales that maintained the NIN gene, Ziziphus jujuba and Prunus persica, restored the complete infection process in nin-15 (Fig. 3d and Extended Data Fig. 9). The results of these complementation experiments were consistent with the conserved expression pattern mediated by PACEs in L. japonicus (Supplementary Fig. 8) and the CCaMK–Cyclops-mediated transactivation via these PACE variants (Extended Data Fig. 3a) or chimeric promoter:reporter fusions (Extended Data Fig. 3b) tested in N. benthamiana leaves.

Loss of PACE is associated with a loss of the nitrogen-fixing RNS

Griesmann et al.15 and van Velzen et al.16 discovered that RNS was lost multiple times independently during evolution via independent truncations or losses of the NIN gene. However, at least 10 out of 28 FaFaCuRo species that lost RNS have maintained a full-length NIN open reading frame (Supplementary Table 3). On the basis of our complementation data, PACE is indispensable for the NIN promoter function in symbiosis (Fig. 3, Extended Data Figs. 6–9 and Supplementary Figs. 6 and 7). Therefore, the absence of PACE from five out of these ten species (Supplementary Tables 1–3) is potentially sufficient to explain these losses of RNS. Consequently, at least 82% of all losses can now be attributed to either loss or truncation of the NIN ORF (18 out of 28, 64%) or loss of PACE (5 out of 28, 18%) (Supplementary Table 3). The presence of PACE in all nodulating species (31 out of 31, 100%; Supplementary Tables 1 and 2), together with the correlation between the absence of PACE and the absence of RNS, adds strong support for the evolutionary relevance of PACE in both the gain and the potential loss of RNS.
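The percentages quoted in this paragraph follow directly from the species counts given in the text. A minimal Python check of that arithmetic is sketched below; the variable and function names are ours and purely illustrative.

```python
# Counts as stated in the text (Supplementary Table 3).
total_losses = 28      # FaFaCuRo species that lost RNS
nin_orf_losses = 18    # losses attributed to the NIN ORF
pace_losses = 5        # species with a full-length NIN ORF but no PACE


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Species that kept a full-length NIN ORF despite losing RNS.
intact_orf_species = total_losses - nin_orf_losses

nin_share = percent(nin_orf_losses, total_losses)
pace_share = percent(pace_losses, total_losses)
explained_share = percent(nin_orf_losses + pace_losses, total_losses)

print(intact_orf_species, nin_share, pace_share, explained_share)
```

The remaining 5 of the 28 losses are not accounted for by either criterion, which is why the text speaks of "at least" 82%.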

PACE was not detected in the promoters of NLP genes (Supplementary Fig. 1 and Supplementary Tables 1 and 2) with the possible exception of the curious case of Juglans regia (Fagales). Although it was also absent from the promoter of the so-annotated NIN gene, a PACE-like motif was identified in the promoter of the closest gene family member, NLP1 JrNLP1b (JrPACE-like; Supplementary Table 1). This PACE-like element was not able to restore IT formation in nin-15 (Fig. 3d and Extended Data Fig. 9). Regardless of whether this exceptional presence/absence pattern of PACE is caused by a misannotation of NIN and NLP1 in J. regia, either a loss-of-function mutation within PACE or a loss of the entire PACE element in the JrNIN promoter could explain the absence of the RNS observed in this species.

PACE is sufficient to restore cortical IT formation in nin-15

We tested whether PACE on its own, only supported by the minimal NIN promoter (PACE:NINminpro), is sufficient to restore IT development in cortical cells. For this purpose, we transformed nin-15 roots with PACE:NINminpro fused to the transcribed region of the NIN gene. PACE-mediated NIN expression led to an increased success in restoration of infection (49% of transgenic root systems carried infected nodules) compared with NINminpro:NIN-transformed roots (17%; Fig. 4a, Extended Data Fig. 10 and Supplementary Table 4). Root hair ITs were rarely observed on PACE:NINminpro:NIN-transformed nin-15 roots (Fig. 4a), and nodules harbouring cells filled with symbiosomes were not found (Extended Data Fig. 10), consistent with the restricted expression domain defined by PACE (Extended Data Fig. 4).

Strikingly, the vast majority of infected nodules transformed with PACE:NINminpro:NIN (25 out of 28 nodules inspected) carried ITs in the outer cortex, originating from a focused hyperaccumulation of bacteria, locally constricted by root cell wall boundaries (Fig. 4a). This phenomenon was not observed in most of the rarely occurring infected nodules formed on NINminpro:NIN-transformed nin-15 roots (11 out of the 16 nodules inspected did not carry ITs in the outer cortex). Bacterial colonies within cell wall boundaries resembling this phenomenon have been described in a variety of legumes including Sesbania and Mimosa51,52. Our data imply that PACE promotes this type of cortical IT initiation. Altogether, these findings revealed that PACE promotes IT development in cortex cells but not within root hairs.
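For readers unfamiliar with the two-tailed Fisher's exact test named in the figure legends, the sketch below shows how such a test works on a 2×2 table, using the nodule counts from this paragraph (25 of 28 versus 5 of 16 nodules with outer-cortex ITs) as example input. This is an illustration only and not the authors' analysis: their test was applied to the root-system-level data in Fig. 4c, and nodules from the same root system are not necessarily independent observations. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    p = sum(prob(k) for k in range(k_min, k_max + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Rows: PACE:NINminpro:NIN vs NINminpro:NIN nodules (counts from the text).
# Columns: nodules with vs without ITs in the outer cortex.
p_example = fisher_exact_two_tailed(25, 3, 5, 11)
print(p_example)
```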

PACE insertion into the tomato NIN promoter confers RNS capability

To artificially recapitulate the functional consequence of PACE acquisition into a non-FaFaCuRo NIN promoter, we chose tomato (Solanum lycopersicum) which belongs to the Solanaceae, a family phylogenetically distant from the FaFaCuRo clade. Consistent with the absence of PACE, a GUS reporter gene driven by the tomato NIN promoter (S. lycopersicum NIN promoter (SlNINpro)) was not transactivated by Cyclops in N. benthamiana leaf cells (Fig. 1 and Extended Data Figs. 2b and 3c), whereas the insertion of the L. japonicus PACE (SlNINpro::PACE), but not of a mutated PACE (SlNINpro::mPACE), conferred transactivation by Cyclops (Extended Data Fig. 3c).

We tested the ability of the LjNIN expressed under the control of these synthetic promoters to restore the bacterial infection process in nin-15. Similar to NINminpro:NIN-transformed nin-15 roots, SlNINpro:NIN did not restore bacterial infection (0% and 7% of transgenic root systems carried root hair ITs and infected nodules, respectively) (Fig. 4a–c). By contrast, nin-15 roots transformed with SlNINpro::PACE:NIN restored the formation of root hair ITs and infected nodules on 36% and 26% of transgenic root systems, respectively (Fig. 4b and Extended Data Fig. 10). This increase in infection success was not observed on SlNINpro::mPACE:NIN-transformed roots. ITs in the outer cortex that originated from a focal accumulation of bacteria were also observed in the SlNINpro::PACE:NIN-transformed nin-15 nodules (8 out of 14 nodules inspected; Fig. 4b) resembling those in the PACE:NINminpro:NIN-transformed nin-15 nodules (Fig. 4a). The gained ability of the SlNINpro::PACE promoter to restore root hair ITs suggested that additional cis-regulatory elements within the SlNIN promoter function together with PACE for root hair IT formation. Altogether, these findings obtained with the tomato NIN promoter carrying an artificially inserted PACE agree with the hypothesis that the acquisition of PACE by a non-FaFaCuRo NIN promoter enabled its regulation via Cyclops and laid the foundation for IT formation in cortical cells.

Discussion

The mechanistic connection between PACE and cortical IT formation together with their congruent phylogenetic distribution strongly support the idea that the acquisition of PACE by the last common ancestor of the FaFaCuRo clade enabled cortical ITs and thus laid the foundation for the evolution of present day RNS. Our findings support an evolutionary model in which an ancestral symbiotic transcription factor complex (comprising CCaMK and Cyclops), which facilitated intracellular symbiosis with AM fungi already in the earliest land plants53,54, gained control over the transcriptional regulation of the NIN gene by the acquisition of PACE (Fig. 1). This genetic innovation in the last common ancestor of the FaFaCuRo clade extended the function of the ancestral CCaMK complex to initiate cortical IT development. The NLP family underwent important evolutionary steps preceding the origin of RNS including a gene duplication leading to NIN and NLP1 as closest paralogues55. It is very likely that the NIN protein itself underwent changes that enabled its role in nodulation35. Loss-of-NIN events associated with the loss of nodulation are scattered across all four FaFaCuRo orders15,16, suggesting that NIN acquired its relevance for nodulation before or, at the latest, in the last common ancestor. As our phylogenomic analysis dates the acquisition of PACE to the last common ancestor of the FaFaCuRo clade, we conclude that the critical changes within NIN must have occurred simultaneously or earlier. From a statistical point of view, it is likely that the PACE acquisition and the RNS-enabling changes within NIN occurred independently from each other. It will be interesting to determine what these critical changes within NIN are and where they occurred phylogenetically.

A ‘young’ primary cell wall characteristic for recently divided cells is considered an important prerequisite for cortical IT initiation6,56, but cell division is not restricted to the formation of novel organs9. It is therefore conceptually possible that the common ancestor of the FaFaCuRo clade was forming ITs in recently divided cortical cells but in the absence of root nodules. Multiple lines of evidence indicate that the diverse types of lateral organ harbouring nitrogen-fixing bacteria (‘nodules’) evolved multiple times independently. Indeed, CE-mediated NIN expression is important for nodule organogenesis in legumes, but upon searching for this regulatory element in a region of 0.1 Mb upstream and downstream of the NIN gene, Liu et al.40 found its presence to be restricted to legume species, indicating an evolutionary emergence independently of and considerably later than the last common ancestor40,55. ITs in root hairs are only found in Fabales and Fagales and therefore are also considered a more recent acquisition6,57. CE only in combination with PACE facilitates root hair ITs (Extended Data Figs. 6–8 and Supplementary Figs. 6 and 7) and additional elements in the 3 kb promoter are necessary for nodule and cortical IT development (Extended Data Figs. 6–8 and Supplementary Figs. 6 and 7). Furthermore, deletion of PACE by targeted genome editing was recently reported to reduce but not completely abolish the formation of root hair ITs, suggesting the presence of partly redundant PACE elements in the vicinity of the LjNIN gene58. Altogether, these observations highlight the complexity and concerted activity of cis-elements and transcription factors underlying the spatiotemporal expression control by present day NIN promoters in RNS-competent species. Our data pinpoint the acquisition of PACE as a key event during the evolution of the nitrogen-fixing RNS. 
Together with our discovery that multiple independent losses of PACE are associated with multiple losses of RNS within the FaFaCuRo clade, our data underpin the essential position of PACE in the evolutionary gain and loss of RNS.

Methods

Bioinformatic analyses

On the basis of the phylogenetic classification of the RWP-RK gene family15,50, 144 NIN/NLP genes were selected from 37 plant species and 13 orders ranging from monocotyledons to dicotyledons including the FaFaCuRo clade (Supplementary Table 1). For each selected gene, 3 kb of sequence upstream of the translational start site including the promoter and 5′ UTR region was defined and extracted from the corresponding species’ genomic sequence, if the contig length allowed it. For Medicago truncatula, a 3,352-bp sequence upstream of the translational start site was extracted. If contig length was limiting, the longest possible sequence stretch was extracted. For the identification of a cis-regulatory element specific for NIN promoters of the FaFaCuRo clade, the tool MEME59 was used in discriminative mode (‘search given strand only’ option, default parameters) with NIN-promoter regions of only nodulating plants. The control group consisted of promoter regions of all NIN genes outside of the FaFaCuRo clade and all NLP genes listed in Supplementary Table 1. The highest-scoring motif (E = 1.6 × 10−58) was 27-bp long and contained the much shorter previously described CYC-box34 (Extended Data Fig. 1a).
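The extraction rule described above (a fixed window upstream of the translational start site, shortened to the longest available stretch when the contig ends first) can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' script: the function name, the 0-based coordinate convention and the strand handling are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(contig: str, atg_pos: int, strand: str,
                    length: int = 3000) -> str:
    """Return up to `length` bases upstream of the translational start.

    contig  -- forward-strand contig sequence
    atg_pos -- 0-based forward-strand index of the base corresponding to
               the A of the ATG start codon
    strand  -- '+' or '-' (strand on which the gene is encoded)

    If the contig ends before `length` bases are available, the longest
    possible stretch is returned, read 5'->3' on the gene's strand.
    """
    if strand == "+":
        return contig[max(0, atg_pos - length):atg_pos]
    if strand == "-":
        # Upstream of a minus-strand gene lies to the right on the forward
        # strand; slicing past the contig end truncates automatically.
        return reverse_complement(contig[atg_pos + 1:atg_pos + 1 + length])
    raise ValueError("strand must be '+' or '-'")


# Toy contigs (hypothetical) to show truncation at the contig boundary.
print(upstream_region("AAACCCATGGGG", 6, "+", length=4))
print(upstream_region("AAACCCATGGGG", 6, "+", length=100))
print(upstream_region("TTTCATGGGAAA", 5, "-", length=3))
```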

To refine the conserved region in this motif, MEME analysis was performed again in normal mode (‘search given strand only’ option, default parameters) with NIN promoters from only nodulating species. This analysis revealed that the most conserved nucleotides are found within 29 nucleotides (nucleotides 10 to 38 in Extended Data Fig. 1b). The previous MEME analysis was repeated, but an exactly 29-nucleotide-long motif was searched for (resulting in a motif in Extended Data Fig. 1c), and the best-scoring NIN paralogue per searched species, that is, the one with the lowest P value per species, was identified (Supplementary Table 1). In a final step, one best-scoring NIN-promoter region per nodulating species was analysed with MEME (‘search given strand only’ option, default parameters) by searching for an exactly 29-nucleotide-long motif. The resulting motif was named PACE (Extended Data Fig. 1d). This final MEME run was done for two reasons: first, to avoid a sequence bias towards a single species with multiple NIN paralogue promoter regions (for example, soybean) and, second, to avoid a potential sequence bias generated by the promoter region of a NIN paralogue that might be no longer functional and therefore has mutated sites in its promoter region owing to relaxed selection pressure.
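The selection of one best-scoring NIN paralogue per species (the hit with the lowest P value) amounts to a simple group-wise minimum. The sketch below illustrates this step in Python; the species labels, gene identifiers and P values are invented placeholders, not data from Supplementary Table 1, and the function name is hypothetical.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (species, gene_id, p_value)


def best_hit_per_species(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Keep, for each species, the motif hit with the lowest P value."""
    best: Dict[str, Tuple[str, float]] = {}
    for species, gene_id, p_value in hits:
        if species not in best or p_value < best[species][1]:
            best[species] = (gene_id, p_value)
    return best


# Placeholder hits: species_A has two NIN paralogues, species_B has one.
example_hits = [
    ("species_A", "NIN_a1", 3.0e-9),
    ("species_A", "NIN_a2", 1.0e-12),
    ("species_B", "NIN_b1", 5.0e-11),
]

selected = best_hit_per_species(example_hits)
for species, (gene_id, p_value) in sorted(selected.items()):
    print(species, gene_id, p_value)
```

Reducing each species to a single promoter in this way is what prevents a species with many paralogues from dominating the final motif.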

As a control, the FIMO59 tool was used (‘scan given strand only’ option, default parameters, false discovery rate < 0.1) to search all 144 NIN and NLP promoter regions (Supplementary Table 1). PACE was found within NIN-promoter regions of all nodulating species analysed and two non-nodulating FaFaCuRo species (Prunus persica and Ziziphus jujuba) (Supplementary Table 1).

The presence or absence of PACE was further investigated in promoters of NIN and NLPs in an expanded database of 163 species encompassing 39 orders covering six groups of Viridiplantae (Supplementary Table 2). Orthologues of the whole NLP family were retrieved using tBLASTn v2.11.0+60 with reference sequences from Medicago truncatula as query and a cut-off e-value of 1 × 10−10. Sequences were then aligned using MAFFT v7.38061 with default parameters. To identify the NIN and NLP orthologues and therefore resolve the NIN and NLP protein subfamilies, we used a maximum likelihood approach using the IQ-TREE v1.6.7 software62. Before phylogenetic reconstruction, the best-fitting evolution model was determined for each alignment using ModelFinder63 as implemented in IQ-TREE. Branch support was tested using 10,000 replicates of UltraFast Bootstraps using UFBoot264. For each identified orthologue, a 5-kb region upstream of the translational start site was extracted. The three different consensuses identified in the previous MEME analyses (Extended Data Fig. 1a,b,d) were then searched in all NLP upstream regions using FIMO 5.0.2 and a q-value threshold of 0.1. If several motifs were identified in a given upstream region, only the one with the lowest q-value was retained for further analysis (Extended Data Fig. 1e, Supplementary Fig. 1 and Supplementary Table 2). PACE from Parasponia andersonii (a nodulating species from Rosales) was identified in this analysis and included to generate the consensus in Fig. 1.

The promoters of ERN1 and RAM1 genes were analysed independently of the previous analysis, using 87 plant genomes covering the main angiosperm orders (Supplementary Table 5). Orthologues of each gene were retrieved using tBLASTn v2.7.1+60 with reference sequences from Medicago truncatula as query and a cut-off e-value of 1 × 10−10. The sequences were then aligned using MAFFT v7.38061 with default parameters. The alignments were subjected to phylogenetic analysis to identify orthologues using a maximum likelihood approach and the IQ-TREE v1.6.7 software62. Before phylogenetic reconstruction, the best-fitting evolution model was determined for each alignment using ModelFinder63 as implemented in IQ-TREE. The branch support was tested using 10,000 replicates of UltraFast Bootstraps using UFBoot264. For each identified orthologue, the regions upstream of the translational start site of different lengths (1 to 5 kb) were extracted. For each length of region upstream of the translational start site, the sequences were analysed using MEME software v5.0.1 (ref. 59) with the following parameters: a motif size between 5 and 45 bp and 5 and 25 bp for ERN1 and RAM1, respectively, and a maximum number of discovered motifs of 20 (Supplementary Figs. 2 and 3). In addition, MEME search was set on ‘zoops’ mode, assuming that each sequence can contain zero or one occurrence of the motif.
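The MEME settings listed above map onto standard MEME command-line options (`-mod zoops`, `-nmotifs`, `-minw`, `-maxw`). The sketch below assembles such a command in Python as an illustration. The exact command line used by the authors is not given in the text, so the input file names, the output directory names and the helper function are assumptions, and the command is only built and printed, not executed.

```python
import shlex
from typing import List


def meme_zoops_command(fasta: str, out_dir: str, min_width: int,
                       max_width: int, n_motifs: int = 20) -> List[str]:
    """Build a MEME argument list for a DNA motif search in zoops mode."""
    return [
        "meme", fasta,
        "-dna",                       # input sequences are DNA
        "-oc", out_dir,               # output directory (overwritten if present)
        "-mod", "zoops",              # zero or one occurrence per sequence
        "-nmotifs", str(n_motifs),    # maximum number of motifs to report
        "-minw", str(min_width),      # minimum motif width (bp)
        "-maxw", str(max_width),      # maximum motif width (bp)
    ]


# Hypothetical file names; widths follow the ranges stated in the text.
ern1_cmd = meme_zoops_command("ern1_upstream.fa", "meme_ern1", 5, 45)
ram1_cmd = meme_zoops_command("ram1_upstream.fa", "meme_ram1", 5, 25)

print(shlex.join(ern1_cmd))
print(shlex.join(ram1_cmd))
```

To run one of these searches, the list could be passed to `subprocess.run(ern1_cmd, check=True)` on a system where MEME is installed.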

Biological material

L. japonicus ecotype Gifu B-129 wild-type65, nin-232 and nin-15 (LORE1 line 3000352966) were used in this study. Seed bags, bacterial strains and days post inoculation for each experiment are listed in Supplementary Table 6.

Plant growth conditions and symbiotic inoculations

Lotus japonicus seeds were scarified and surface-sterilized as described67 before germination on ½ Gamborg’s B5 medium solidified with 0.8% Bacto agar in square plates (12 cm × 12 cm × 1.7 cm)68. The plates were kept in the dark for 3 days before transfer to light conditions in a Panasonic growth cabinet (MLR-352H-PE) at 24 °C under a 16 h–8 h light–dark regime (50 µmol m−2 s−1). The 6-day-old seedlings were (1) subjected to hairy root transformation as described69 (Figs. 2–4, Extended Data Figs. 4–10 and Supplementary Figs. 5–8) or (2) transferred to Weck jars (SKU 745 or 743; J.Weck GmbH u. Co. KG) containing 300 ml of sand:vermiculite mixture (2:1) and 20 ml of a modified ¼ strength Hoagland’s medium with Fe-EDDHA used as iron source70 (Extended Data Fig. 5e). For in vivo promoter expression analysis (Fig. 2 and Supplementary Fig. 5), transgenic roots expressing a kanamycin-resistance gene were kept on square plates supplemented with kanamycin (25 µg ml−1) 10 days after the Agrobacterium rhizogenes inoculation. Plants with transformed roots were kept on 0.8% Bacto agar including a nitrogen-reduced version of FAB medium (500 µM MgSO4·7H2O, 250 µM KH2PO4, 250 µM KCl, 250 µM CaCl2·2H2O, 100 µM KNO3, 25 µM Fe-EDDHA, 50 µM H3BO3, 25 µM MnSO4·H2O, 10 µM ZnSO4·7H2O, 0.5 µM Na2MoO4·2H2O, 0.2 µM CuSO4·5H2O, 0.2 µM CoCl2·6H2O; pH 5.7) in square plates for 1 week before transferring to a growth chamber at 24 °C under a 16 h–8 h light–dark regime (275 µmol m−2 s−1) in Weck jars (SKU 745 or 743) containing 300 ml of sand:vermiculite mixture (2:1) and 30 ml of nitrogen-reduced FAB medium containing Mesorhizobium loti MAFF 303099 DsRed71 (M. loti DsRed; Figs. 3 and 4, Extended Data Figs. 5–10 and Supplementary Figs. 6 and 7), M. loti R7A CFP72 (Fig. 2) or M. loti MAFF 303099 GFP73 (Supplementary Fig. 5) set to a final optical density at 600 nm (OD600) of 0.05. For Extended Data Fig. 4 and Supplementary Fig. 8, plants were grown in Weck jars (SKU 745 or 743) containing 300 ml of sand:vermiculite mixture (2:1) and 60 ml of nitrogen-reduced FAB medium containing M. loti DsRed or MAFF 303099 lacZ72 (M. loti lacZ) (OD600 of 0.01).

Cloning and DNA constructs

For the construction of promoter:NIN fusions for complementation experiments (Fig. 3, Extended Data Figs. 6–10 and Supplementary Figs. 6 and 7), the NIN genomic sequence without the 5′ and 3′ UTRs served as a cloning module. A 3-kb region of the L. japonicus NIN promoter plus the 244-bp NIN 5′ UTR was cloned from L. japonicus Gifu and used for complementation experiments (Fig. 3, Extended Data Figs. 6–9 and Supplementary Figs. 6 and 7), dual-luciferase assays (Extended Data Fig. 2), fluorimetric GUS assay (Extended Data Fig. 3) and promoter activity analysis (Fig. 2, Extended Data Fig. 4 and Supplementary Fig. 5). For all the other versions of the L. japonicus NIN promoter tested (Figs. 2–4, Extended Data Figs. 3–10 and Supplementary Figs. 5–8), the LjNIN minimal promoter (98 bp)34 plus the LjNIN 5′ UTR was fused to the 3′ end of the promoter. A 472-bp region containing multiple cytokinin response elements and highly conserved in eight legume species was identified 5′ of the NIN transcriptional start site by Liu et al.40. We used this conserved region of 472 bp from L. japonicus and added flanking regions (192 bp upstream and 366 bp downstream, or 2,399 bp upstream and 2,231 bp downstream) to obtain cytokinin element-containing regions of 1 kb and 5 kb (CE1kb and CE5kb, respectively). The S. lycopersicum gene ID Solyc01g112190.2.1 was identified as the closest homologue of the LjNIN gene on the basis of phylogenetic analysis15 and is referred to as SlNIN. A 3-kb region of the SlNIN promoter plus the 238-bp SlNIN 5′ UTR was cloned from S. lycopersicum cv. ‘Moneymaker’, and PACE or mPACE (Extended Data Fig. 3a) was inserted 184 bp upstream of the SlNIN 5′ UTR and used for complementation experiments (Fig. 4 and Extended Data Fig. 10), dual-luciferase assays (Extended Data Fig. 2) and fluorimetric GUS assay (Extended Data Fig. 3). A detailed description of constructs can be found in Supplementary Fig. 4 and Supplementary Table 7. A list of oligonucleotides can be found in Supplementary Table 8. The constructs were generated with the Golden Gate cloning system74.
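The nominal sizes of CE1kb and CE5kb can be checked by adding the flank lengths given above to the 472-bp conserved region. The short Python sketch below does this arithmetic; it assumes that the flanks are directly contiguous with the conserved region and do not overlap it, and the variable and function names are illustrative only.

```python
# Length of the conserved cytokinin element-containing region (bp), from the text.
CONSERVED_BP = 472

# (upstream flank, downstream flank) in bp for each construct, from the text.
FLANKS_BP = {
    "CE1kb": (192, 366),
    "CE5kb": (2399, 2231),
}


def region_length(core_bp: int, upstream_bp: int, downstream_bp: int) -> int:
    """Total length of a region built from a core plus two contiguous flanks."""
    return upstream_bp + core_bp + downstream_bp


if __name__ == "__main__":
    for name, (up, down) in FLANKS_BP.items():
        total = region_length(CONSERVED_BP, up, down)
        print(f"{name}: {up} + {CONSERVED_BP} + {down} = {total} bp")
```

Under these assumptions the two regions come to 1,030 bp and 5,102 bp, consistent with the rounded names CE1kb and CE5kb.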

Imaging

Microscope and scanner settings as well as parameters for image acquisition are listed in Supplementary Table 9.

Phenotypic analysis and quantification of infection events

Infected and non-infected nodules were distinguished by the presence or absence, respectively, of a DsRed signal (representing M. loti DsRed) inside the nodules. The presence or absence of bacteria was later confirmed by examination of sections of representative nodules. ITs and M. loti entrapments in root hairs were detected by their DsRed fluorescence (see Supplementary Table 9 for microscope settings).

For the phenotypic analysis of nin-15 (Extended Data Fig. 5c,f), quantification was performed 21 dpi with M. loti DsRed as follows: (1) the total number of nodules (including infected and non-infected) was determined under white light illumination, and (2) the number of infected nodules and root hair ITs were counted as described above. Shoot dry weight was measured after drying the shoot at 60 °C for 1 h (Extended Data Fig. 5e).

For the complementation experiments of nin-2 and nin-15 (Figs. 3 and 4, Extended Data Figs. 6–10 and Supplementary Figs. 6 and 7), quantifications and sectioning were performed 21 or 35 dpi with M. loti DsRed with the microscope settings listed in Supplementary Table 9 in the following order: (1) transgenic roots were identified by GFP fluorescence-emanating nuclei with a GFP filter, (2) infected nodules were counted as described above, (3) the total number of nodules (including infected and non-infected ones) was then determined under white light illumination and (4) the number of non-infected nodules was calculated by subtracting the number of infected nodules from the total number of nodules. To quantify infection events in root hairs, the numbers of bacterial entrapments and ITs in root hairs were counted on a 0.5-cm root piece for each transgenic root system, excised from a region where bacterial accumulation was detected by DsRed fluorescence. Sectioning was performed on non-infected and infected nodules, and the presence or absence of ITs and symbiosomes in cortical cells was examined. Nodule primordia and nodules were embedded in 6% low-melting agarose and sliced into 40–50-µm-thick sections using a vibrating-blade microtome (Leica VT1000 S).
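The derived quantities in this counting scheme can be written down as a minimal Python sketch. The class name, field names and example counts below are hypothetical; the percentage definitions follow the descriptions given in this article (infected nodules among all organogenesis events, and root hair ITs among all infection events, that is, entrapments plus ITs).

```python
from dataclasses import dataclass
from typing import Optional


def percent(part: int, whole: int) -> Optional[float]:
    """Percentage of `part` in `whole`; None when `whole` is zero."""
    if part < 0 or whole < 0 or part > whole:
        raise ValueError("counts must satisfy 0 <= part <= whole")
    return None if whole == 0 else 100.0 * part / whole


@dataclass
class RootSystemCounts:
    total_nodules: int      # counted under white light illumination
    infected_nodules: int   # counted by DsRed signal inside nodules
    entrapments: int        # bacterial entrapments on the 0.5-cm root piece
    root_hair_its: int      # root hair ITs on the same root piece

    def non_infected_nodules(self) -> int:
        # Step (4): total minus infected.
        return self.total_nodules - self.infected_nodules

    def percent_infected_nodules(self) -> Optional[float]:
        return percent(self.infected_nodules, self.total_nodules)

    def percent_root_hair_its(self) -> Optional[float]:
        events = self.entrapments + self.root_hair_its
        return percent(self.root_hair_its, events)


if __name__ == "__main__":
    # Hypothetical counts for one transgenic root system (not real data).
    example = RootSystemCounts(total_nodules=8, infected_nodules=6,
                               entrapments=3, root_hair_its=9)
    print(example.non_infected_nodules())
    print(example.percent_infected_nodules())
    print(example.percent_root_hair_its())
```

Returning None for a root system without any nodules or infection events keeps such samples distinguishable from a true 0%.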

Transient expression in Nicotiana benthamiana leaves

Agrobacterium tumefaciens strain AGL1 carrying promoter:reporter fusions on T-DNA was infiltrated as previously described75, with the acetosyringone concentration in the infiltration buffer modified to 150 µM. A. tumefaciens strains AGL1 and GV3101 containing plasmids 35Spro:3xHA-Cyclops34 and 35Spro:CCaMK1–314-mOrange76, respectively, were co-infiltrated with the reporter constructs as indicated. An AGL1 strain carrying a K9 plasmid constitutively expressing red fluorescent protein was used as needed to equalize the density of the A. tumefaciens suspension infiltrated per leaf, together with an A. tumefaciens strain carrying a plasmid for the expression of the viral P19 silencing suppressor to reduce post-transcriptional gene silencing77 (Extended Data Figs. 2 and 3). N. benthamiana leaf discs with a diameter of 0.5 cm were harvested 60 h post infiltration and used for the quantitative fluorometric GUS assay and the dual-luciferase assay.

Dual-luciferase assay

The dual-luciferase assay (Extended Data Fig. 2) was based on the Dual-Luciferase reporter assay system (Promega). N. benthamiana leaf discs were ground to a fine powder in liquid nitrogen in 2 ml Eppendorf Safe-Lock tubes (one leaf disc per tube) and subsequently incubated for 5 min at room temperature with 200 µl of the Passive lysis buffer (Promega E1910). The resulting crude leaf extract was centrifuged at 20,000g for 2 min at room temperature. An aliquot of the supernatant was subjected to the dual-luciferase assay according to the manufacturer’s instructions for the Promega Dual-Luciferase Kit (E1910), and chemiluminescence was quantified with a fluorescence plate reader (TECAN Infinite 200 PRO; TECAN Group) in white 96-well plates (Greiner Bio-One International). For each reporter construct, the promoter of interest was fused to the Firefly luciferase gene, and constitutively expressed Renilla luciferase from the same vector was used for normalization (Supplementary Table 7). The ratio of the two signals (Firefly luciferase to Renilla luciferase) was calculated and normalized to the vector control. At least four biological and two technical replicates per indicated vector were analysed in two independently performed assays.
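The normalization described here can be sketched in a few lines of Python. The text states only that the Firefly-to-Renilla ratio was normalized to the vector control; dividing by the mean of the control ratios is an assumption made for this illustration, and the function names and example readings are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def firefly_renilla_ratio(firefly: float, renilla: float) -> float:
    """Ratio of the Firefly signal to the Renilla signal for one well."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def normalize_to_control(sample_ratios: Sequence[float],
                         control_ratios: Sequence[float]) -> List[float]:
    """Express each sample ratio relative to the mean vector-control ratio.

    Assumption: the reference value is the arithmetic mean of the control
    ratios; the article does not specify how the control was summarized.
    """
    reference = mean(control_ratios)
    if reference <= 0:
        raise ValueError("control reference must be positive")
    return [ratio / reference for ratio in sample_ratios]


if __name__ == "__main__":
    # Hypothetical (Firefly, Renilla) luminescence readings, not real data.
    control = [firefly_renilla_ratio(f, r) for f, r in [(100, 1000), (120, 1000)]]
    sample = [firefly_renilla_ratio(f, r) for f, r in [(900, 1000), (1500, 1200)]]
    print(normalize_to_control(sample, control))
```

Because both reporters are read from the same extract, the ratio cancels differences in infiltration efficiency and extraction yield between leaf discs before the comparison with the control is made.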

Quantitative fluorometric GUS assay and analysis

Quantitative fluorometric GUS assays (Extended Data Fig. 3) were performed as described78 adapted to the 96-well format. A total number of seven to eight leaf discs per indicated vector combination were analysed in two assays independently performed in different weeks.

Promoter activity analysis

For promoter activity analyses with the GUS reporter gene (Extended Data Fig. 4a–f and Supplementary Fig. 8), transgenic nodule primordia and nodules were excised 10–14 dpi or ≥21 dpi with M. loti DsRed and stained for GUS activity using 5-bromo-4-chloro-3-indolyl-β-d-glucuronic acid (X-Gluc; x-gluc.com) as catalytic substrate75 for 3 h at 37 °C. To visualize the root hair ITs together with the promoter activity with the GUS reporter gene (Extended Data Fig. 4g), plants with transgenic root systems were inoculated with M. loti lacZ. Transgenic roots were first stained for GUS activity with X-Gluc for 3 h at 37 °C and then for lacZ expression with Magenta-Gal for 18 h at 28 °C, as described in ref. 75; after staining, GUS activity and lacZ expression were visualized in blue and purple, respectively. For promoter activity analyses with fluorescent reporters (Fig. 2 and Supplementary Fig. 5), transgenic root systems were harvested 7 dpi (Supplementary Fig. 5) or 10–14 dpi (Fig. 2). Roots with bacterial infection at stage 2 or 3 and nodule primordia with bacterial infection at stage 3 or 4 (see main text for stage description) were selected by locating the GFP (M. loti MAFF 303099 GFP) and CFP signal (M. loti R7A CFP), respectively, via rapid (around 10 s) Z-stack analysis with the confocal light scanning microscope (Supplementary Table 9). For cell wall staining with calcofluor white (Supplementary Fig. 5), roots were fixed with 4% formaldehyde dissolved in 50 mM piperazine-N,N′-bis(2-ethanesulphonic acid) (PIPES) buffer (pH 7) for 1 h under vacuum, rinsed three times with PIPES buffer and incubated in 0.05% calcofluor white dissolved in H2O for 1 h. The roots and sections of nodule primordia were imaged as described in Supplementary Table 9.

Data visualization and statistical analysis

Statistical analyses and data visualization were performed with RStudio 1.1.383 (RStudio Inc.). Box plots were used to display data in Fig. 4, Extended Data Figs. 2, 3 and 5–10 and Supplementary Fig. 6 (thick black or white lines, median; box, interquartile range (IQR); whiskers, lowest and highest data point within 1.5× IQR; black-filled circles, data points inside 1.5× IQR; white-filled circles, data points outside 1.5× IQR of the upper/lower quartile). The R package beeswarm with the method ‘center’ was used to plot the individual data points for the box plots79. The R package agricolae was used to perform ANOVA with post hoc Tukey tests, and statistical results are displayed as lowercase letters, where different letters indicate statistically significant differences80. The tests applied are stated in the figure legends.
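To make the box plot conventions concrete, the following Python sketch computes the median, quartiles, whisker ends and outlying points for one sample. It is an illustration only and not the code used in the study, which was run in R; quartiles are taken here from Python's `statistics.quantiles` with the inclusive method, which can differ slightly from the hinges that R box plots use for small samples. The function name and the example values are hypothetical.

```python
from statistics import median, quantiles
from typing import Dict, List, Sequence, Union

Stat = Union[float, List[float]]


def box_plot_stats(values: Sequence[float]) -> Dict[str, Stat]:
    """Summary statistics matching the box plot legend: median, IQR box,
    whiskers at the most extreme points within 1.5x IQR of the quartiles,
    and the points lying outside that range."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("at least two data points are required")
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outside = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outside": outside,  # drawn as white-filled circles in the figures
    }


if __name__ == "__main__":
    # Hypothetical counts per root system (not real data).
    print(box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

Points returned under `outside` correspond to the white-filled circles in the legend, and all remaining points to the black-filled circles.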

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Supplementary information

Supplementary Information (4.3MB, pdf)

Supplementary Figs. 1–8.

Reporting summary (2.1MB, pdf)

Supplementary Table 1 (22.5KB, xlsx)

Summary of the bioinformatic analysis resulting in the discovery of PACE using 37 species.

Supplementary Table 2 (261.3KB, xlsx)

Results of the FIMO analysis of PACE in 163 species.

Supplementary Table 3 (11.9KB, xlsx)

Status of PACE and NIN in non-nodulating FaFaCuRo species. The species highlighted in grey are non-nodulating FaFaCuRo species that possess a full-length NIN open reading frame.

Supplementary Table 4 (20.7KB, xlsx)

Results of hairy root mediated complementation experiments of the L. japonicus Gifu nin-2 and nin-15 mutant lines with indicated constructs.

Supplementary Table 5 (25.7KB, xlsx)

List of plant genomes used for the search of conserved motifs within ERN1 and RAM1 promoters.

Supplementary Table 6 (12KB, xlsx)

List of seed bags, bacterial strains and incubation times.

Supplementary Table 7 (26.7KB, xlsx)

List of plasmids used.

Supplementary Table 8 (12.2KB, xlsx)

Sequences and IDs of oligonucleotides (DNA) used.

Supplementary Table 9 (11.2KB, xlsx)

Microscope/scanner settings and image analysis.

Acknowledgements

We thank D. Chiasson for providing the MAFF 303099 lacZ and R7A CFP M. loti strains and N. Sandal and J. Stougaard for providing the nin-15 mutant. M.P. received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (grant no. FP7/2007-2013) under grant agreement no. 340904 (EvolvingNodules), which supported the work of M.G. and X.G. M.P. acknowledges funding from the German Research Foundation (DFG) in the context of the SFB924 ‘Molecular mechanisms regulating yield and yield stability in plants’, grant no. 170483403, which supported the work of R.E.A. and C.C., and the ANR-DFG project ‘COME-IN’, grant no. 258665719, which supported the work of C.C. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. H2020-MSCA-IF-2015-703186 to K.V. and a postdoctoral fellowship from the Alexander von Humboldt Foundation to K.V. P.M.D. and J.K. belong to the LRSV laboratory, which is part of the TULIP Laboratoire d’Excellence (LABEX) (grant no. ANR-10-LABX-41). M.H. received funding from the JSPS KAKENHI grant no. 17H06472.

Extended data

Extended Data Fig. 7. The CYC-box and flanking sequences of PACE are required for the complete restoration of the bacterial infection process but are dispensable for the nodule organogenesis process in the L. japonicus nin-2 mutant.

nin-2 roots were transformed with T-DNAs carrying a Ubq10pro:NLS-GFP transformation marker in tandem with the LjNIN gene driven by either of the following promoter versions: the cytokinin element-containing region of 5 kb (CE5kb) fused to the 3 kb LjNIN promoter (CE5kb:NINpro); CE5kb:NINpro with PACE mutated (CE5kb:NINpro::mPACE); CE5kb:NINpro carrying a mutated Cyclops binding site (CYC-box) (CE5kb:NINpro::mbox); CE5kb:NINpro carrying mutated sequences flanking the CYC-box in PACE (CE5kb:NINpro::mflanking); CE5kb fused to the LjNIN minimal promoter (CE5kb:NINminpro); CE5kb fused to PACE and to NINminpro (CE5kb:PACE:NINminpro) or NINpro. (A) Representative overview pictures of transgenic root systems. Roots were analysed 21 dpi with M. loti DsRed. White asterisks and arrowheads: infected and non-infected nodules, respectively. Bars, 2 mm. (B) Boxplots displaying the number of infected nodules, the percentage of infected nodules among total organogenesis events (sum of infected and non-infected nodules) and the number of organogenesis events. (C) Boxplots displaying the number of root hair ITs and the percentage of root hair ITs among total infection events (sum of bacterial entrapments and ITs). Each dot represents one transgenic nin-2 root system or root piece. L. japonicus WT roots transformed with NINpro:NIN or CE5kb:NINpro:NIN were included as controls. Note that the mutation of PACE or only the CYC-box in PACE led to an almost complete loss of IT formation and infected nodules per root system while nodule organogenesis was not significantly reduced; and that mutation of sequences flanking the CYC-box in PACE led to a reduction of the number of infected nodules per root systems. n: number of transgenic root systems or root pieces analysed. 
Thick black lines, median; box, interquartile range (IQR); whiskers, lowest and highest data point within 1.5× IQR; black-filled circles, data points inside 1.5× IQR; white-filled circles, data points outside 1.5× IQR of the upper/lower quartile. Data are from a single experiment. n.a., not applicable; WLI, white light illumination.

Author contributions

M.G. performed the bioinformatic analysis of NIN promoters and discovered PACE presented in Fig. 1 and Extended Data Fig. 1a–d. C.C. performed in vivo expression analysis presented in Fig. 2 and Supplementary Fig. 5 and prepared all confocal and light microscopy images of root hairs and nodule sections (Figs. 3 and 4, Extended Data Figs. 8 and 10 and Supplementary Fig. 7). Complementation experiments of nin-15 were performed by R.E.A. (Figs. 3b–d and 4a,c and Extended Data Figs. 9 and 10a,b), C.C. (Figs. 3 and 4 and Extended Data Figs. 9c and 10) and X.G. (Figs. 3d and 4b–c and Extended Data Figs. 9 and 10c,d). Complementation experiments of nin-2 were performed by C.C. and X.G. (Fig. 3a, Extended Data Figs. 6–8 and Supplementary Figs. 6 and 7). R.E.A., C.C. and X.G. performed nin-15 mutant phenotyping (Extended Data Fig. 5). X.G. drafted Supplementary Fig. 4 and performed transient expression assays presented in Extended Data Fig. 3 and promoter expression analysis in Extended Data Fig. 4 and Supplementary Fig. 8. K.V. performed transient expression assays presented in Extended Data Fig. 2 and drafted Fig. 1. J.K. and P.M.D. performed motif search in ERN1, NIN and RAM1 promoters presented in Extended Data Fig. 1e and Supplementary Figs. 1–3. M.H. formulated the research hypothesis. C.C., X.G., R.E.A., K.V., M.G. and M.P. designed experiments. M.P. conceived and supervised the project. X.G. and M.P. coordinated research activities. M.P., C.C. and X.G. wrote the manuscript, and X.G. finalized all figures with inputs and comments from co-authors.

Peer review

Peer review information

Nature Plants thanks Ton Bisseling, Takashi Soyano, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Funding

Open access funding provided by Ludwig-Maximilians-Universität München.

Data availability

Raw data corresponding to Fig. 1, Extended Data Fig. 1 and Supplementary Figs. 1–3 are available in Supplementary Tables 1–3 and 5. The remaining raw data are available upon request. Essential plasmids listed in Supplementary Table 7 can be ordered from the European Plasmid Repository (https://www.plasmids.eu/). References for the L. japonicus lines and M. loti strains are indicated in Supplementary Table 6.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Chloé Cathebras, Xiaoyun Gong.

Extended data is available for this paper at 10.1038/s41477-025-02161-z.

Supplementary information

The online version contains supplementary material available at 10.1038/s41477-025-02161-z.

References

  • 1. LeBauer, D. & Treseder, K. Nitrogen limitation of net primary productivity in terrestrial ecosystems is globally distributed. Ecology 89, 371–379 (2008).
  • 2. Masson-Boivin, C. & Sachs, J. L. Symbiotic nitrogen fixation by rhizobia—the roots of a success story. Curr. Opin. Plant Biol. 44, 7–15 (2018).
  • 3. Peters, G. & Meeks, J. The Azolla–Anabaena symbiosis: basic biology. Annu. Rev. Plant Physiol. Plant Mol. Biol. 40, 193–210 (1989).
  • 4. Parniske, M. Intracellular accommodation of microbes by plants: a common developmental program for symbiosis and disease? Curr. Opin. Plant Biol. 3, 320–328 (2000).
  • 5. Kistner, C. & Parniske, M. Evolution of signal transduction in intracellular symbiosis. Trends Plant Sci. 7, 511–518 (2002).
  • 6. Parniske, M. Uptake of bacteria into living plant cells, the unifying and distinct feature of the nitrogen-fixing root nodule symbiosis. Curr. Opin. Plant Biol. 44, 164–174 (2018).
  • 7. Soltis, D. E. et al. Chloroplast gene sequence data suggest a single origin of the predisposition for symbiotic nitrogen fixation in angiosperms. Proc. Natl Acad. Sci. USA 92, 2647–2651 (1995).
  • 8. Finan, T. M. et al. Symbiotic mutants of Rhizobium meliloti that uncouple plant from bacterial differentiation. Cell 40, 869–877 (1985).
  • 9. Murray, J. D. et al. A cytokinin perception mutant colonized by Rhizobium in the absence of nodule organogenesis. Science 315, 101–104 (2007).
  • 10. Sprent, J. I., Ardley, J. & James, E. K. Biogeography of nodulated legumes and their nitrogen-fixing symbionts. N. Phytol. 215, 40–56 (2017).
  • 11. Pawlowski, K. & Demchenko, K. N. The diversity of actinorhizal symbiosis. Protoplasma 249, 967–979 (2012).
  • 12. Doyle, J. J. Phylogenetic perspectives on the origins of nodulation. Mol. Plant Microbe Interact. 24, 1289–1295 (2011).
  • 13. Doyle, J. J. Chasing unicorns: nodulation origins and the paradox of novelty. Am. J. Bot. 103, 1865–1868 (2016).
  • 14. Sprent, J. I. Evolving ideas of legume evolution and diversity: a taxonomic perspective on the occurrence of nodulation: Tansley review. N. Phytol. 174, 11–25 (2007).
  • 15. Griesmann, M. et al. Phylogenomics reveals multiple losses of the nitrogen-fixing root nodule symbiosis. Science 361, eaat1743 (2018).
  • 16. van Velzen, R. et al. Comparative genomics of the nonlegume Parasponia reveals insights into evolution of nitrogen-fixing rhizobium symbioses. Proc. Natl Acad. Sci. USA 115, E4700–E4709 (2018).
  • 17. van Velzen, R., Doyle, J. J. & Geurts, R. A resurrected scenario: single gain and massive loss of nitrogen-fixing nodulation. Trends Plant Sci. 24, 49–57 (2019).
  • 18. Gage, D. J. Analysis of infection thread development using Gfp- and DsRed-expressing Sinorhizobium meliloti. J. Bacteriol. 184, 7042–7046 (2002).
  • 19. Carvalho, T. L. G., Balsemão-Pires, E., Saraiva, R. M., Ferreira, P. C. G. & Hemerly, A. S. Nitrogen signalling in plant interactions with associative and endophytic diazotrophic bacteria. J. Exp. Bot. 65, 5631–5642 (2014).
  • 20. Johansson, C. & Bergman, B. Early events during the establishment of the Gunnera/Nostoc symbiosis. Planta 188, 403–413 (1992).
  • 21. Wittkopp, P. & Kalay, G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 13, 59–69 (2011).
  • 22. Kvon, E. Z. et al. Progressive loss of function in a limb enhancer during snake evolution. Cell 167, 633–642 (2016).
  • 23. Guo, X. et al. Chloranthus genome provides insights into the early diversification of angiosperms. Nat. Commun. 12, 1–14 (2021).
  • 24. Moore, M. J., Soltis, P. S., Bell, C. D., Burleigh, J. G. & Soltis, D. E. Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc. Natl Acad. Sci. USA 107, 4623–4628 (2010).
  • 25. Remy, W., Taylor, T. N., Hass, H. & Kerp, H. Four hundred-million-year-old vesicular arbuscular mycorrhizae. Proc. Natl Acad. Sci. USA 91, 11841–11843 (1994).
  • 26. Taylor, T. N., Taylor, E. L. & Krings, M. Paleobotany: the Biology and Evolution of Fossil Plants (Academic, 2008).
  • 27. Yano, K. et al. CYCLOPS, a mediator of symbiotic intracellular accommodation. Proc. Natl Acad. Sci. USA 105, 20540–20545 (2008).
  • 28. Gutjahr, C. et al. Arbuscular mycorrhiza-specific signaling in rice transcends the common symbiosis signaling pathway. Plant Cell 20, 2989–3005 (2008).
  • 29. Markmann, K., Giczey, G. & Parniske, M. Functional adaptation of a plant receptor-kinase paved the way for the evolution of intracellular root symbioses with bacteria. PLoS Biol. 6, 0497–0506 (2008).
  • 30. Banba, M. et al. Divergence of evolutionary ways among common sym genes: CASTOR and CCaMK show functional conservation between two symbiosis systems and constitute the root of a common signaling pathway. Plant Cell Physiol. 49, 1659–1671 (2008).
  • 31. Chen, C., Gao, M., Liu, J. & Zhu, H. Fungal symbiosis in rice requires an ortholog of a legume common symbiosis gene encoding a Ca2+/calmodulin-dependent protein kinase. Plant Physiol. 145, 1619–1628 (2007).
  • 32. Schauser, L., Roussis, A., Stiller, J. & Stougaard, J. A plant regulator controlling development of symbiotic root nodules. Nature 402, 191–195 (1999).
  • 33. Soyano, T., Kouchi, H., Hirota, A. & Hayashi, M. NODULE INCEPTION directly targets NF-Y subunit genes to regulate essential processes of root nodule development in Lotus japonicus. PLoS Genet. 9, e1003352 (2013).
  • 34. Singh, S., Katzer, K., Lambert, J., Cerri, M. & Parniske, M. CYCLOPS, A DNA-binding transcriptional activator, orchestrates symbiotic root nodule development. Cell Host Microbe 15, 139–152 (2014).
  • 35. Soyano, T. & Hayashi, M. Transcriptional networks leading to symbiotic nodule organogenesis. Curr. Opin. Plant Biol. 20, 146–154 (2014).
  • 36. Cerri, M. R. et al. The ERN1 transcription factor gene is a target of the CCaMK/CYCLOPS complex and controls rhizobial infection in Lotus japonicus. N. Phytol. 215, 323–337 (2017).
  • 37. Pimprikar, P. et al. A CCaMK–CYCLOPS–DELLA complex activates transcription of RAM1 to regulate arbuscule branching. Curr. Biol. 26, 987–998 (2016).
  • 38. Vernié, T. et al. The NIN transcription factor coordinates diverse nodulation programs in different tissues of the Medicago truncatula root. Plant Cell 27, 3410–3424 (2015).
  • 39. Yoro, E. et al. A positive regulator of nodule organogenesis, NODULE INCEPTION, acts as a negative regulator of rhizobial infection in Lotus japonicus. Plant Physiol. 165, 747–758 (2014).
  • 40. Liu, J. et al. A remote cis-regulatory region is required for NIN expression in the pericycle to initiate nodule primordium formation in Medicago truncatula. Plant Cell 31, 68–83 (2019).
  • 41. Buecker, C. & Wysocka, J. Enhancers as information integration hubs in development: lessons from genomics. Trends Genet. 28, 276–284 (2012).
  • 42. Perrine-Walker, F. M., Lartaud, M., Kouchi, H. & Ridge, R. W. Microtubule array formation during root hair infection thread initiation and elongation in the Mesorhizobium–Lotus symbiosis. Protoplasma 251, 1099–1111 (2014).
  • 43. Van Spronsen, P. C., Grønlund, M., Bras, C. P., Spaink, H. P. & Kijne, J. W. Cell biological changes of outer cortical root cells in early determinate nodulation. Mol. Plant Microbe Interact. 14, 839–847 (2001).
  • 44. Yoon, H. J. et al. Lotus japonicus SUNERGOS1 encodes a predicted subunit A of a DNA topoisomerase VI that is required for nodule differentiation and accommodation of rhizobial infection. Plant J. 78, 811–821 (2014).
  • 45. Ott, T. et al. Symbiotic leghemoglobins are crucial for nitrogen fixation in legume root nodules but not for general plant growth and development. Curr. Biol. 15, 531–535 (2005).
  • 46. van de Wiel, C. A Histochemical Study of Root Nodule Development. PhD thesis, Wageningen University and Research (1991).
  • 47. Hirsch, S. et al. GRAS proteins form a DNA binding complex to induce gene expression during nodulation signaling in Medicago truncatula. Plant Cell 21, 545–557 (2009).
  • 48. Xiao, A. et al. Transcriptional regulation of NIN expression by IPN2 is required for root nodule symbiosis in Lotus japonicus. N. Phytol. 227, 513–528 (2020).
  • 49. Zhu, H. et al. A novel ARID DNA-binding protein interacts with SymRK and is expressed during early nodule development in Lotus japonicus. Plant Physiol. 148, 337–347 (2008).
  • 50. Clavijo, F. et al. The Casuarina NIN gene is transcriptionally activated throughout Frankia root infection as well as in response to bacterial diffusible signals. N. Phytol. 208, 887–903 (2015).
  • 51. D’Haeze, W. et al. Roles for azorhizobial nod factors and surface polysaccharides in intercellular invasion and nodule penetration, respectively. Mol. Plant Microbe Interact. 11, 999–1008 (1998).
  • 52. De Faria, S. M., Hay, G. T. & Sprent, J. I. Entry of rhizobia into roots of Mimosa scabrella Bentham occurs between epidermal cells. Microbiology 134, 2291–2296 (1988).
  • 53. Delaux, P. M. et al. Comparative phylogenomics uncovers the impact of symbiotic associations on host genome evolution. PLoS Genet. 10, e1004487 (2014).
  • 54. Delaux, P. M. et al. Algal ancestor of land plants was preadapted for symbiosis. Proc. Natl Acad. Sci. USA 112, 13390–13395 (2015).
  • 55. Liu, J. & Bisseling, T. Evolution of NIN and NIN-like genes in relation to nodule symbiosis. Genes 11, 1–15 (2020).
  • 56. Geurts, R., Xiao, T. T. & Reinhold-Hurek, B. What does it take to evolve a nitrogen-fixing endosymbiosis? Trends Plant Sci. 21, 199–208 (2016).
  • 57. Madsen, L. H. et al. The molecular network governing nodule organogenesis and infection in the model legume Lotus japonicus. Nat. Commun. 1, 10 (2010).
  • 58. Akamatsu, A., Nagae, M. & Takeda, N. The CYCLOPS response element in the NIN promoter is important but not essential for infection thread formation during Lotus japonicus–Rhizobia symbiosis. Mol. Plant-Microbe Interact. 35, 650–658 (2022).
  • 59. Bailey, T. L. et al. MEME Suite: tools for motif discovery and searching. Nucleic Acids Res. 37, 202–208 (2009).
  • 60. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 1–9 (2009).
  • 61. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
  • 62. Nguyen, L. T., Schmidt, H. A., Von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
  • 63. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., Von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
  • 64. Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Le, S. V. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
  • 65. Handberg, K. & Stougaard, J. Lotus japonicus, an autogamous, diploid legume species for classical and molecular genetics. Plant J. 2, 487–496 (1992).
  • 66. Małolepszy, A. et al. The LORE1 insertion mutant resource. Plant J. 88, 306–317 (2016).
  • 67. Gossmann, J. A., Markmann, K., Brachmann, A., Rose, L. E. & Parniske, M. Polymorphic infection and organogenesis patterns induced by a Rhizobium leguminosarum isolate from Lotus root nodules are determined by the host genotype. N. Phytol. 196, 561–573 (2012).
  • 68. Gamborg, O. L., Miller, R. A. & Ojima, K. Nutrient requirements of suspension cultures of soybean root cells. Exp. Cell Res. 50, 151–158 (1968).
  • 69. Charpentier, M. et al. Lotus japonicus Castor and Pollux are ion channels essential for perinuclear calcium spiking in legume root endosymbiosis. Plant Cell 20, 3467–3479 (2008).
  • 70. Hoagland, D. R. & Arnon, D. I. The Water-Culture Method for Growing Plants without Soil. Circular California Agricultural Experiment Station 347, 2nd edn (Univ. of California Berkeley, 1950).
  • 71. Maekawa, T. et al. Gibberellin controls the nodulation signaling pathway in Lotus japonicus. Plant J. 58, 183–194 (2009).
  • 72. Leong, J. M. et al. The Φ80 and P22 attachment sites. Primary structure and interaction with Escherichia coli integration host factor. J. Biol. Chem. 260, 4468–4477 (1985).
  • 73. Liang, J. et al. A subcompatible rhizobium strain reveals infection duality in Lotus. J. Exp. Bot. 70, 1903–1913 (2019).
  • 74. Binder, A. et al. A modular plasmid assembly kit for multigene expression, gene silencing and silencing rescue in plants. PLoS ONE 9, e88218 (2014).
  • 75. Cerri, M. R. et al. Medicago truncatula ERN transcription factors: regulatory interplay with NSP1/NSP2 GRAS factors and expression dynamics throughout rhizobial infection. Plant Physiol. 160, 2155–2172 (2012).
  • 76. Takeda, N. et al. Gibberellins interfere with symbiosis signaling and gene expression and alter colonization by arbuscular mycorrhizal fungi in Lotus japonicus. Plant Physiol. 167, 545–557 (2015).
  • 77. Voinnet, O., Rivas, S., Mestre, P. & Baulcombe, D. An enhanced transient expression system in plants based on suppression of gene silencing by the p19 protein of tomato bushy stunt virus. Plant J. 33, 949–956 (2003).
  • 78. Jefferson, R. A. Assaying chimeric genes in plants: the GUS gene fusion system. Plant Mol. Biol. Report. 5, 387–405 (1987).
  • 79. Eklund, A. & Trimble, J. beeswarm: the bee swarm plot, an alternative to stripchart. CRAN http://CRAN.R-project.org/package=beeswarm (2021).
  • 80. de Mendiburu, F. Agricolae: statistical procedures for agricultural research. CRAN https://cran.r-project.org/web/packages/agricolae/index.html (2023).

Associated Data

Supplementary Materials

Supplementary Information (4.3MB, pdf)

Supplementary Figs. 1–8.

Reporting summary (2.1MB, pdf)

Supplementary Table 1 (22.5KB, xlsx)

Summary of the bioinformatic analysis resulting in the discovery of PACE using 37 species.

Supplementary Table 2 (261.3KB, xlsx)

Results of the FIMO analysis of PACE in 163 species.

Supplementary Table 3 (11.9KB, xlsx)

Status of PACE and NIN in non-nodulating FaFaCuRo species. The species highlighted in grey are non-nodulating FaFaCuRo species that possess a full-length NIN open reading frame.

Supplementary Table 4 (20.7KB, xlsx)

Results of hairy root-mediated complementation experiments of the L. japonicus Gifu nin-2 and nin-15 mutant lines with the indicated constructs.

Supplementary Table 5 (25.7KB, xlsx)

List of plant genomes used for the search of conserved motifs within ERN1 and RAM1 promoters.

Supplementary Table 6 (12KB, xlsx)

List of seed bags, bacterial strains and incubation times.

Supplementary Table 7 (26.7KB, xlsx)

List of plasmids used.

Supplementary Table 8 (12.2KB, xlsx)

Sequences and IDs of oligonucleotides (DNA) used.

Supplementary Table 9 (11.2KB, xlsx)

Microscope/scanner settings and image analysis.

Data Availability Statement

Raw data corresponding to Fig. 1, Extended Data Fig. 1 and Supplementary Figs. 1–3 are available in Supplementary Tables 1–3 and 5. The remaining raw data are available upon request. Essential plasmids listed in Supplementary Table 7 can be ordered from the European Plasmid Repository (https://www.plasmids.eu/). References for the L. japonicus lines and M. loti strains are indicated in Supplementary Table 6.


Articles from Nature Plants are provided here courtesy of Nature Publishing Group