Author manuscript; available in PMC: 2020 Nov 4.
Published in final edited form as: Curr Biol. 2019 Oct 24;29(21):3681–3691.e5. doi: 10.1016/j.cub.2019.09.014

Attenuated Fgf signaling underlies the forelimb heterochrony in the emu Dromaius novaehollandiae

John J Young 1,3,§, Phil Grayson 2,4,§, Scott V Edwards 2, Clifford J Tabin 1,*
PMCID: PMC6834345  NIHMSID: NIHMS1539871  PMID: 31668620

Summary

Powered flight was fundamental to the establishment and radiation of birds. However, flight has been lost multiple times throughout avian evolution. Convergent losses of flight within the ratites (flightless palaeognaths, including the emu and ostrich) often coincide with reduced wings. Although there is a wealth of anatomical knowledge for several ratites, the genetic mechanisms causing these changes remain debated. Here we use a multidisciplinary approach employing embryological, genetic, and genomic techniques to interrogate the mechanisms underlying forelimb heterochrony in emu embryos. We show that the initiation of limb formation, an epithelial to mesenchymal transition (EMT) in the lateral plate mesoderm (LPM) and myoblast migration into the LPM, occur at equivalent stages in the emu and chick. However, the emu forelimb fails to subsequently proliferate. The unique emu forelimb expression of Nkx2.5, previously associated with diminished wing development, initiates after this stage, concomitant with myoblast migration into the LPM, and is therefore unlikely to cause this developmental delay. In contrast, RNA-sequencing of limb tissue reveals significantly lower Fgf10 expression in the emu forelimb. Artificially increasing Fgf10 expression in the emu LPM induces ectodermal Fgf8 expression and a limb bud. Analyzing open chromatin reveals differentially active regulatory elements near Fgf10 and Sall-1 in the emu wing and the Sall-1 enhancer activity is dependent on a likely Fgf-mediated Ets transcription factor-binding site. Taken together, our results suggest that regulatory changes result in lower expression of Fgf10 and a concomitant failure to express genes required for limb proliferation in the early emu wing bud.

Keywords: Loss-of-flight, powered-flight, Avian, Ratite, Paleognath, Emu, Fgf-signaling, Evolution, Limb, Development

eTOC Blurb

The flightless emu (Dromaius novaehollandiae) exhibits delayed and reduced forelimbs compared to other birds. By integrating comparative genomics and experimental embryology, Young et al. report that despite normal forelimb initiation, this developmental delay results from reduced proliferation in the mesenchyme due to altered regulation of Fgf10.

Introduction

Birds are one of the most diverse groups of tetrapods [1–3]. Their defining feature, powered flight, has only evolved in two other tetrapod groups: bats, or chiropterans, and extinct pterosaurs. The evolution of powered flight required several modifications to the anatomy and physiology of ancestral ground-dwelling animals [4,5]. Many of these modifications are thought to be metabolically expensive [6] and, consequently, are often lost upon relaxation of selective pressures [7,8]. Accordingly, morphologies required for flight, such as large wings, have been lost as flightlessness evolved in the ratites, a paraphyletic group of paleognathous birds [9,10], consisting of large cursorial birds such as the emu (Dromaius novaehollandiae), ostrich (Struthio camelus), extinct moa, and the diminutive kiwi [11]. Recent phylogenetic evidence indicates that flight has been lost multiple times within the paleognaths with only the tinamous (Tinamiformes) retaining it [10,12,13]. Consistent with these independent losses of flight, the forelimbs of the ratites display a variety of morphologies. The wing of the ostrich is reduced given its body size, but appears morphologically similar to the wing of the chicken and other volant, flight-capable, birds, whereas the wings of the emu and kiwi are diminutive and vestigial, and the extinct moa had no forelimbs at all [11,14].

The anatomical changes observed in the wings of adult ratites represent the end points of developmental trajectories that must, themselves, differ relative to other birds. Of the ratites that are available for developmental work, these changes are the most exaggerated in emu, due to the extreme vestigialization of the wing [15]. The emu’s hindlimbs are robust and resemble those of chickens and other volant birds early in development, whereas the forelimbs are disproportionately small and lack digits [16]. During embryogenesis, the forelimbs of the emu develop markedly later than in other birds [15], and amniotes in general (when compared at equivalent stages). Despite this delay, the forelimb does ultimately begin to grow, producing a limb that is appropriately patterned, albeit reduced in size [15,17].

The tetrapod limb bud develops first from the lateral plate mesoderm (LPM), the precursor population that produces the connective tissue of the limb [18,19]. Subsequently, muscle pioneers migrate into the LPM-derived early limb bud from the dermomyotome to produce the musculature [20]. To accomplish this, the early limb field requires precise spatial-temporal control over cellular proliferation, migration, and differentiation. The delay in emu forelimb development has been addressed in several previous studies and different hypotheses have been put forth to explain it, including: 1) a delay in the expression of the essential forelimb specific transcription factor Tbx5 leading to a delay in forelimb initiation [17]; 2) a co-option of the master heart regulatory gene Nkx2-5, which acts to repress forelimb development [21]; and 3) delayed expression of Shh, a gene involved in limb bud expansion as well as patterning [22]. While these hypotheses are supported by experimental evidence, none provides a comprehensive understanding of why the limb is initially delayed before developing into a fully patterned wing.

Here we revisit the problem, employing several embryological, molecular, and bioinformatics methods, to obtain a more definitive understanding of the alterations in the earliest stages of emu limb development that lead to the delay in forelimb formation, and thereby contribute to wing reduction in a flightless species.

Results

To address the cause of the heterochronic delay in emu wing bud formation, we first examined the precursor tissues that give rise to the limb bud. Limb mesenchyme is derived from lateral plate mesoderm (LPM), which starts as a single tissue layer underlying the future flank ectoderm. However, well before the formation of the limb bud, that initial LPM layer splits to form the somatopleure dorsally and the splanchnopleure ventrally [23]. Therefore, we used RNA-seq to compare the undifferentiated LPM (HH10) to the subsequent somatopleure and splanchnopleure (HH13) in each species. These comparisons via principal component analysis revealed clear separation of the three tissues in the chick (Figure 1A) but no such differences between the undifferentiated LPM and somatopleure in the emu (Figure 1B), suggesting a failure to fully commit to the somatopleure identity.
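
The kind of comparison described above can be illustrated with a minimal sketch of a principal component analysis on a samples-by-genes expression matrix. The data below are synthetic placeholders, not measurements from this study, and the function name `pca_scores`, the number of genes, and the noise levels are assumptions made only for illustration; the actual analysis pipeline is described in the methods.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """Project samples (rows) onto the top principal components.

    expr: samples x genes matrix of log-scale expression values.
    Returns (scores, explained_variance_ratio) for the requested components.
    """
    centered = expr - expr.mean(axis=0)              # center each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates
    var_ratio = (s ** 2) / np.sum(s ** 2)            # variance per component
    return scores, var_ratio[:n_components]

# Hypothetical toy data: 3 tissues x 3 replicates, 50 genes.
rng = np.random.default_rng(0)
tissues = ["LPM", "somatopleure", "splanchnopleure"]
labels = [t for t in tissues for _ in range(3)]
centers = rng.normal(0, 2, size=(3, 50))             # one profile per tissue
expr = np.vstack(
    [centers[i] + rng.normal(0, 0.3, size=(3, 50)) for i in range(3)]
)

scores, var_ratio = pca_scores(expr)
for label, (pc1, pc2) in zip(labels, scores):
    print(f"{label:16s} PC1={pc1:8.2f} PC2={pc2:8.2f}")
```

In such a plot, replicates of a well-separated tissue cluster together, whereas two tissues with similar transcriptomes overlap, which is the pattern reported for the emu LPM and somatopleure.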

Figure 1. Forelimb of the emu and the chick show differences in the somatopleure following limb initiation.


(A) Principal components analysis (PCA) plot of RNA-seq results from chick somatopleure, splanchnopleure, and undifferentiated lateral plate mesoderm. (B) PCA of RNA-seq results from emu somatopleure, splanchnopleure, and undifferentiated lateral plate mesoderm. See also Table S4.

These results indicated that the emu limb progenitors could differ from those of other birds at the very earliest stages and thus that the formation of the limb bud could be delayed from the start. The emu embryo produces an observable forelimb bud at HH20, three full developmental stages later than other birds. Accordingly, we asked whether the first initiating events of limb bud formation are delayed until this time point. Previous work has demonstrated that the earliest step in limb bud initiation is a Tbx5/Fgf10-dependent localized epithelial to mesenchymal transition (EMT) of the somatopleure portion of the LPM by HH15 to produce the mesenchymal precursors that will form the limb bud [24]. This EMT is marked by a loss of apically located filamentous actin and breakdown of the basement membrane marked by laminin. Similar to the chick embryo, the presumptive forelimb somatopleure of the emu is epithelial at HH13 (Figure 2A) as marked by an intact basement membrane and apically localized filamentous actin. However, in the HH15 emu, as in the HH15 chick, the presumptive forelimb region has lost this epithelial organization as indicated by the loss of laminin and the disorganization of actin (Figure 2B). We therefore conclude that the emu forelimb is initiated at stages equivalent to those of other birds and not delayed in the earliest steps of limb development.

Figure 2. The emu forelimb undergoes initial limb formation at similar stages as the chick.


(A) Emu HH13 embryo section at the presumptive forelimb level stained with laminin (A’) and phalloidin (A”). (B) Emu HH15 embryo section at the presumptive forelimb level stained with laminin (B’) and phalloidin (B”). Arrowheads show intact basement membrane (A’) and apical localization of phalloidin (A”) in the somatopleure. Arrowheads show disintegrated basement membrane (B’) and disorganization of phalloidin (B”) in the somatopleure. (C-D) Transverse sections showing Pax3 staining at HH14 in the emu. (C) Forelimb and (D) flank. (E-G) Transverse sections showing Pax3 staining at HH18 in the emu. (E) Forelimb, (F) flank, and (G) hindlimb. (H-J) Transverse sections showing Pax3 staining at HH21 in the emu. (H) Forelimb, (I) flank, and (J) hindlimb. Arrows show Pax3 positive cells in the LPM. (K-M) Transverse sections showing EDU staining in a HH18 emu. (K) Forelimb, (L) flank, and (M) hindlimb. (N-P) Quantification of EDU incorporation into the LPM of chick or emu embryos. Representative result of three independent experiments. (N) EDU quantification at HH16, (O) EDU quantification at HH18, and (P) EDU quantification at HH21. Scale bars 100 μm.

Following EMT, the LPM-derived limb bud precursors provide signals that direct a second population of cells to migrate from the dermomyotome into the LPM where they will ultimately give rise to the limb muscles. In principle, as the LPM limb cells are delayed in growing into a limb bud, they could be quiescent at these early stages, with a resultant delay in myoblast migration into their midst. This hypothesis might be problematic, however, as the source of these cells, the dermomyotome, no longer exists by the time the emu wing bud starts to form morphologically. Previous work has shown that the forelimb muscle pioneers enter the LPM at HH16 in chicken and subsequently proliferate and differentiate into myoblasts [25]. Pax3, a marker of these pioneer cells, was used to visualize muscle pioneers in emu embryos at HH14 (Figure 2C–D), HH18 (Figure 2E–G), and HH21 (Figure 2H–J) to determine when they migrate into the emu LPM. As expected, Pax3 positive cells were not observed in the LPM of HH14 embryos. However, by HH18, prior to visible forelimb outgrowth in the emu, muscle precursors are present in both the presumptive forelimb and the developing hindlimb but absent from the flank. By HH21, the observable muscle cells are clearly present in both developing limb buds. Taken together, these results suggest that the earliest steps of emu forelimb development are not delayed, and that both the LPM- and dermomyotome-derived precursor populations are formed in the emu embryo as in the chick. Yet it remains unclear why the limb precursor populations are stalled in the emu wing field at stages when they would be actively forming a limb bud in other birds.

The lack of evident outgrowth of the HH18 wing bud in the emu suggests that there is a transient failure in proliferation of the LPM-derived precursors. To directly test this hypothesis, chick and emu embryos were treated with the base analog EDU at stages HH16, HH18, and HH21. Staining for EDU at HH16 revealed similar, low levels of proliferation in the forelimb and flank regions of both species (Figure 2N). However, at HH18, the chick displayed notably higher EDU incorporation in the forelimb and hindlimb than in the flank; proliferation was similarly elevated at this stage in the emu hindlimb, yet the incorporation of EDU in the forelimb was at the same low level as seen in the flank (Figure 2K–M, O). By HH21, once the emu has an observable forelimb bud, the forelimb and hindlimb of both species proliferate at higher levels than the flank (Figure 2P). These results suggest that downstream of the defect in LPM maturation in the forelimb region of the emu there must be key molecular differences leading to a failure in proliferation between HH18 and HH21.

To understand the molecular underpinnings of these observations we turned to next-generation sequencing to examine the differences in transcription and chromatin landscape between the volant chick embryo and the flightless emu. The somatopleure region of the LPM was dissected from the forelimb, flank, and hindlimb fields for each species at HH18 (Figure 3A) and subjected to RNA- and ATAC-sequencing. To circumvent the inherent difficulties of comparing transcriptomes across species, we first compared differences between the forelimb and hindlimb within species, subsequently comparing these differences to identify differentially expressed forelimb and hindlimb genes that were unique to either the chicken or the emu (Figure 3B). As expected, known markers of the forelimb and hindlimb were identified in this analysis. For example, Tbx5 and Tbx4 were significantly upregulated in the forelimb and hindlimb (Figure S1A), respectively, of both species. More importantly, several genes were found to be upregulated or downregulated specifically in the emu forelimb relative to the hindlimb (Table S1). Similar to a previous report [21], Nkx2–5 was detected in our dataset as uniquely expressed in the emu forelimb. However, the expression was very low at HH18 (Figure 3B) and only detectable via in situ hybridization by HH19 (Figure S3A). Intriguingly, we found that several members of the Fgf signaling pathway were expressed at a significantly lower level in the emu forelimb, including both mesodermally expressed Fgfs (Fgf10, Fgf13, Fgf18) and ectodermally expressed Fgfs (Fgf4, Fgf8, and Fgf19; Figure 3B). Furthermore, a gene ontology analysis of lowly expressed factors specific to the emu forelimb showed an enrichment of Fgf signaling family members (Figure S1B). To validate these results, we used quantitative PCR (qPCR), which confirmed low Fgf8 and Fgf10 expression in the emu forelimb (Figure S2A–D).
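
The two-step logic of this comparison (forelimb versus hindlimb within each species, then comparison of those contrasts between species) can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used here: the class `DEResult`, the function names, the gene values, and the fold-change cutoff of 1.0 are hypothetical, and only the significance cutoff of p < 0.001 is taken from the Figure 3 legend.

```python
from dataclasses import dataclass

@dataclass
class DEResult:
    log2fc: float  # forelimb vs. hindlimb log2 fold change within one species
    padj: float    # adjusted p-value for that within-species contrast

def is_de(result, lfc_cutoff=1.0, padj_cutoff=0.001):
    """True if a gene differs between forelimb and hindlimb.

    lfc_cutoff is an assumed value chosen for illustration.
    """
    return abs(result.log2fc) > lfc_cutoff and result.padj < padj_cutoff

def classify(emu, chick):
    """Compare the within-species contrasts of one gene across species."""
    if not is_de(emu):
        return "not_de_in_emu"
    same_direction = (emu.log2fc > 0) == (chick.log2fc > 0)
    if is_de(chick) and same_direction:
        return "shared"        # same forelimb/hindlimb difference in chick
    return "emu_specific"      # difference unique to the emu

# Hypothetical values for three made-up genes.
examples = {
    "gene_a": (DEResult(-2.5, 1e-10), DEResult(-0.2, 0.9)),
    "gene_b": (DEResult(3.0, 1e-8), DEResult(2.5, 1e-6)),
    "gene_c": (DEResult(0.3, 0.5), DEResult(0.1, 0.8)),
}
for gene, (emu, chick) in examples.items():
    print(gene, classify(emu, chick))
```

Because each contrast is computed within a species, differences in annotation or sequencing depth between the two genomes do not enter the comparison directly.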

Figure 3. The emu forelimb is deficient in Fgf-signaling molecule expression at HH18 when the limb is not proliferative.


(A) Schematic demonstrating tissue collected from HH18 chick or emu embryos for RNA- and ATAC-sequencing. (B) Volcano plot showing differentially expressed genes between the forelimb and hindlimb of the emu. Gray dots indicate genes with either < log2 fold change between tissues or were not significantly changed at p < 0.001. Red dots indicate genes with both > log2 fold expression change between the tissues and highly significant but show the same pattern in the chick. Blue dots indicate genes with both > log2 fold expression change between the tissues with high significance and show unique differences in the emu. (C-E) Whole-mount expression of limb marker genes in the HH18 chick. (C) Tbx5, (D) Fgf10, (E) Fgf8. (C’-E’) Transverse sections at the forelimb level of embryos shown in C-E. (F-H) Whole-mount expression of limb marker genes in the HH18 emu. (F) Tbx5, (G) Fgf10, (H) Fgf8. (F’-H’) Transverse sections at the forelimb level of embryos shown in F-H. Scale bars in C-H whole mount embryos 1 mm. Scale bars in C’-H’ transverse sections 100 μm. See also Figures S1–S3 and Tables S1, S4.

We next visualized the expression of Tbx5, Fgf10, and Fgf8 in both the chick and the emu at HH18. As expected, the chick showed broad Tbx5 (Figure 3C) and Fgf10 (Figure 3D) expression in the limb mesenchyme and Fgf8 (Figure 3E) expression in the overlying apical ectodermal ridge (AER). However, in contradiction to prior reports [17], Tbx5 (Figure 3F) was also clearly expressed in the nascent emu forelimb mesenchyme, as was Fgf10 (Figure 3G), though in a notably smaller domain than in the chick. It has previously been shown that Fgf10 and Tbx5 are required to mediate EMT of the limb-forming mesenchyme. Since we observed normal EMT in the emu forelimb region, it was not surprising to see expression of Tbx5 and Fgf10 in the presumptive forelimb field at HH18. Curiously, however, both Tbx5 and Fgf10 appeared to be ventrally restricted. Perhaps most intriguingly, there was no Fgf8 detected in the emu forelimb ectoderm, nor was there a discernible AER. Fgf8 expression was, however, detected in the pronephros of the emu at HH18 (Figure 3H), indicating that the lack of Fgf8 signal in the ectoderm is not due to a failure to detect the transcript. By HH22, the expression patterns of Tbx5, Fgf10, and Fgf8 are the same in the chick (Figure S3C–E) and the emu (Figure 3F–H).

Even though Nkx2–5 is expressed at an extremely low level at HH18, one possible model tying these observations together would be that the initiation of ectopic Nkx2–5 expression in the emu forelimb downregulates Fgf signaling in the LPM-derived mesenchyme. Alternatively, Nkx2–5 could be expressed in a different cell type. In fact, Nkx2–5 expression does not overlap with Tbx5 expression in the emu forelimb (Figure S3A) but does overlap with Pax3 (Figure S3B). Moreover, when Nkx2–5 was overexpressed in the LPM of developing chick limbs there was no alteration in Tbx5 or Fgf10 expression in the forelimbs (Figure S3E–H). These results suggest that the lack of overlapping Nkx2–5 and Tbx5 expression in the emu limb is not due to a direct regulatory relationship but rather is likely to be a consequence of their being expressed in different cell types.

Clues to the divergent expression patterns of these genes in the chick and emu could, however, be found by reference to the extensive existing literature on signaling in the limb. The forelimb field is initially marked by the expression of the T-box transcription factor TBX5, which activates the expression of the secreted molecule Fgf10 [26]. As noted above, TBX5 and FGF10 together induce the EMT that marks the first step in limb initiation [24]. FGF10 subsequently signals to the overlying ectoderm to induce expression of Fgf8 [27], which, in turn, signals back to the LPM to maintain Fgf10 expression. This feedback loop between FGF10 and FGF8 results in their mutual maintenance while Fgf8 signaling ultimately drives the proliferation of the limb mesenchyme cells [28,29]. The lack of Fgf8 in the ectoderm of the emu forelimb provides an explanation as to why the emu forelimb exhibits relatively low cell proliferation at HH18. By HH21/22, as outgrowth of the emu forelimb bud commences, Tbx5 (Figure S3C,F), Fgf10 (Figure S3D,G), and Fgf8 (Figure S3E,H) are expressed in similar domains in the limbs of both species.

The lack of Fgf8 in the ectoderm could be due to either a failure of the mesodermal Fgf10 to induce Fgf8 or a failure of the ectoderm to respond to the Fgf10 signal. To test this, we transplanted chick forelimb LPM into host emu embryos. We found that transplantation of chick LPM into the emu resulted in the initiation of precocious limb bud outgrowth (Figure 4A) and, importantly, expression of Fgf8 in the host ectoderm (Figure 4B, Figure S3C). This result suggests that the emu ectoderm is responsive to signals from the LPM, but that those signals are insufficient to induce Fgf8, proliferation, or outgrowth prior to HH20. Nkx2–5-expressing cells within the precocious limb were not labeled with GFP, consistent with our conclusion above that it is the myoblasts that express Nkx2–5 (Figure 3D). Given the restricted domain of Fgf10 expression in the emu, we hypothesized that Fgf10 levels might be too low to induce the positive feedback loop required for cell proliferation. To test this, we overexpressed Fgf10 in the emu (Figure 4C–F) and found that this was sufficient to induce Fgf8 expression in the ectoderm and result in a precocious limb bud. These data suggest that the emu forelimb is induced and patterned at stages equivalent to those of other amniotes but fails to proliferate due to insufficient Fgf signaling to trigger the FGF10/FGF8 positive feedback loop.
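
The threshold behavior implied by this interpretation can be illustrated with a toy numerical model. It is a sketch for intuition only and is not a model fitted to, or derived from, the data in this study: the equations, the Hill-type response, and every parameter value (`gain`, `k`, `n`, the basal input levels) are assumptions. In the sketch, mesodermal Fgf10 receives a basal input plus feedback from ectodermal Fgf8, and Fgf8 is induced only when Fgf10 exceeds a threshold.

```python
def hill(x, k=1.0, n=4):
    """Sigmoidal response: near 0 well below the threshold k, near 1 above it."""
    return x**n / (k**n + x**n)

def simulate(basal_fgf10, gain=2.0, dt=0.01, steps=5000):
    """Euler integration of a two-variable toy feedback loop.

    fgf10: mesodermal signal = basal input + gain * fgf8 feedback, minus decay.
    fgf8:  ectodermal signal induced by fgf10 through a threshold, minus decay.
    All parameter values are arbitrary and chosen only for illustration.
    """
    fgf10, fgf8 = 0.0, 0.0
    for _ in range(steps):
        d_fgf10 = basal_fgf10 + gain * fgf8 - fgf10
        d_fgf8 = hill(fgf10) - fgf8
        fgf10 += dt * d_fgf10
        fgf8 += dt * d_fgf8
    return fgf10, fgf8

low = simulate(basal_fgf10=0.2)   # weak basal input: loop stays off
high = simulate(basal_fgf10=0.8)  # stronger basal input: loop switches on
print(f"low basal input:  fgf10={low[0]:.2f}, fgf8={low[1]:.3f}")
print(f"high basal input: fgf10={high[0]:.2f}, fgf8={high[1]:.3f}")
```

With the weak input the system settles with Fgf8 essentially absent, whereas raising only the basal Fgf10 input (by analogy with the graft or overexpression experiments) lets the loop engage and both signals rise together.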

Figure 4. The emu forelimb fails to grow due to insufficient Fgf10 expression in the mesoderm.


(A) Dorsal view of an HH18 emu host embryo with donor GFP chick LPM. Representative of two independent experiments. (B) Transverse section of embryo in A showing chick tissue labeled in green and emu expression of Fgf8 in the ectoderm. (C) Lateral view of untreated emu embryo stained for Fgf8. (D) Lateral view of HH18 emu embryo electroporated with an Fgf10 overexpression construct and stained for Fgf8. Representative of three independent experiments. (E) Dorsal view of embryo in D. (F) Transverse sections of HH18 emu embryo overexpressing Fgf10. Red staining indicates electroporation tracer. Arrows point to regions of ectopic Fgf8 expression. Scale bars in A, C, D, E 1 mm. Scale bars in B, F 100 μm.

We next turned to ATAC-seq to gain insight into the cause of the differences in Fgf10 expression in the HH18 chick and emu forelimb buds. Chicken, a neognath, and emu, a palaeognath, split from their most recent common ancestor around 100–110 million years ago [10]. Given this extensive evolutionary divergence time, a set of unambiguous ATAC-seq peaks was generated for each species once peaks had been called and lifted over to the chicken as described in the methods. These peak regions either: (i) contained a peak in all three biological replicates for a given tissue (strict dataset), or (ii) contained no peak in all three biological replicates for a given tissue. Using this filtering strategy to identify consistently open or closed regions of chromatin corresponding to putatively active or inactive regulatory elements within the forelimb, flank, and hindlimb of the HH18 chicken and emu, we obtained a final set of 27,669 ATAC-seq peaks with 6,197 shared between at least one tissue for chicken and emu (Figure 5A).
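
The replicate-consistency rule described above can be sketched as follows. The function name `classify_region` and the status labels are hypothetical; the rule itself (a peak in all three replicates, or in none) is taken from the text, and the peak counts used in the final check are those reported in the Figure 5A legend.

```python
def classify_region(replicate_calls):
    """Classify one region in one tissue from its per-replicate peak calls.

    replicate_calls: booleans, one per biological replicate
    (True = a peak was called in that replicate).
    """
    calls = list(replicate_calls)
    if not calls:
        raise ValueError("at least one replicate is required")
    if all(calls):
        return "open"        # peak in every replicate (strict dataset)
    if not any(calls):
        return "closed"      # no peak in any replicate
    return "ambiguous"       # inconsistent replicates, excluded

print(classify_region([True, True, True]))     # open
print(classify_region([False, False, False]))  # closed
print(classify_region([True, False, True]))    # ambiguous

# Inclusion-exclusion check on the counts reported in the Figure 5A legend:
chicken_peaks, emu_peaks, shared_peaks = 20507, 13359, 6197
total_peaks = chicken_peaks + emu_peaks - shared_peaks
print(total_peaks)  # 27669
```

Regions with inconsistent replicates are simply left out, so that both presence and absence calls in the downstream filters rest on agreement across all replicates.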

Figure 5. ATAC-seq results identify putative enhancers with differential accessibility in the chicken and emu lateral plate mesoderm at HH18.


(A) Top: Distribution of non-ambiguous ATAC-seq peaks between all tissues (forelimb, flank, and hindlimb) of chicken and emu at HH18. Of the 20,507 and 13,359 non-ambiguous peaks in chicken and emu respectively, 6,197 are shared between the species in at least one tissue. Bottom: Distribution of non-ambiguous ATAC-seq peaks within the tissues of each species indicating that the majority of all non-ambiguous peaks are shared in both the limb and flank regions. (B) Matrix representation of presence/absence of a subset of all non-ambiguous peaks in Figure 5A broken into 4 classes of interest (I-IV). I: Shared or species specific ATAC-seq peaks found in all tissues for a given species. For example, of the 6,197 peaks shared between emu and chick from Figure 5A Top, 5,563 are shared across all tissues. II: Shared forelimb and hindlimb peaks are the most common among chicken-specific peaks found in two tissues. III: Shared forelimb and flank peaks are the most common among emu-specific peaks found in two tissues. IV: 59 ATAC-seq peaks fall into the category of emu forelimb missing, open in proliferating limbs (emu hindlimb, chicken forelimb, and hindlimb), but closed in the less proliferative flanks and the emu forelimb, while 0 peaks exist if we change to filter on both chicken limbs and the emu forelimb. Additionally, there are 329 peaks found only in the emu forelimb region and nowhere else. (C) ATAC-seq data for a differentially accessible putative enhancer upstream of FGF10. This peak is classified as emu forelimb missing, given its presence in proliferating limbs and absence in less proliferative tissues including the flank. (D) ATAC-seq data for a second emu forelimb missing putative enhancer upstream of SALL1. See also Figure S6 and Tables S2–S4.

By filtering peak regions into different classes, compelling candidate regions for species-specific chromatin differences in early forelimb development emerged. For example, class II and III ATAC-seq patterns in Figure 5B demonstrate that when filtering for unambiguous species-specific signals, the chicken forelimb and hindlimb peaks are most commonly found together, as opposed to either limb sharing a peak region with the flank, but that in emu this pattern flips: the most common sharing of ATAC-seq peaks is between the forelimb and flank, the two tissues growing at a lower rate. To identify peaks that might be relevant to the regulation of the proliferation of limb mesenchyme, we focused on ATAC-seq peaks present in the emu hindlimb and both chicken limbs but absent from the emu forelimb and the flank somatopleure of both species. We thus filtered our ATAC-seq peaks from the total peak dataset to include the subset of peaks that appear in the chicken strict dataset for forelimb and hindlimb, and the emu strict dataset for hindlimb, while also being absent from all strict flank datasets and the strict emu forelimb peaks (Table S2). To make this filter even more stringent, we also asked that the peak be absent from all six individual flank libraries, the two flank pool libraries, as well as the three emu forelimb individual libraries and one emu forelimb pool dataset (class IV of Figure 5B). This filtering resulted in a list of 59 genomic regions (Table S2) displaying open chromatin in all highly proliferative limb buds – the emu hindlimb and the chicken fore- and hindlimb – while exhibiting closed chromatin in the less proliferative emu forelimb and flank tissues of both species.
To assess whether this small number of peaks could be a simple artifact of the large number of peaks called in both genomes, we filtered for the same chicken pattern - open in both limbs and closed in flank - while swapping the filtering on the emu limbs (open in forelimb, closed in hindlimb and flank). This filtering strategy resulted in 0 peaks, suggesting that emu forelimb missing regions are likely biologically relevant to the phenotype. Finally, the 329 genomic regions containing a peak only in the emu forelimb were also considered as candidates. These candidate regions were then linked to their nearest coding gene and designated as “top priority” if the gene was differentially expressed in the emu forelimb compared to hindlimb but not differentially expressed in the chicken forelimb compared to hindlimb. This strategy uncovered 12 compelling candidate regions in the emu forelimb missing dataset, 10 of which also overlapped the previously published active ChIP-seq datasets [30] for H3K27ac or H3K4me1, further strengthening the hypothesis that these elements possessed enhancer activity (Table S3). The top candidates from this analysis were found to neighbor Fgf10 (Figure 5C) and Sall-1 (Figure 5D). Fgf10 was a strong candidate based on its known key role in limb patterning and on our previous analysis (above). Sall-1 is also a known patterning gene, expressed in the limb bud mesenchyme downstream of Wnt and Fgf signaling [31]. Further, our RNA-seq data showed Sall-1 to be one of the most differentially expressed genes, active in the emu hindlimb but not forelimb at HH18 (−log2Fold change of 2.21 and padj of 8.8e−20), whereas it is not differentially expressed between the chicken fore- and hindlimbs (−log2FoldChange −0.17 and padj 0.98).
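
The pattern filter, its swapped control, and the prioritization step described above can be sketched as below. The region identifiers, gene names, and helper functions (`make_pattern`, `matches`, `is_top_priority`) are hypothetical and serve only to make the logic concrete; the actual filtering also required absence from the individual and pooled libraries, which is omitted here for brevity.

```python
def make_pattern(emu_forelimb, emu_hindlimb):
    """Chicken pattern is fixed (open in both limbs, closed in flank);
    the emu limb states are supplied by the caller."""
    return {
        ("chicken", "forelimb"): "open",
        ("chicken", "hindlimb"): "open",
        ("chicken", "flank"): "closed",
        ("emu", "forelimb"): emu_forelimb,
        ("emu", "hindlimb"): emu_hindlimb,
        ("emu", "flank"): "closed",
    }

EMU_FORELIMB_MISSING = make_pattern(emu_forelimb="closed", emu_hindlimb="open")
SWAPPED_CONTROL = make_pattern(emu_forelimb="open", emu_hindlimb="closed")

def matches(status, pattern):
    """True if every (species, tissue) call equals the required state."""
    return all(status.get(key) == state for key, state in pattern.items())

def is_top_priority(gene, emu_de_genes, chicken_de_genes):
    """Nearest gene is forelimb/hindlimb DE in emu but not in chicken."""
    return gene in emu_de_genes and gene not in chicken_de_genes

# Hypothetical regions with made-up nearest genes.
regions = {
    "region_1": (dict(EMU_FORELIMB_MISSING), "gene_x"),
    "region_2": ({**EMU_FORELIMB_MISSING, ("emu", "flank"): "ambiguous"}, "gene_y"),
    "region_3": (make_pattern("open", "open"), "gene_z"),
}
emu_de_genes = {"gene_x", "gene_z"}
chicken_de_genes = {"gene_z"}

candidates = [r for r, (s, _) in regions.items() if matches(s, EMU_FORELIMB_MISSING)]
control = [r for r, (s, _) in regions.items() if matches(s, SWAPPED_CONTROL)]
top = [r for r in candidates
       if is_top_priority(regions[r][1], emu_de_genes, chicken_de_genes)]
print(candidates, control, top)
```

Running the swapped pattern through the same function is what makes the control informative: it applies an equally stringent six-tissue requirement, so a large difference in the number of matches is unlikely to reflect the stringency of the filter alone.
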
Accordingly, we hypothesized that differential chromatin accessibility at these regions results in, or at least contributes to, the observed low (but not absent) Fgf10 expression present in the emu forelimb at HH18 and that restoration of Fgf signaling at later stages acts through enhancers in this region. To that end, the regions containing the peaks identified by ATAC-seq upstream of Sall-1 and Fgf10 were chosen to test for enhancer activity. Regions selected for enhancer analysis were cloned upstream of a minimal β-actin promoter driving GFP. Since our analyses were conducted on forelimb tissue, we electroporated constructs carrying putative enhancers into the primordium of the chick wing. In doing so, the enhancer nearest to Fgf10 failed to drive expression but the Sall-1 enhancer drove gfp expression in the limb bud robustly by HH18 (Figure 6A) and continued to drive expression through HH22 (Figure 6C).

Figure 6. The ATAC-seq peak neighboring Sall-1 has enhancer function in the developing chicken limb, which is initially dependent on an intact Ets transcription factor binding site.


(A) HH18 chick embryo electroporated with a tracer (red) and enhancer reporter containing the Sall-1 peak (green), representative of five separate experiments. (B) HH18 chick embryo electroporated with a tracer (red) and enhancer reporter containing the Sall-1 peak but with a mutated Ets site (green), representative of three separate experiments. (C) HH22 chick embryo electroporated with a tracer (red) and enhancer reporter containing the Sall-1 peak (green), representative of four separate experiments. (D) HH22 chick embryo electroporated with a tracer (red) and enhancer reporter containing the Sall-1 peak but with a mutated Ets site (green), representative of three separate experiments. Scale bars in A-D 1 mm. Scale bars in A’-D’ 100 μm. See also Table 1 and Figures S4 and S5.

Given our finding that Fgf10 overexpression induces a precocious limb bud in the emu (Figure 4C), we looked for potential Fgf response elements in the Sall-1 enhancer. Fgf signaling in the limb is known to relay through the MAPK pathway, culminating in the activation of Ets family transcription factors [32]. A Transfac analysis predicted an Ets family transcription factor-binding motif in the center of the putative Sall-1 enhancer (Figure S4B). To validate this prediction and to test for differential transcription factor binding at that site, footprinting was carried out with the Protein Interaction Quantification Program (PIQ [33]). When PIQ-predicted transcription factor (TF) binding events were filtered as carried out for the ATAC-seq data – looking for TF footprints that were present consistently within the proliferative emu hindlimb and both chick limbs, while also missing in the lowly proliferating flanks and the emu forelimb – 8 differentially footprinted TFs were identified (Table 1). Of these 8 factors, 3 are members of the Ets transcription factor family and were identified by PIQ to bind starting at base NC_006098.3:5333038 in galGal4, which corresponds to the Transfac predicted binding sites for Ets family members. Also in this pileup of Transfac predicted binding sites is a motif for Tbx5, but this is not recovered as footprinted, differential or otherwise, in the PIQ analysis (Figure S4A).

Table 1: PIQ Foot-printing analysis results.

PIQ-predicted differential transcription factor binding events within the differentially accessible chromatin region neighboring Sall-1. Footprints scored as binary presence (1) or absence (0). Asterisks indicate Ets family transcription factors.

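pwmTF | Chicken hindlimb | Chicken forelimb | Chicken flank | Emu hindlimb | Emu forelimb | Emu flank
ELK4* | 1 | 1 | 0 | 1 | 0 | 0
ERG* | 1 | 1 | 0 | 1 | 0 | 0
FLI1* | 1 | 1 | 0 | 1 | 0 | 0
HOXC13 | 1 | 1 | 0 | 1 | 0 | 0
INSM1 | 1 | 1 | 0 | 1 | 0 | 0
Mlxip | 1 | 1 | 0 | 1 | 0 | 0
NR2F1 | 1 | 1 | 0 | 1 | 0 | 0
NR2F1 | 1 | 1 | 0 | 1 | 0 | 0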

The necessity of this site for driving gene expression was tested by mutating the core Ets motif while maintaining the neighboring motifs, including that for Tbx5, followed by electroporation of the modified region into the developing chick wing (Figure S5AC). Mutation of the Ets motif completely abrogated enhancer activity at HH18 (Figure 6B) suggesting that this site is required for activity at this early time point. However, by HH22 the mutated enhancer gained activity and was competent to drive gfp expression (Figure 6D). A similar pattern was observed when we electroporated the wildtype chicken putative Sall-1 enhancer into the developing emu forelimb field; at HH18, there is no gfp activity (Figure S5AB), but as the limb begins to bud out from the body wall, likely in response to increased Fgf10 and Fgf8 expression, gfp is present at HH22 (Figure S5CD). Taken together, our results provide a mechanism where diminished mesodermal Fgf expression in the emu forelimb results in a temporary failure to induce limb-specific genes and a subsequent reduction in proliferation.

Discussion

The small wing of the emu is a defining characteristic of this flightless bird. Accordingly, much attention has been devoted to how limb development has been altered in the emu to generate such a vestigial wing. Here, through the use of genomic and experimental methods, we demonstrate that attenuated Fgf signaling with a consequent decrease in early mesenchyme proliferation contributes to this phenotype through a delay in limb bud development. Conversely, we find that the very earliest steps in wing development, the induction of the wing and muscle precursor migration, occur at stereotypical stages. Thus, the heterochrony in wing bud outgrowth in the emu is accomplished through pausing after the initial events of limb development have been established. On a molecular level, this work points to a model whereby the emu wing is induced via TBX5 and sufficient FGF10 to trigger EMT but insufficient FGF10 to induce ectodermal Fgf8 expression. This, in turn, results in lower proliferation of the limb mesenchyme and failure of the HH18 emu forelimb to emerge from the body wall. Proliferation is, however, induced, and outgrowth initiated, following accumulation of FGF10 to a level that promotes Fgf8 expression.

We found reduced chromatin accessibility near the Fgf10 locus specifically in the emu forelimb. This region overlaps a previously published HH21 H3K27ac ChIP-seq peak, suggesting enhancer activity that would be a likely cause for the reduced Fgf10 expression observed in our RNA-seq data. Further, we identified an enhancer near the limb factor Sall-1 that requires an intact Ets transcription factor-binding site to drive expression at nascent limb bud stages. This suggests a model where in the evolution of loss of flight in the emu, changes in the regulation of Fgf10 resulted in lower forelimb expression and a subsequent delay in both proliferation and limb gene expression. Our experiments cannot discriminate between a model where changes in the trans environment result in reduced access to the Fgf10 locus or cis-changes in the Fgf10 locus result in a failure of trans factors to bind. However, both of these models provide an explanation for how the emu alters its forelimb by reduction of Fgf10 without affecting the development of the hindlimb.

It is curious that the genomic regions around Fgf10 identified in our screen for differentially accessible chromatin failed to drive reporter gene expression in our assay; however, several hypotheses could explain this result. It is possible that the regions we identify are necessary for native enhancer activity but alone are not sufficient. There is precedent for such a model. For example, Gli response elements in the Ptch1 locus contribute to a limb regulatory region that is responsible for increased posterior Ptch1 expression in response to Shh. However, these isolated elements, when tested alone, fail to recapitulate the Ptch1 expression domain [34]. Similarly, the regions we identify as differentially accessible may act as part of a larger regulatory landscape that together drives expression. Alternatively, despite the lack of repressive ChIP-seq marks in previously published data, it is possible that some of the regions we identified could be binding sites for transcriptional repressors, which would not activate gfp in our assays. Precise control via interplay of both transcriptional activation and repression is required to ensure proper spatiotemporal gene expression. It also remains possible either that our assay is not sensitive enough or that plasmid expression is not regulated in the same fashion as the endogenous locus.

Recently, Nkx2–5 was discovered to be highly expressed in the forelimb of HH21 emus and it was suggested that this ectopic expression could be key to the development of the diminutive wing. Indeed, viral overexpression of Nkx2–5 in the chick does result in a smaller wing [21]. While we found that Nkx2–5 is detectable in the emu forelimb as early as HH18 (Figure 3B), its expression at this stage is very low and below significance thresholds. Given that the emu forelimb is already delayed by HH18, the ectopic expression of Nkx2–5 is unlikely to be central to the delay in limb bud outgrowth. By HH19, Nkx2–5 expression is more prominent, showing a complementary domain to the forelimb gene Tbx5 (Figure S2A), but an overlapping domain with the muscle precursor marker Pax3 (Figure S2B). Consistent with this, we, and at later stages, others [22], found no differences in the timing of muscle pioneer migration between the chick and the emu, and by HH30, Nkx2–5 expression is found in two distinct dorsal and ventral domains [21], resembling the domains of the dorsal and ventral muscle masses. Transplantation of chick LPM into the emu forelimb field produces an earlier limb bud in the emu but does not prevent ectopic Nkx2–5 expression (Figure S3D). Additionally, overexpression of Nkx2–5 in the chick LPM has no effect on the timing of limb bud outgrowth, further supporting the conclusion that Nkx2–5 does not play a role in the emu forelimb heterochrony. Our data, taken alongside these previous observations, strongly suggest that ectopic Nkx2–5 expression in the emu forelimb is specific to the myoblasts, which migrate into the LPM, and that the presence of Nkx2–5 cells in the developing emu forelimb bud does not have an inhibitory effect on outgrowth. 
Given the interactions of the muscles with the skeletal elements of the limb [25], altered expression of Nkx2–5 in the muscle could result directly or indirectly in the skeletal abnormalities observed following Nkx2–5 overexpression [21] without having any effects on the timing of limb development.

A delay in Tbx5 expression has also been previously linked to the delay in emu forelimb development, but both our in situ and RNA-seq data place Tbx5 in the forelimb of emu at HH18, a result also corroborated by in situ of HH19 emus published by Farlie et al [21]. A third, previously proposed explanation for the small emu wing is that a delay in Shh expression causes limb bud heterochrony [22]. Although Shh and Ptch1/2 are expressed at a lower level in the HH18 emu forelimb, oligozeugodactyly chick mutants with no expression of Shh in the limb initiate limb bud outgrowth indistinguishable from wild type embryos [35]. Taken together, a role for attenuated FGF signaling remains the most likely explanation for the delayed limb bud outgrowth observed in the emu.

Differential Fgf signaling between the forelimb and hindlimb has been observed in other species. Marsupials develop and differentiate their forelimbs before their hindlimbs as an adaption to facilitate travel into the pouch of the mother following birth. Accordingly, the forelimb of the gray short-tailed possum (Monodelphis domestica) expresses Shh, Fgf10, and Fgf8 early to induce its proliferation [36,37]. Together with our results, this suggests that plasticity in the timing and amount of Fgf signaling is a recurring mechanism to alter limb development in evolution. Yet, despite a recurring role for Fgf in timing of limb development, it remains to be seen if it could be a common mechanism in the loss of flight in birds. Ratites have undergone multiple independent losses of flight throughout their evolution [9,10,38]. Accordingly, from the complete absence of wings in moa through the diminutive wings of emus and kiwis to the larger wings of ostriches and rheas, there is a broad diversity in wing morphology within this group. Our finding of altered Fgf10 expression is likely a secondary change, once the full development of the wing became expendable following loss of flight in the emu/cassowary clade. Finally, our interdisciplinary results provide a rich resource for the further study of how alterations in genomes result in phenotypic changes.

STAR Methods

Lead contact and materials availability

Further information and requests for resources such as recombinant DNA plasmids generated in this study should be directed to and will be fulfilled by the Lead Contact, Clifford J. Tabin (tabin@genetics.med.harvard.edu).

Experimental model and subject details

Avian Embryos

White leghorn chicken embryos were obtained from Charles River (MA) and emu embryos were obtained from Floeck’s Country Farms (Tucumcari, NM). Roslin GFP eggs [48] were obtained from Susan Chapman at Clemson University. Staging of chick embryos was carried out according to the HH staging series [49] and emus were staged according to [15]. Chicken embryos were incubated at 38°C and emu embryos were incubated at 35°C until they reached the appropriate stages (details are given in the Method Details section).

Animals

All experiments and animal care were carried out according to institutional IACUC guidelines at Harvard Medical School.

Method Details

Electroporation and microsurgery

Coeloms of chick or emu embryos at HH13 and HH14 were injected with DNA mixtures and electroporated using modifications to standard procedures. Briefly, three 1.0 millisecond 50 V pulses were applied to injected embryos using custom electrodes (Bulldog Bio) [50]. Enhancer screens were carried out using a β-actin GFP vector [39,40], with the plasmid kindly provided by Mikiko Tanaka. In situ hybridization was carried out according to [51]. Transplants were carried out using fine watchmaker’s forceps and flame-polished tungsten needles.

Histology and EDU incorporation assays

Chick and emu embryos were windowed at HH18 and 100 μL of 20 mM EDU was injected between the embryo and the yolk. Embryos were incubated for 30 minutes following injection and then fixed in 4% formaldehyde and embedded in 7.5% gelatin and 7.5% sucrose. EDU was visualized using the Click-It® system (ThermoFisher). Anti-laminin (Sigma L9393) was used at 1:50, phalloidin (ThermoFisher A12381) was used at 1:100, anti-Pax3 (R&D Systems AF2457) was used at 1:50, and anti-GFP (Abcam Ab6556) and anti-RFP (Sigma SAB2702214) were used at 1:500.

In situ hybridization probe generation and quantitative PCR

In situ probes for emu transcripts were generated using transcript-specific PCR primers with added T7 sequences to synthesize anti-sense DIG-labelled RNA probes. All PCR probe templates were cloned into pGEM-T easy (Promega) and sequence confirmed. Embryos for whole mount in situ hybridization were permeabilized with 10 μg/mL of proteinase K for 20 min, refixed in 4% formaldehyde and then hybridized overnight at 70°C. Unbound probe was washed out with SSC washes and the embryos were then blocked in Blocking Reagent (Millipore) and incubated with anti-DIG-AP antibodies overnight (1:2000). Unbound antibodies were washed out with TBST washes and bound probe was visualized with BM purple (Sigma). Embryos for section in situs were embedded in 50% sucrose/50% OCT and 14 μm sections were cut. The tissue was permeabilized with 1 μg/mL proteinase K for 20 min, refixed in 4% formaldehyde and hybridized overnight at 60°C. Unbound probe was washed out with SSC washes and the sections were then blocked in Blocking Reagent (Millipore) and incubated with anti-DIG-POD or anti-fluorescein-POD antibodies overnight (1:300). Unbound antibodies were washed out with TBST washes and bound probe was visualized with Cy3 or fluorescein TSA kit reagents (Perkin Elmer), respectively. For qPCR, total RNA was collected from dissected forelimbs and hindlimbs of HH18 chick and emu embryos. cDNA was synthesized using iScript (BIO-RAD) and amplified on a CFX96 light-cycler (BIO-RAD) using SYBR Green Supermix (BIO-RAD). All primer sets were sequence confirmed and tested for amplification linearity using serial dilutions of cDNA and resulted in similar CT values. The same amount of input cDNA was used for cross-species comparisons. All primer sequences used in this work can be found in Table S5.
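The linearity test on serial cDNA dilutions mentioned above is commonly summarized with a standard curve. The sketch below is an illustration only, not the authors' analysis: it assumes the usual convention of regressing Ct on log10(input) and estimating efficiency as 10^(-1/slope) - 1, and the dilution series and Ct values are hypothetical.

```python
import math


def standard_curve(log10_inputs, ct_values):
    """Fit Ct = intercept + slope * log10(input) by least squares.

    Returns (slope, r_squared, efficiency), where efficiency is
    10 ** (-1 / slope) - 1, i.e. 1.0 for perfect doubling each cycle.
    """
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ct_values)
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(log10_inputs, ct_values))
    r_squared = 1.0 - ss_res / ss_tot
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, r_squared, efficiency


# Hypothetical 10-fold dilution series: an ideal primer pair gains
# log2(10) cycles (about 3.32) for every 10-fold drop in template.
dilutions = [0.0, -1.0, -2.0, -3.0]
cts = [20.0 + math.log2(10) * k for k in range(4)]
slope, r_squared, efficiency = standard_curve(dilutions, cts)
```

Primer pairs with similar slopes and R² close to 1 in both species are what would justify comparing Ct values across chick and emu with equal cDNA input.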

RNA-Seq library preparation

For HH18 emu and chick embryos, total RNA was harvested from four replicates each of LPM from the forelimb, flank, and hindlimb regions (somites 15–20, 21–25, and 26–30 in chicken and 21–25, 26–34, and 35–40 in emu) via lysis in Trizol followed by isolation using RNeasy micro columns (QIAGEN). Only three replicates of flank tissue were collected from the emu. Libraries were synthesized from 200 ng total RNA using the stranded TruSeq kit (Illumina) followed by 15 cycles of amplification. Libraries were pooled and single-end 75 bp reads were sequenced on a NextSeq High 75 flowcell (Illumina). In the chick, sequencing generated an average of 32.7, 52.9, and 29.1 million reads per replicate for the forelimb, flank, and hindlimb libraries, respectively. The emu libraries generated an average of 28.5, 20.7, and 27.8 million reads per replicate for the forelimb, flank, and hindlimb libraries, respectively. For earlier forelimb LPM (HH10), somatopleure and splanchnopleure (HH13) samples, total RNA was harvested from three replicates of each tissue using PicoPure columns (Arcturus) and 200–300 bp insert libraries were synthesized from 50 ng of total RNA by automation using KAPA reagents followed by 15 cycles of amplification. Only two replicates of emu splanchnopleure were collected. Libraries were sequenced as paired-end 37 bp reads on a NextSeq High 75 (Illumina). In the chick, this generated an average of 14.8, 16.1, and 14.5 million reads per replicate for the LPM, somatopleure, and splanchnopleure libraries, respectively. The emu libraries generated an average of 18.6, 16.7, and 17.5 million reads per replicate for the LPM, somatopleure, and splanchnopleure libraries, respectively.

ATAC-seq library preparation

ATAC-seq library preparation was carried out according to Buenrostro et al. (2015) [52] with minor modifications. Three biological replicates were generated for each of the following tissues from chicken and emu: HH18 forelimb somatopleure, HH18 flank somatopleure, and HH18 hindlimb somatopleure. These tissues were chosen to represent proliferating and non-proliferating mesoderm and overlying ectoderm at the first stage where limb development differs dramatically between chicken and emu (following EMT and before the emu forelimb begins to bud from the body wall). Single cell suspensions were generated as follows: tissue was dissected from each embryo in cold PBS. Tissue was immediately transferred to 1x Trypsin in EDTA Solution (Sigma) for 10–15 minutes at room temperature. Following trypsinization, the tissue was transferred to a neutralizing culture medium (DMEM (Gibco) with 10% FBS (Gibco) and 1% Pen Strep (Gibco)) and pipette-mixed until homogenized. The homogenate was filtered through 35 μm nylon mesh filters and the filtrate was taken immediately to cell counting.

Cells were counted on an inverted light microscope using a hemocytometer. Cells to be counted were diluted in Trypan Blue (Sigma) to distinguish living cells from dead. Approximately 50,000 live cells were isolated and utilized for each library. Nucleus isolation, transposition, and reaction cleanup were carried out with no modifications. Briefly, transposition utilized the Tn5 transposase and buffer from the Nextera DNA Library Preparation Kit (Illumina). Following transposition reaction cleanup, all libraries were PCR amplified for 11 cycles. PCR reactions were cleaned using the MinElute PCR Purification Kit (Qiagen). Libraries were visualized on a HS DNA Bioanalyzer Chip (Agilent) and quantified using a Library Quantification Kit (KAPA Biosystems) for multiplexing. ATAC-seq libraries were pooled and paired-end 75bp reads were sequenced on a NextSeq High 150 flowcell (Illumina). The average read depth per each replicate across all tissues from both species was 38.6 million reads. For the chicken, an average of 38.1, 36.1, and 43.6 million reads were sequenced for the forelimb, hindlimb and flank libraries respectively. For the emu, an average of 34.1, 37.9, and 41.9 million reads were sequenced for the forelimb, hindlimb and flank libraries respectively.
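The live-cell counting step can be illustrated with the standard hemocytometer calculation. This is a sketch under stated assumptions, not a protocol from the paper: it assumes a chamber in which each large square holds 0.1 μL (so counts scale by 10^4 to cells/mL), and the square counts and the 1:1 Trypan Blue dilution are hypothetical.

```python
def live_cells_per_ml(live_counts_per_square, dilution_factor):
    """Concentration of unstained (live) cells in the undiluted suspension.

    Assumes each counted large square corresponds to 0.1 uL (1e-4 mL).
    """
    mean_count = sum(live_counts_per_square) / len(live_counts_per_square)
    return mean_count * dilution_factor * 1e4


def volume_ul_for_cells(n_cells, cells_per_ml):
    """Volume in microlitres that contains n_cells at the given concentration."""
    return n_cells * 1000 / cells_per_ml


# Hypothetical counts from four large squares, cells diluted 1:1 in Trypan Blue.
concentration = live_cells_per_ml([48, 52, 50, 50], dilution_factor=2)
volume_ul = volume_ul_for_cells(50_000, concentration)
```

With these made-up counts the suspension holds one million live cells per mL, so the roughly 50,000 cells used per library would correspond to 50 μL.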

Quantification and statistical analysis

RNA-seq analyses

Calling of single copy orthologs is the first step for multi-species RNA-seq analyses. Here, we utilized an in-house emu assembly that had been annotated using MAKER (see Sackton et al. 2019, [38]) and the NCBI galGal4 genome. Subsequently, the emu assembly was submitted to NCBI and is now available as droNov1 here: (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/342/905/GCF_003342905.1_droNov1). A common means of calling orthologs between one species with an annotated genome and another species with an unannotated genome is using BLAST [21] but for our approach we took advantage of the in-house resources associated with the emu genome to link the emu transcript models generated by MAKER to named genes in the galGal4 assembly. The whole genome alignment can be added to My Hubs on UCSC (https://genome.ucsc.edu/cgi-bin/hgHubConnect) with the following link (https://ifx.rc.fas.harvard.edu/pub/ratiteHub8/hub.txt).

In order to obtain robust single-copy orthologs between chicken and emu, two approaches were utilized: liftover (LO) and hierarchical orthologous groups (HOGs). First, for the LO approach, genomic regions with homology to chicken exons were identified in emu through halLiftover [47] using an in-house whole genome alignment (WGA) from Sackton et al. 2019. These genomic intervals in emu were then intersected with the genomic locations of predicted gene models from MAKER for the emu using bedtools [45,53]. Finally, emu genes were assigned to a given chicken gene if the exons from chicken lifted over to a single emu MAKER gene model and if that emu MAKER gene model did not intersect with more than one chicken gene. This final filtering step was carried out using a custom python script (emuExonLO_intersect_parser_Annotated2018.py). The HOG approach utilized hierarchical orthologous groups first identified using the program OMA [54] in Sackton et al. 2019. An individual HOG is associated with a set of genes that arise from a common ancestor and can contain anything from a single gene to an entire gene family across various species. Concurrently with the present manuscript, RNA-seq is being conducted by PG on emu, chick, and three additional palaeognaths for a separate project. In calling orthology across these five species, a pipeline was generated to identify orthologs from individual HOGs and the final emu-chicken single-copy orthologs were taken from that dataset for this manuscript. Briefly, for each HOG identified in Sackton et al. 2019, it was determined if the HOG had a single entry for each of the four palaeognaths and chicken. If a given HOG only had one gene listed for each species, these genes were called orthologs. If a HOG contained multiple entries for any of the five species, the coding sequences of all genes for the four palaeognaths and chicken were aligned using RAxML [55] and the resulting trees were parsed using an in-house R script (APE.R). If five-species monophyletic clades existed in the resulting trees, these clades were pruned from the tree and counted as single-copy orthologous genes within these five species (and therefore in chicken and emu for the purposes of this manuscript). These three lists, namely LO, single HOGs, and duplicate HOGs (which represent HOGs containing gene families), were compiled using an in-house R script (Emu_Chick_hh18_Oct_update.R). Of the 7392 genes that were present in both a LO and HOG gene list, only 8 represented mismatches between the two lists. This high level of congruence gives us confidence in both independent approaches to calling orthologs across this evolutionary distance. Due to the complexities associated with whole genome alignment, and the possibility of overlapping chicken genes on the forward and reverse strands, any chicken genes that disagreed between the two lists (n=8) were visually inspected in the genome browser and the HOG entry for these genes was maintained, while the LO entry was deleted. Following this filtering step, any emu genes that had been linked to more than one chicken gene had all copies removed. This final filtering step resulted in a list of 12,315 single-copy orthologs between chicken and emu. RNA-seq reads were mapped using STAR [41] to reference genomes (galGal4 or droNov1) followed by quantification via FeatureCounts [42]. Differential gene expression was determined using DESeq2 [43].
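The one-to-one rule used in the liftover approach (a chicken gene's exons must land in a single emu gene model, and that emu model must not intersect any other chicken gene) can be sketched as follows. This is a minimal illustration, not the authors' emuExonLO_intersect_parser_Annotated2018.py script; the input format and the emu gene identifiers are hypothetical.

```python
from collections import defaultdict


def one_to_one_orthologs(exon_hits):
    """Keep chicken-emu gene pairs that are unique in both directions.

    exon_hits: iterable of (chicken_gene, emu_gene) tuples, one per
    lifted-over chicken exon that intersects an emu gene model.
    """
    chicken_to_emu = defaultdict(set)
    emu_to_chicken = defaultdict(set)
    for chicken_gene, emu_gene in exon_hits:
        chicken_to_emu[chicken_gene].add(emu_gene)
        emu_to_chicken[emu_gene].add(chicken_gene)

    pairs = {}
    for chicken_gene, emu_genes in chicken_to_emu.items():
        if len(emu_genes) != 1:
            continue  # exons spread over several emu gene models
        (emu_gene,) = emu_genes
        if len(emu_to_chicken[emu_gene]) == 1:
            pairs[chicken_gene] = emu_gene  # unique in both directions
    return pairs


# Hypothetical intersections; only FGF10 is unambiguous.
hits = [
    ("FGF10", "emu_g1"), ("FGF10", "emu_g1"),  # two exons, same emu model
    ("SALL1", "emu_g2"), ("SALL1", "emu_g3"),  # split across two models
    ("TBX5", "emu_g4"), ("TBX4", "emu_g4"),    # emu model hit by two genes
]
orthologs = one_to_one_orthologs(hits)
```

Dropping every ambiguous gene in both directions trades some coverage for confidence, which matches the conservative intent described above.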

ATAC-seq analyses

Raw ATAC-seq reads were trimmed using NGmerge (https://github.com/jsh58/NGmerge) and mapped to the galGal4 or droNov1 genomes using Bowtie2 with the -X 2000 option [44]. Following mapping, duplicates were removed with Picard (https://broadinstitute.github.io/picard/) and mitochondrial reads were removed with removeChrom (https://github.com/jsh58/harvard/blob/master/removeChrom.py). MACS2 was utilized for peak calling using the following pipeline to identify consistent peaks between biological replicates for a given tissue: (1) each individual library (n=18) was passed through MACS2 [56] (https://github.com/taoliu/MACS) with a relaxed significance threshold (p-value < 0.05), (2) the biological replicates (n=3) for each tissue (n=6) were pooled together and passed through MACS2 with a stringent significance threshold (q-value < 0.05), (3) peak boundaries were defined by the peaks called for the pooled dataset, (4) bedtools intersect and bedtools annotate were utilized to identify pooled peaks (from step 2) that overlapped with peaks called individually for the three biological replicates in step 1, (5) a peak was only considered significant for a given tissue if it was called in the pooled dataset (q-value < 0.05), overlapped peaks in all three biological replicates, and also possessed a stringent significance value (q-value < 0.05) across all three individual biological replicates (as identified in step 1). It has previously been shown that ATAC-seq data are highly enriched at transcription start sites (TSSs). Using the Genome Association Tester (GAT), we determined that our libraries were of high quality based on their overlap with TSSs in both the chicken and emu genomes [46] (Figure S6).
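The replicate-consistency requirement in steps 4 and 5 can be illustrated with a small overlap filter. The sketch below covers only the overlap criterion (a pooled peak must overlap a peak in every replicate) and omits the q-value checks; it is not the authors' bedtools-based pipeline, and the coordinates are hypothetical half-open BED-style intervals.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def consistent_peaks(pooled_peaks, replicate_peak_sets):
    """Keep pooled peaks supported by at least one peak in every replicate."""
    return [
        peak for peak in pooled_peaks
        if all(any(overlaps(peak, rep_peak) for rep_peak in rep_peaks)
               for rep_peaks in replicate_peak_sets)
    ]


# Hypothetical peaks for one tissue with three biological replicates.
pooled = [("chr1", 100, 300), ("chr1", 1000, 1200)]
replicates = [
    [("chr1", 150, 250)],
    [("chr1", 90, 120), ("chr1", 1050, 1100)],
    [("chr1", 290, 400)],
]
kept = consistent_peaks(pooled, replicates)
```

Here the second pooled peak is dropped because only one replicate supports it, while the boundaries of the retained peak still come from the pooled call, as in step 3.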

To identify homologous genomic regions containing peaks between chicken and emu, a previously generated whole-genome alignment was utilized (Sackton et al. 2019). Strict peak files from emu were used as the query for halLiftover against chicken [47]. Emu liftover peaks (regions from the emu genome that contained ATAC-seq peaks and adequate levels of conservation to be assigned homology to the chicken genome) were then filtered to not include any inserts larger than the largest ATAC-seq peak in the chicken dataset (~10 kb). Following that filtering step, liftover peaks were sorted with bedtools sort and merged using the -d 1 option in bedtools merge. Finally, small peaks (<50 bp) were also removed from emu liftover peaks to generate the final emu liftover ATAC-seq peak set.

To generate a total peak dataset, strict chicken ATAC-seq peaks and strict emu liftover ATAC-seq peaks were concatenated, sorted and merged as above. This generated a single set of bed intervals referenced to the chicken genome for all peaks across all tissues such that any base contained within a strict chicken peak or strict emu liftover peak from any tissue would be represented in the total peak set. For example, if a chicken forelimb peak was called from bases 100–250 and an emu hindlimb peak was called after liftover from bases 249–500 on a given chromosome, a single bed interval would be generated in the total peak set from bases 100–500 to avoid having the same bed intervals represented numerous times in the total peak dataset. This total peak dataset was then used as a backbone onto which the ATAC-seq peak coordinates from the strict chicken peaks, strict emu liftover peaks, individual and pooled chicken peaks, individual and pooled emu liftover ATAC-seq peaks, and previously published chicken ChIP-seq [30] datasets were mapped using bedtools annotate. This bedtools annotate call generated a large data frame with total peak dataset bed intervals, or peaks, as the rows and libraries, peak sets, or targets of interest as the columns. The values within the data frame lie between 0 and 1 and represent the proportion of bases of a given peak interval that is covered by the peak dataset represented in each column. For example, a peak that was found only in the emu flank tissue would have a value of 1 (or nearly 1) in the emu flank peak columns and a value of 0 for all other columns. Finally, these total peak dataset bed intervals were assigned to their nearest gene in the chicken genome using bedtools closest. By assigning each peak to its nearest neighboring gene, it was possible to identify comparative differences across both ATAC- and RNA-seq data.
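The merge and coverage logic described above can be reproduced in a few lines. This sketch is a pure-Python stand-in for what bedtools merge (with -d 1) and bedtools annotate compute, written for illustration rather than taken from the authors' code; intervals are assumed to be half-open (chrom, start, end) tuples, and the chromosome name is hypothetical.

```python
def merge_intervals(intervals, max_gap=1):
    """Merge intervals on the same chromosome that overlap or lie within max_gap bp."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + max_gap:
            merged[-1][2] = max(merged[-1][2], end)  # extend the open interval
        else:
            merged.append([chrom, start, end])
    return [tuple(interval) for interval in merged]


def covered_fraction(interval, peaks):
    """Proportion of bases in `interval` covered by the union of `peaks`."""
    chrom, start, end = interval
    same_chrom = [p for p in peaks if p[0] == chrom]
    covered = 0
    for _, p_start, p_end in merge_intervals(same_chrom, max_gap=0):
        covered += max(0, min(end, p_end) - max(start, p_start))
    return covered / (end - start)


# The worked example from the text: two peaks collapse into one total-set interval.
chicken_forelimb = [("chr1", 100, 250)]
emu_hindlimb_liftover = [("chr1", 249, 500)]
total_set = merge_intervals(chicken_forelimb + emu_hindlimb_liftover)
chicken_cover = covered_fraction(total_set[0], chicken_forelimb)
emu_cover = covered_fraction(total_set[0], emu_hindlimb_liftover)
```

Each row of the resulting table is one merged interval, and each column value is a coverage fraction such as `chicken_cover`, so a peak private to one tissue shows a value near 1 in that column and 0 elsewhere.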

PIQ was implemented for differential footprinting across the ATAC-seq libraries. PIQ utilizes machine learning to determine binding sites for TFs based both on their binding motif (DNA sequence) and their “footprint”, which corresponds to the altered localized DNA cutting in DNase I hypersensitivity and ATAC-seq experiments resulting from steric hindrance surrounding a bound TF. PIQ was run against the chicken and emu genomes for around 975 transcription factors with motifs available from the JASPAR CORE and JASPAR VERTS motif databases. Following this run, binding events were assessed in the emu and chicken in the forelimb, flank, and hindlimb ATAC-seq datasets. Within the homologous Sall-1 enhancer region between chicken and emu, binding events were identified for all TFs and were considered bound if they possessed a positive PIQ score and a PIQ purity value over 0.7.
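The two filters applied to PIQ calls, the per-site threshold (positive score and purity above 0.7) and the presence/absence pattern used to define differential footprints, can be expressed compactly. The sketch below is illustrative only; the dictionary layout, transcription factor names, and score values are hypothetical and do not reflect PIQ's actual output format.

```python
# Pattern sought in the text: bound in both chick limbs and the emu hindlimb,
# absent from both flanks and from the emu forelimb.
TARGET_PATTERN = {
    ("chicken", "hindlimb"): 1,
    ("chicken", "forelimb"): 1,
    ("chicken", "flank"): 0,
    ("emu", "hindlimb"): 1,
    ("emu", "forelimb"): 0,
    ("emu", "flank"): 0,
}


def is_bound(score, purity, min_purity=0.7):
    """A site counts as bound with a positive score and purity above the cutoff."""
    return score > 0 and purity > min_purity


def differential_footprints(piq_calls):
    """Return TFs whose binary footprint pattern matches TARGET_PATTERN.

    piq_calls: {tf_name: {(species, tissue): (score, purity)}}; a missing
    dataset is treated as no footprint.
    """
    selected = []
    for tf_name, per_dataset in piq_calls.items():
        pattern = {
            key: int(is_bound(*per_dataset.get(key, (0.0, 0.0))))
            for key in TARGET_PATTERN
        }
        if pattern == TARGET_PATTERN:
            selected.append(tf_name)
    return sorted(selected)


# Hypothetical calls: TF_A matches the pattern, TF_B is bound everywhere.
example_calls = {
    "TF_A": {
        ("chicken", "hindlimb"): (5.0, 0.90),
        ("chicken", "forelimb"): (4.2, 0.80),
        ("chicken", "flank"): (1.0, 0.50),
        ("emu", "hindlimb"): (3.3, 0.95),
        ("emu", "forelimb"): (2.0, 0.60),
    },
    "TF_B": {key: (6.0, 0.99) for key in TARGET_PATTERN},
}
differential = differential_footprints(example_calls)
```

The binary pattern in `TARGET_PATTERN` is the same 1/1/0/1/0/0 row reported for each factor in Table 1.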

Data and code availability

The datasets and code utilized in this study are available at GEO (accession GSE136776 at https://www.ncbi.nlm.nih.gov/geo/) and on GitHub at https://github.com/philgrayson/young_grayson_CB2019.

KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies
Rabbit polyclonal anti-Laminin Sigma L9393
Goat polyclonal anti-Pax3 R&D Systems AF2457
Rabbit polyclonal anti-GFP Abcam Ab6556
Mouse monoclonal anti-RFP Sigma SAB2702214
Anti-DIG-AP Roche 11093274910
Anti-DIG-POD Sigma-Aldrich 11207733910
Anti-Fluorescein-POD Sigma-Aldrich 11426346910
Bacterial and Virus Strains
Chemicals
Phalloidin ThermoFisher A12381
Trypsin in EDTA Sigma T3924
DMEM Gibco 11960044
Pen Strep Gibco 15240062
FBS Gibco 16000044
IQ SYBR Green Supermix BIO-RAD 1708880
iScript BIO-RAD 1798890
Proteinase K NEB P8107S
Blocking Reagent Millipore-Sigma 11096176001
TSA Plus Cy3 and Fluorescein Perkin-Elmer NEL753001KT
OCT VWR 25608–930
BM-Purple Sigma-Aldrich 11442074001
Critical Commercial Assays
Click-It® EDU Kit ThermoFisher C10338
TruSeq stranded mRNA Kit Illumina RS-122–21001
Nextera DNA Library Preparation Kit Illumina FC-121–1030
Kapa Stranded RNA Hyperprep KAPA Biosystems KR1350
Nextera DNA Library Preparation Kit Illumina FC-131–1002
Deposited Data
Raw and analyzed data This paper GSE136776
Galgal4 reference genome International Chicken Genome Consortium https://www.ncbi.nlm.nih.gov/assembly/GCF_000002315.3/
droNov1 reference genome Harvard University https://www.ncbi.nlm.nih.gov/assembly/GCF_003342905.1/
Experimental Models: Organisms/Strains
White leghorn Chickens Charles River (MA)
Hatching emu eggs Floeck’s Country Farms (NM)
Oligonucleotides
Primers for cloning, qPCR, probe, see supplement This paper N/A
Recombinant DNA
pCAGG-Fgf10-IRES-Venus This paper N/A
β-actin GFP [39,40]
β-actin Sall1-enhancer: GFP This paper N/A
β-actin Sall1(ets mutant)-enhancer: GFP This paper N/A
pCI-H2B-RFP Addgene 92398
Software and Algorithms
STAR [41] https://github.com/alexdobin/STAR
FeatureCounts [42]
DESeq2 [43] https://bioconductor.org/packages/release/bioc/html/DESeq2.html
ImageJ https://imagej.nih.gov/ij/
NGmerge https://github.com/harvardinformatics/NGmerge
Bowtie2 [44] http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Bedtools2 [45] https://bedtools.readthedocs.io/en/latest/
MACS2 https://github.com/taoliu/MACS
GAT [46] https://github.com/AndreasHeger/gat
PIQ [33] http://piq.csail.mit.edu/download.html
Picard https://broadinstitute.github.io/picard/
HAL [47] https://github.com/ComparativeGenomicsToolkit/hal
removeChrom https://github.com/jsh58/harvard/blob/master/removeChrom.py

Highlights.

The emu forelimbs and hindlimbs are induced at the same stages as in the chick.

The emu forelimb develops later than chicks’ due to lower mesenchymal proliferation.

The lower proliferation is due to lower expression of Fgf10 in the emu forelimb.

Fgf10 is lower due to emu-specific differences in regulation of the Fgf10 locus.

Acknowledgements:

The authors wish to thank Yuji Atsuta for technical advice, Gunter Wagner and an anonymous reviewer for helpful comments on the manuscript, and Tim Sackton for helpful discussions on sequencing data analysis. This work was supported by NIAMS/NIH grant 5F32AR067097 to J.J.Y. and NIH grant HD03443 to C.J.T.

Footnotes

Declaration of Interests: The authors claim no competing interests.

Data Availability Statement

The datasets and code utilized in this study are available at GEO (accession TBD at https://www.ncbi.nlm.nih.gov/geo/) and on GitHub at https://github.com/philgrayson/young_grayson_CB2019
