Skip to main content
Springer logoLink to Springer
. 2021 Feb 2;89(1):103–116. doi: 10.1007/s00239-021-09995-z

Structural and Genetic Determinants of Convergence in the Drosophila tRNA Structure–Function Map

Julie Baker Phillips 1,2, David H Ardell 1,3,
PMCID: PMC7884595  PMID: 33528599

Abstract

The evolution of tRNA multigene families remains poorly understood, exhibiting unusual phenomena such as functional conversions of tRNA genes through anticodon shift substitutions. We improved FlyBase tRNA gene annotations from twelve Drosophila species, incorporating previously identified ortholog sets to compare substitution rates across tRNA bodies at single-site and base-pair resolution. All rapidly evolving sites fell within the same metal ion-binding pocket that lies at the interface of the two major stacked helical domains. We applied our tRNA Structure–Function Mapper (tSFM) method independently to each Drosophila species and one outgroup species Musca domestica and found that, although predicted tRNA structure–function maps are generally highly conserved in flies, one tRNA Class-Informative Feature (CIF) within the rapidly evolving ion-binding pocket—Cytosine 17 (C17), ancestrally informative for lysylation identity—independently gained asparaginylation identity and substituted in parallel across tRNAAsn paralogs at least once, possibly multiple times, during evolution of the genus. In D. melanogaster, most tRNALys and tRNAAsn genes are co-arrayed in one large heterologous gene cluster, suggesting that heterologous gene conversion as well as structural similarities of tRNA-binding interfaces in the closely related asparaginyl-tRNA synthetase (AsnRS) and lysyl-tRNA synthetase (LysRS) proteins may have played a role in these changes. A previously identified Asn-to-Lys anticodon shift substitution in D. ananassae may have arisen to compensate for the convergent and parallel gains of C17 in tRNAAsn paralogs in that lineage. Our results underscore the functional and evolutionary relevance of our tRNA structure–function map predictions and illuminate multiple genomic and structural factors contributing to rapid, parallel and compensatory evolution of tRNA multigene families.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00239-021-09995-z.

Keywords: Class-informative feature (CIF), Ion-binding pocket, Parallel substitutions, Convergent evolution, Structure–function map

Introduction

Transfer RNAs (tRNAs) were the first family of RNAs to be directly sequenced (Holley 1965) and the first to be solved by X-ray crystallography (Holley et al. 1965). Historically, algorithms to estimate phylogeny and substitution patterns were tested on tRNA genes (Cedergren et al. 1981; Eigen et al. 1989). Early in the genome era, it was reported that tRNA genes can evolve to switch their functional (codon-reading) identities through anticodon shift substitutions, which entail both synonymous and non-synonymous substitutions in anticodons (Saks et al. 1998). However, the small sizes and high similarities of tRNA genes pose obstacles to inferring their orthology, which is needed to better understand the evolutionary processes underlying functional turnover of tRNA genes. An important step forward came from the “micro-syntenic" approach to infer tRNA gene orthology using flanking sequences, first applied in Drosophila (Rogers et al. 2010) and later to other eukaryotes (Rogers and Griffiths-Jones 2014). These studies revealed that functional turnover of tRNA genes through anticodon shift substitutions is more frequent and widespread than previously known. However, Rogers and Griffiths-Jones (2014) were unable to discern whether anticodon shift substitutions occur more often within or between tRNA families cognate to the two conserved and ancient superfamilies of aminoacyl-tRNA synthetases (aaRSs), called Classes I and II (Eriani et al. 1990). The two superfamilies may be further divided into three sub-classes each (Cusack 1997), all of which pre-date the divergence of bacteria, archaea and eukaryotes, as exemplified by the consistency with which aaRS paralogs may be used to root the statistical Tree of Life (Brown and Doolittle 1995).

More recent advances in ortholog estimation for tRNA genes exploited positional homology and the organization of tRNA genes as repeated elements in tandem gene arrays, revealing both the great extent of functional turnover and an important role for gene conversion in tRNA evolution (Bermudez-Santana et al. 2010; Velandia-Huerto et al. 2016). These advances in ortholog estimation for tRNA genes have made it possible for the first time to undertake a detailed analysis of substitution rates and patterns in tRNA genes, which is one of two requirements to understand how tRNA genes evolve to switch functions. The second requirement is a means to predict the functional significance of tRNA sequence features—what we call the “tRNA structure–function map.”

In earlier work, we developed an approach to estimate tRNA structure–function maps from pooled, structurally aligned tRNA gene complements inferred from one or more related genomes. Our approach integrates sequence information across all tRNA functional gene families at once, using statistics on structure-conditioned functional information (Freyhult et al. 2006). The relevance of our “information criterion” to predict the tRNA structure–function map stems from the biophysics of translation, assuming promiscuous interactions across all species of tRNA-binding proteins and tRNAs co-expressed in the same cellular domain or compartment, with association rates that increase proportionally with concentrations (which we estimate for tRNAs by proxy from gene copy numbers, as in the tRNA Adaptation Index, dos Reis et al. 2004; Sabi et al. 2017) and with aminoacylation probabilities that depend on matching and mis-matching of structural and dynamic (motional) features across all interacting species operating in parallel (Collins-Hed and Ardell 2019). No matter how the phenotypic expression of a given base or base-pair contributes to the classification of tRNAs into substrates and non-substrates by tRNA-binding proteins, whether directly at the binding interface, as a substrate for a critical base modification, or indirectly, by contributing to recognition by base modifying enzymes, through indirect effects on the shape and motion of entire molecules, or via allosteric circuits connecting binding interfaces and active sites in complexes (Sethi et al. 2009), the theoretical expectation under selection for translational accuracy is that the genetic bases of those identifying phenotypic traits will become increasingly restricted to the tRNA functional classes that rely on them for their identities (Collins-Hed and Ardell 2019). We call our predictions Class-Informative Features (CIFs) and visualize them in graphs called function logos (Freyhult et al. 2006).

Other authors have more recently applied information statistics to measure the conservation of features within tRNA gene families analyzed independently of one another, in order to predict tRNA identity elements (Tamaki et al. 2018; Zamudio and José 2018). The first set of authors combined this information with structural and bioinformatic models of tRNA–protein interaction interfaces. Both sets of authors employ a “conservation criterion” to integrate tRNA sequence variation within functional classes, based on the difference HX-HXl|Y=y between an empirical or assumed composition-based background entropy of structural features H(X) and the entropy of structural features H(Xl|Y=y) at a site l in tRNA genes of a given function y. Instead, we employ an “information criterion” that integrates tRNA sequence variation across functional classes, based on the difference H(Y)-HY|Xl=x between an empirical background entropy of tRNA functions H(Y) and the entropy of functions H(Y|Xl=x) among tRNA genes that embody structural feature x at site l. The information criterion has the operational interpretation that tRNA-binding proteins themselves exploit the information derived from CIFs to identify their substrates, while the conservation criterion risks conflating generic tRNA features with class-specific ones, overlooks potential alternative informative feature sets for the same functional identity, and discounts co-evolution of tRNA genes with tRNA-binding protein genes such as aminoacyl-tRNA synthetases. Indeed, using other machine learning approaches, Galili et al. (2016) predicted tRNA identity elements that varied across archaeal phyla. In our own work, we have shown that tRNA Class-Informative Features (CIFs) vary across Bacteria (Ardell and Andersson 2006) and we were then able to use this variation to address a challenging problem in the phylogenetics of Alphaproteobacteria (Amrine et al. 2014). More recently, we showed that our supervised machine learning phyloclassification algorithm CYANO-MLP could robustly classify cyanobacterial genomes and hindcast the cyanobacterial progenitor of plastids based solely on tRNA CIFs and tRNA gene complements (Lawrence et al. 2019).

Even though our tRNA structure–function maps are based on an information criterion rather than a conservation criterion, we recently showed that tRNA CIFs, including Class-Informative Base-Pairs (CIBPs) and Class-Informative Mis-Pairs (CIMPs), are highly conserved within trypanosomes and between trypanosomes and humans, even while showing evidence of co-evolutionary divergence (Kelly et al. 2020). Furthermore, our tRNA CIF annotations could predict differential susceptibility to inhibition by chemical agents of homologous aminoacyl-tRNA synthetases (aaRSs) from trypanosomes and humans (Kelly et al. 2020). We also found that tRNA gene clusters are conserved in trypanosomes and show clear evidence of rapid evolution by duplication, deletion, and rearrangements (Kelly et al. 2020) consistent with findings from genome comparisons of diverse eukaryotic groups (Bermudez-Santana et al. 2010; Velandia-Huerto et al. 2016).

In earlier work, as part of the Drosophila Twelve Genomes Consortium, we contributed tRNA gene annotations to FlyBase (Bergman and Ardell 2014) and an analysis combining data from multiple ortholog sets of microRNA genes to estimate structurally partitioned evolutionary rates over different structurally defined categories of sites and site-pairs (Drosophila 12 Genomes Consortium 2007). In the present work, we extend our molecular evolutionary analysis approach to Drosophila tRNA genes, exploiting their high conservation and structural conformity (Wolfson et al. 2001) to estimate and compare evolutionary rates across different tRNA structural components of all functions at individual single- and paired-sites in tRNA genes from twelve species of Drosophila. As part of this work, we developed a strategy to optimize the exclusion of species to maximize the length of concatenated alignments across tRNA ortholog sets. After fitting models to our optimized alignments with MrBayes (Huelsenbeck and Ronquist 2001; Ronquist et al. 2012), we discovered that one of several metal ion-binding pockets has evolved rapidly in multiple functional classes of Drosophila tRNA genes. Integrating this with tRNA CIF predictions in thirteen species of flies, we found that this rapid evolution is associated with repeated, convergent gains (and possible losses and/or regains) of a tRNA CIF in one tRNA functional gene family, resulting in parallel substitutions in multiple tRNA genes with orthologs on different chromosomes in D. melanogaster. We were able to identify multiple structural and genomic factors that have likely contributed to this convergent evolution of tRNA CIFs in the Drosophila genus. Our results suggest that anticodon shift substitutions may play a compensatory role in evolution of the tRNA structure–function map.

Results

We built a custom database of tRNA genes for the Drosophila twelve genomes based on FlyBase release 2008–07 (Tweedie et al. 2009) downloaded on October 19, 2011, which contained a total of 3494 tRNA genes. On this set, we integrated orthology annotations from Rogers et al. (2010), COVE scores from tRNAscan-SE 1.3.1 (Lowe and Eddy 1997), and initiator tRNA gene predictions from TFAM v1.3 (Ardell and Andersson 2006; Tåquist et al. 2007). We re-folded FlyBase tRNA gene models in tRNAscan-SE and ARAGORN v1.2.34 (Laslett and Canback 2004) as an additional source of functional and structural predictions. Before turning to our main body of results concerning molecular evolution and Class-Informative Features in Drosophila tRNA genes, we briefly note an important discovery that we made regarding a previously described anticodon shift substitution in light of our updated annotations of initiator tRNA genes. Our tRNA gene annotations are provided in Supplementary Data.

D. simulans Contains a Non-Canonical Initiator tRNA Gene Created by an Anticodon Shift Substitution

Initiator tRNAs are a distinct functional class that read start codons only, and ordinarily carry CAU anticodons in common with elongator tRNAMetCAU. The first generation of production tRNA gene-finders, tRNAscan-SE v.1 and Aragorn, erroneously annotate initiator tRNA genes as elongator tRNAMet genes based on predicted anticodons. However, tRNAscan-SE v.2.0 (Chan et al. 2019) and TFAM both use sequence profiles to functionally annotate initiator tRNA genes. We found that TFAM could annotate initiator tRNA gene predictions consistently with both experimentally based annotations in D. melanogaster, and also across species, in that if TFAM annotated any gene in any ortholog set from Rogers et al. (2010) as an initiator tRNA gene, then every other gene in that ortholog set would also be independently annotated by TFAM as an initiator tRNA gene (Bergman and Ardell 2014 and this work). We annotated between five and seven initiator tRNA genes per Drosophila genome, for a total of 70 annotated initiator tRNA genes across all species, of which 66 belonged to the ortholog sets from Rogers et al. (2010).

We found that one alloacceptor anticodon shift substitution (which converts the functional identity of a tRNA from one amino acid to another), previously reported and validated as tRNAMetCAU tRNAThrCGU (or in their notation, CAT:M → CGT:T) in Drosophila simulans (Rogers et al. 2010; Rogers and Griffiths-Jones 2014), in fact represents the evolution of a non-canonical initiator tRNA gene in D. simulans, in the sense that its predicted tRNA product does not contain the methionine anticodon CAU. This non-canonical initiator tRNA gene in D. simulans with FlyBase gene ID gn0256165 and transcript ID tr0296323 evolved as part of ortholog set 183. This ortholog set contains initiator tRNA orthologs in all eight species of Sophophora excluding D. willistoni. The D. simulans ortholog carries only a single AG anticodon shift substitution at Sprinzl position 35 (as described in Sprinzl et al. 1998, Sprinzl coordinates are a standardized coordinate system for the consensus universal secondary structure of tRNAs as well as conserved, more sub-function-specific structures like the long variable arms of tRNALeu and tRNASer). The non-canonical initiator tRNA gene is identical in sequence to all other initiator tRNA genes that we annotated in D. simulans, except for the single anticodon shift substitution. Further analysis is required to explore the functional significance of this gene.

Optimizing Species Exclusion to Maximize Gap-Free Alignment Length of Concatenated tRNA Ortholog Sets

We undertook a global and unbiased analysis of site-specific substitution rates and patterns in Drosophila tRNA genes at single-site and site-pair resolution. To do this, we analyzed a total of 753 orthologous gene sets from Rogers et al. (2010) encompassing 3218 unique FlyBase tRNA transcript IDs, a subset of our re-annotated FlyBase set. We found the representation of species to be uneven across the Rogers et al. (2010) ortholog sets. Only 47 sets, about 6%, contained orthologs from all 12 Drosophila species (Table 1). One species in particular, D. willistoni, was represented in very few sets. We proceeded to remove this species from our subsequent substitution rate analyses. We also removed ortholog sets that contained isoacceptor or alloacceptor anticodon shift substitutions, uncertain functional annotations, or predicted pseudogenes.

Table 1.

Statistics on alignments of concatenated tRNA ortholog sets from Drosophila species subsets

Species subset Species Ortho. setsa Sites (gap-free) Var. sites (partitions)b P.I. sites (partitions)c
101111110000d 7 147 10,878 (10,598) 2729 (68) 27 (15)
110011100111e 8 85 6290 (6134) 1589 (67) 19 (15)
111111110111f 11 83 6142 (5991) 1554 (67) 20 (15)
111111111111 12 47 3478 (3389) 896 (58) 16 (12)

aNumber of complete tRNA ortholog sets (Rogers et al. (2010)) from that species subset

bNumber of variable gap-free sites (or variable site-partitions excluding gap-free sites, of 74) over concatenated alignment

cNumber of Parsimoniously Informative gap-free sites (or partitions excluding gap-free sites, of 74) over concatenated alignment

dSophophora excluding D. simulans and D. willistoni

eAll species excluding D. sechellia, D. yakuba, D. persimilis, and D. willistoni

fAll species excluding D. willistoni

We wished to integrate data from the largest number of ortholog sets that we could in order to compare substitution rates across sites in an unbiased manner. However, we also wished to avoid having a large number of missing or unidentified orthologs in the concatenated alignments that we made from Rogers et al. (2010) ortholog sets, so as to reduce error arising from missing data in our results associated with large blocks of gap characters. We therefore computed statistics over 562 =117+118+119+1110+1111 automatically generated concatenated gene alignments corresponding to 562 subsets of genome assemblies from seven or more of the eleven Drosophila species excluding D. willistoni, in order to find the one species subset that resulted in concatenated alignments of the most ortholog sets and the greatest gap-free length. To make these concatenated alignments for each species subset, first we structurally aligned all tRNA genes together, and then we extracted aligned tRNA gene sequences for every ortholog set containing sequences from at least all the species belonging to the defined species subset. We concatenated these extracted ortholog set alignments together and mapped all sites in this concatenated alignment into separate site-partitions corresponding to each Sprinzl coordinate.

We found that a single species subset, labeled as “101111110000” in Table 1 and representing Sophophora excluding D. simulans and D. willistoni, yielded the longest gap-free alignment concatenating the most ortholog sets, the greatest numbers of variable and parsimoniously informative sites, and the greatest average number of pairwise differences per site among all alignments we examined. We proceeded to focus on this alignment for downstream analysis as well as two other ones for the sake of comparison: a randomly picked one with a nearly complementary pattern of species exclusion (“110011100111,” excluding D. sechellia, D. yakuba, D. persimilis, and D. willistoni), and one with all but one species (“111111110111,” excluding D. willistoni).

Substitution Rates and Patterns in Drosophila tRNA Genes

To compute site-specific substitution rates from our concatenated alignments of tRNA ortholog sets, we compared substitution rates per site or per site-pair across different site-partitions of our concatenated alignments corresponding either to tRNA secondary structural elements or to individual sites defined by Sprinzl structurally standardized coordinates (Sprinzl and Vassilenko 2005), using MrBayes 3.2.1 (Huelsenbeck and Ronquist 2001; Ronquist et al. 2012), aggregating over different functional classes of tRNA gene orthologs and using the fixed known species tree (Drosophila 12 Genomes Consortium 2007). We fitted unpaired sites and loops with the General Time Reversible model (Lanave et al. 1984; Tavaré 1986; Rodrıǵuez et al. 1990) with invariant sites (GTR + I), and paired-sites and stems with the Doublet(GTR) + I model, also with invariant sites (Ronquist et al. 2012). We obtained very similar results fitting the (GTR + I + Gamma) model (data provided in Supplementary Materials) or using the Hasegawa–Kishino–Yano (HKY) model (Hasegawa et al. 1985) with or without Gamma-distributed site-rate variation (Phillips 2003 and Figures S1 and S2 in Supplementary Online Materials). We ran MrBayes with the option “ratemult = scaled,” which implies that all rates reported and shown in Figs. 1 and 2 are scaled so that the mean rate across all site-partitions is 1.0 substitutions per site or site-pair.

Fig. 1.

Fig. 1

95% Credible intervals, interquartile ranges, and medians of relative substitution rates by secondary structural elements in Drosophila tRNA genes as calculated in MrBayes 3.2.1 using the GTR + I/Doublet(GTR + I) models for loops/stems and the partitioned concatenated alignments indicated by bit-sets as defined in Table 1

Fig. 2.

Fig. 2

95% Credible intervals, interquartile ranges, and medians of relative substitution rates by individual site/Sprinzl coordinate in Drosophila tRNA genes as calculated in MrBayes 3.2.1 using the GTR + I model for individual sites and the partitioned concatenated alignments indicated by bit-sets as defined in Table 1

In our coarse-grained analysis of substitution rates by secondary structural elements, we observed that, regardless of which alignments we used, the largest substitution rates occur in the D-loop and acceptor stem followed closely by the T-loop (Fig. 1). Considering the relative constraint that acceptor stems are expected to have from identity-driven interactions with proteins (Giegé et al. 1998), the acceptor stem shows a surprisingly high rate of evolution. Furthermore, regardless of which alignments or models we used, in our analysis of substitution rates by individual sites, we found that three sites—corresponding to Sprinzl coordinates 16, 17, and 60—showed markedly higher rates of substitution than any other, followed by sites 15, 45, and 59 (Fig. 2).

Sprinzl coordinates 16, 17, and 60 are three among eight sites that form an extended ion-binding pocket (Sprinzl coordinates 15, 16, 17, 18, 19, 20, 59, and 60) between the D- and T-arms at the interface of the two tRNA helical domains (Behlen et al. 1990), as shown in Fig. 3. This pocket was first described as site 1 in the original orthorhombic crystal form (Holbrook et al. 1977) and site 3 in the monoclinic structure (Jack et al. 1977) and was later shown to potentially bind different metal ions including magnesium, cobalt, manganese, and lead (Jack et al. 1977; Holbrook et al. 1977; Behlen et al. 1990; Shi and Moore 2000). More recent structural work better resolved the structure of Sprinzl site 16, which is a fairly conserved uracil among eukaryotes (Marck and Grosjean 2002), most likely post-translationally modified to dihydrouridine, and demonstrated the conformational sensitivity of this to ionic conditions (Shi and Moore 2000). The hypergeometric probability that three sites with elevated substitution rates occur within the eight sites that form the extended ion-binding pocket, out of 74 possible sites, is 83660743=8.6×10-4.

Fig. 3.

Fig. 3

Rapidly evolving sites in Drosophila tRNAs are part of a conserved, structurally plastic ion-binding pocket near the interface of the two tRNA helical domains, indicated by colored nucleotides and labeled in the inset panel at higher magnification. This image visualizes PDB structure accession ID 1EHZ (Shi and Moore 2000) and was rendered in VMD 1.9.2 (Humphrey et al. 1996)

In the alignment with the most species (111111110111), we found that 45 ortholog groups spanning 14 functional classes carry substitutions in sites 16, 17, and 60 of this ion-binding pocket, as shown in Table S1 of Supplementary Materials.

Chromosomal gene location was significantly associated with substitutions in pocket sites 16, 17, and 60 (X2=56.3266, N=692, p=6.96×10-11), and extended pocket sites 15, 16, 17, 18, 19, 20, 59, and 60 (X2=87.4858, N=726, p<2.2×10-16), Supplementary Materials, Table S2. Substitutions in sites 16, 17, or 60 of the pocket were not associated with alloacceptor class switches overall (X2=20.9616, N=672, p=0.523), nor were substitutions in the extended pocket sites 15, 16, 17, 18, 19, 20, 59, and 60 (X2=23.2573, N=703, p=0.387). Ortholog sets containing anticodon shift substitutions more often exhibited substitutions in pocket sites 16,17, and 60 (X2=8.8797, N=713, p=0.00288), but not extended pocket sites 15, 16, 17, 18, 19, 20, 59, and 60 (X2=2.7304, N=696, p=0.0985). We provide estimates of transition and transversion rates for Drosophila tRNA genes in Supplementary Materials Table S3.

Class-Informative Base-Pairs and Base Mis-Pairs in Drosophila melanogaster tRNAs

As described in Kelly et al. (2020), we reimplemented our previously published algorithm to estimate structure-conditioned functional information statistics from Freyhult et al. (2006) extending it to compute functional information for all sixteen possible base-pair or base mis-pair features occurring in any paired-sites of the consensus secondary structure of tRNAs. The program accepts as input a set of multiple alignments, one for each functional sub-family of any RNA or protein multigene family, all mutually structurally aligned, and computes as its output function logo visualizations and tables of statistics on CIFs and their evolution in one or more taxa. The software, called tSFM (“tRNA Structure–Function Mapper") v1.0.1 despite being applicable to any RNA or protein family, uses a permutation approach to measure the significance of CIFs and CIF evolution dissimilarities and corrects for multiple comparisons by controlling Family-Wise Error Rates (FWERs) or False Discovery Rates, such as that of Benjamini and Hochberg (1995). tSFM v.1.0.1 and later versions are freely available for download on GitHub at https://github.com/tlawrence3/tSFM.

We computed tRNA Class-Informative Base-Pairs and Mis-Pairs from structurally aligned Drosophila melanogaster tRNA genes filtered from our custom gene annotation, with tRNAscan-SE 1.3.1 COVE scores of at least 50 bits (tRNAscan-SE 1.3.1 COVE scores estimate the log-likelihood that a given sequence conforms to the consensus primary and secondary structure of tRNAs in general, as opposed to a random sequence of the same composition), removing tRNA genes of indeterminate function and selenocysteine tRNA genes, and leaving 288 tRNA genes remaining. We then computed tRNA Class-Informative Base-Pairs and Base Mis-Pairs from these data using tSFM v.1.0.1, and only retained features with a Benjamini–Hochberg False Discovery Rate of 5%, as shown in Fig. 4. The total height of a stack of letters at any site quantifies the information potentially gained about the functional type of a tRNA by a tRNA-binding protein if it recognizes the specific type of paired-site feature corresponding to that location. The letters within each stack symbolize functional types of tRNAs, wherein IUPAC one-letter amino acid codes represent elongator tRNA aminoacylation identities and “X” symbolizes initiator tRNAs. The relative heights of letters within each stack quantify the over-representation of tRNA functional types carrying that feature relative to the background frequency determined by gene frequencies of functional types (as calculated through the normalized log-odds). Thus, for example a U1:A72 base-pair at Sprinzl coordinates 1 and 72 (or some modification that biosynthetically depends on that base-pair) is informative for tRNAAsp and tRNAGlu in D. melanogaster.

Fig. 4.

Fig. 4

Class-Informative Base-Pairs and Base Mis-Pairs from 288 re-annotated tRNA genes in Drosophila melanogaster with COVE scores of at least 50 bits as computed in tSFM v1.0.1 with a Benjamini–Hochberg False Discovery Rate of 5% (CIFs not meeting this significance threshold are not shown). The total height of a stack of letters quantifies information gained about the functional type of a tRNA by a tRNA-binding protein if it specifically recognizes the paired features indicated. The letters within each stack symbolize functional types of tRNA genes, wherein IUPAC one-letter amino acid codes represent elongator tRNA aminoacylation identities and “X” symbolizes initiator tRNAs. The relative heights of letters within each stack quantify the over-representation of tRNA functional types carrying that feature relative to the background frequency determined by the frequencies of genes of various functional types (as calculated through normalized log-odds)

Figure 4 shows that a great deal of functional information resides in Base-Pairs and Base Mis-Pairs of Drosophila melanogaster tRNAs. Several base mis-pairs are functionally informative, including U13:U22 in the D-arm, which is associated to both tRNAPro and tRNAVal (contained in 32 tRNA genes of 32 total), U31:U39 in the C-arm contained in all six tRNAMet genes, C50:U64 in the T-arm contained in nine of 11 tRNAIle genes, G13:A22 in the D-arm contained in all 42 tRNASer and tRNALeu genes, and a G4:G69 mis-pair in the acceptor stem associated with nine of 16 tRNAThr genes. We show a plot of the significance of paired-site CIFs as a function of CIF information in Supplementary Fig. S3. A text-file containing all statistics for paired-site CIFs in D. melanogaster, including frequencies of CIFs in genes of various functions, as well as p-values and FDRs, is provided in the Supplementary Code and Data.

Conservation and Convergence of Class-Informative Features in Drosophila tRNA Genes

The wealth of information we have on Drosophila evolution provides an unprecedented opportunity to understand how tRNA Class-Informative Features evolve. To examine the conservation of tRNA CIFs in Drosophila, we computed tRNA CIFs in each species independently in the same way as we did for D. melanogaster. In order to polarize evolutionary changes in tRNA CIFs, we additionally analyzed tRNA gene annotations and CIFs in the outgroup species Musca domestica, which diverged from Drosophila between 20 and 80 million years ago (Wiegmann et al. 2003). As shown in Fig. 5, tRNA single-site CIFs are very highly conserved in Drosophila and flies generally. Although this figure is designed to emphasize the similarity of CIF estimates across species, separate detailed logo figures for each species as well as combined logos for base-paired features are provided in the Supplementary Online Materials and confirm strong conservation of both Class-Informative Base-Pairs and Class-Informative Base Mis-Pairs in the Drosophila genus. Broadly speaking, our predictions are very nearly identical in all fly genomes we analyzed, with one visibly clear exception standing out: C17, which occurs within the metal ion-binding pocket that we previously identified as having elevated substitution rates.

Fig. 5.

Fig. 5

Conservation of tRNA single-site CIFs in Drosophila with Musca domestica as an outgroup showing extensive conservation of tRNA CIFs in Drosophila. Tree topology from Drosophila 12 Genomes Consortium (2007), with divergence dates from Tamura et al. (2003) and Hennig et al. (1981)

As shown in Fig. 6, CIF C17 is ancestrally associated with tRNALys in flies (as judged by parsimony from outgroup polarization) but it became associated with all or nearly all tRNAAsn paralogs in at least three phylogenetically distinct Drosophila lineages, one of which, leading to D. ananassae, also underwent the Asn-to-Lys anticodon shift previously reported in Rogers et al. (2010). One interpretation of the evolution of site 17 in tRNAAsn genes is that of multiple gains of C17, but losses and regains cannot be excluded. To underscore the validity and biological significance of our CIF evolution results, we emphasize that all of the 138 tRNAAsn genes analyzed in Fig. 6 have COVE scores of 50 bits are greater—meaning they are above a typical threshold used for inclusion in tRNA gene annotation gene sets—and in fact, all but two of them have scores above 70 bits—consistent with well-folding, typical tRNA sequences—while all but six have scores above 80 bits (for context, the maximum COVE bit-score over our entire annotation set is 87.38 bits). In addition, 86 of 111 FlyBase tRNAAsn genes belong to the Rogers et al. ortholog sets. For comparison’s sake, ten of 111 FlyBase records for tRNAAsn genes have COVE scores below 50 bits, of which two belong to the Rogers et al. ortholog sets. Of the 29 genes that substituted to C17 that belong to the Rogers et al. ortholog sets, 15 are considered “core genes” in the sense of Rogers et al. 2010. All of them have predicted Asn anticodons and were annotated as tRNAAsn genes by TFAM. In Fig. 7, we show the raw alignment data for tRNAAsn and tRNALys genes in D. ananassae and tRNAAsn genes in D. melanogaster, demonstrating their extreme consistency in sequence with very little variation outside site 17 and the anticodon shift substitution in D. ananassae. While our interpretation rests on the assumption that we have correctly annotated the function of these tRNAAsn genes and that their functions as such have been conserved during their evolution, overall, we believe that our evidence is strong that tRNAAsn genes have undergone repeated parallel evolution in Drosophila.

Fig. 6.

Fig. 6

Evolution of functional association of CIF C17 in flies. Cytosine single-site function logos are from Fig. 5, the tree topology is from Drosophila 12 Genomes Consortium (2007), and divergence dates are from Tamura et al. (2003) and Hennig et al. (1981). The three clades in which C17 became associated with tRNAAsn genes are indicated by purple boxes on the tree, one of which (D. ananassae) also underwent the Asn-to-Lys anticodon shift previously reported in Rogers, Bergman et al. (2010) as indicated in top purple box at figure left. The stacked bar graph at figure right shows the frequencies of U17 or C17 across all tRNAAsn genes in each genome showing that C17 was gained and/or lost in parallel and regained across all or mostly all tRNAAsn genes at least three times during evolution of the Drosophila genus

Fig. 7.

Fig. 7

Structural alignments, visualized in SEAVIEW (Gouy et al. 2010) of all annotated tRNALys and tRNAAsn genes in D. ananassae and all annotated tRNAAsn genes in D. melanogaster, highlighting sites corresponding to anticodons and site 17, as well as the Asn-to-Lys anticodon shift substitution gene studied in Rogers et al. (2010)

In Fig. S4, also in Supplementary Materials, we show that the relative average functional information carried by nucleobases is inversely related to their compositional frequency in D. melanogaster tRNA genes (even though the measure does not depend on the compositional entropies or frequencies of bases themselves). This means that rare bases tend to be more informative for function in Drosophila tRNAs. The relative frequency of bases in D. melanogaster tRNA genes decreases from G to C to U to A. Their functional information (averaged over all sites and functions) in D. melanogaster tRNA CIFs increases correspondingly.

Discussion

In their analysis of tRNA gene ortholog sets from flies, primates, and other groups, Rogers and Griffiths-Jones (2014) reported no evidence of preferential anticodon shift substitutions within or between genes for functional classes of tRNAs charged by either of the two aminoacyl-tRNA synthetase superfamilies, Class I or Class II. However, they did report that an Asn-to-Lys alloacceptor anticodon shift substitution was one of only two types of anticodon shift substitution that occurred more than once out of a total of 30 types in their dataset, once in Drosophila and twice independently in primates. In this work, we show that in flies, the previously reported Asn-to-Lys anticodon shift substitution co-occurs in one lineage with recruitment of CIF C17 to tRNAAsn genes from tRNALys genes where it is ancestral, that this co-option of CIF C17 from tRNALys to tRNAAsn was multiply gained and/or lost and regained at least three times in Drosophila evolution, and that when it was gained (and/or lost and regained), it changed through parallel substitutions across all or mostly all tRNAAsn genes. We note that AsnRS and LysRS are both Class IIb synthetases, suggesting that perhaps co-option of tRNA CIFs occurs more frequently within aaRS sub-classes, perhaps because aaRSs of the same sub-class have more similar binding interfaces on tRNAs (Giegé et al. 2008). In our recently published theory for the evolution of aaRS–tRNA interaction networks, tRNA mutations potentially influence interactions with multiple aaRSs when they occur inside shared interfaces (Collins-Hed and Ardell 2019). The pattern we report here is consistent with an interpretation that the Asn-to-Lys anticodon shift substitution evolved to compensate the co-option of C17 from Lys to Asn in the D. ananassae lineage. Of course, our interpretations rest generally on the correctness of our functional and structural annotations and the genome assembly and sequence data that underlie them.

Comparing the genomic locations of tRNALys and tRNAAsn genes from the FlyBase data and visualizing them in the UCSC Genome Browser (Kent et al. 2002) for Drosophila melanogaster, we found that eight of ten tRNAAsn genes co-occur in a heterologous gene array along with eight tRNALys genes in both direct and inverted orientations within approximately 45 Kb on Chromosome 2R, far away from two additional tRNAAsn gene singletons, one on 2R and another on 3R. A detailed look at our integrated annotations (provided in the Supplementary Materials) reveals that ortholog sets of tRNAAsn genes that underwent evolution in site 17 contain D. melanogaster genes both inside and outside of this heterologous gene array. Specifically, the Rogers et al. (2010) ortholog sets of tRNAAsn genes that intersect the D. melanogaster heterologous gene array are: sets 46 and 61, containing genes from twelve (respectively 11) species, including from all four species whose genes underwent substitution to C17; ortholog set 62 with genes from seven species including D. ananassae, D. willistoni, and D. mojavensis with substitutions to C17; ortholog set 47 with genes from nine species including D. ananassae, D. willistoni, and D. virilis with C17; ortholog set 48 with genes from seven species including D. ananassae and D. willistoni, with C17; ortholog sets 50 and 52 which contain five (respectively three) genes including one from D. ananassae with C17; and ortholog set 51 with four genes, one from D. ananassae with U17. Outside of the heterologous gene array, ortholog set 138, whose D. melanogaster ortholog resides elsewhere on Chromosome 2R, contains eleven genes with three genes from D. ananassae, D. willistoni, and D. virilis that substituted to C17, while ortholog set 205, whose D. melanogaster ortholog resides on Chromosome 3R, contains genes from all twelve species, including from all four species whose genes underwent substitution to C17.

While a renewed tRNA gene re-annotation effort with long-read assembly data is needed to better understand the causes of the changes we have reported here, the genome location data in D. melanogaster are consistent with a scenario in which gene conversion by illegitimate recombination may have played a role in the convergent and parallel evolution of the tRNA structure–function map in Drosophila. Furthermore, if the organization of tRNAAsn genes in D. melanogaster are broadly representative across species, a simple scenario of concerted gene evolution (Nei and Rooney 2005) is insufficient to explain the parallel and repeated substitutions of C17 (and possibly also U17) in tRNAAsn genes that we observed. Rapid evolution of tRNA gene arrays through rearrangements appears to be universal in eukaryotes (Velandia-Huerto et al. 2016) and at least some prokaryotes (Tremblay-Savard et al. 2015).

In summary, interactions across at least three different levels of biological structure appear to have contributed to a specific and recurring pattern of rapid, parallel evolution and functional turnover of tRNA genes in Drosophila: of physically interacting sites within tRNA tertiary structure, of co-clustered tRNA genes within genomes, and of overlapping interfaces for structurally similar and closely related tRNA-binding proteins. Further work is needed to generalize the observations we have made and to discern their causes.

Methods

Data, Ortholog Sets, and Alignments

We obtained tRNA gene sequences and annotations for 12 species of Drosophila from FlyBase (2008_07 release, McQuilton et al. 2012) on October 16, 2011. For further functional re-annotation, we re-folded and re-annotated these tRNA sequences using tRNAscan-SE 1.3.1 (Lowe and Eddy 1997) and ARAGORN 1.2.34 (Laslett and Canback 2004). The genome for Musca domestica (Scott et al. 2014) was downloaded from NBCI on October 8, 2013. The genome was annotated using predictions from tRNAscan-SE 1.3.1 (Lowe and Eddy 1997).

We identified Initiator tRNA genes using TFAM 1.3 (Tåquist et al. 2007). We aligned all tRNAs using Infernal 1.1 (Nawrocki et al. 2009) with the RFAM covariance model for tRNAs (RF00005) (Burge et al. 2012). We edited alignments manually using Seaview 4.3.4 (Gouy et al. 2010) to produce a final alignment of length 74, and mapped each site manually to Sprinzl coordinates (Sprinzl and Vassilenko 2005). We verified our coordinate mapping using tRNAdb (Jühling et al. 2009). We retained Sprinzl coordinate 20A and removed the majority of the variable arm, except Sprinzl coordinates 45 through 49, for subsequent analysis.

We downloaded ortholog sets of tRNA genes on October 18, 2011 from supplementary data available at http://gbe.oxfordjournals.org/content/2/467/suppl/DC1 (Rogers et al. 2010). To optimize ortholog sets, we calculated lengths of pruned species trees using the Bio::TreeIO module in BioPerl 1.4.0 (Stajich et al. 2002) and calculated numbers of variable and parsimoniously informative sites of alignments, as well as average fraction of pairwise differences, using the Bio::PopGen modules (Stajich and Hahn 2005) in BioPerl 1.4.0.

Analysis of Substitution Rates

All subsets of data were curated into concatenated alignments by previously published ortholog sets (Rogers et al. 2010). Ortholog sets with anticodon shift substitutions, genes of indeterminate function, or pseudogenes were removed from all analyses of substitution rates.

We estimated substitution rates with MrBayes 3.2.1 (Huelsenbeck and Ronquist 2001; Ronquist et al. 2012) using the fixed known species tree (Drosophila 12 Genomes Consortium 2007). For all runs we constrained change to the tree topology by setting rates of stochastic TBR and branch multipliers to zero probability. All Bayesian analyses were run with two simultaneous chains for 4×106 iterations, monitoring convergence of split frequency standard deviations and saved parameters every 500 iterations, and with option “ratemult = scaled.”

We computed evolutionary rates by assigning sites or site-pairs into corresponding data partitions. For structural data partitions, we considered nine structural tRNA components: acceptor stem (Sprinzl sites in Drosophila: 1–7, 67–73), D-stem (10–13, 23–26), D-loop (14–22), anticodon stem (28–32, 40–44), anticodon loop (33–39), variable arm (45–49), T-stem (50–54, 62–66), T-loop (55–61), and “other sites" (8, 9, 27, 74). Sites in “other" are not involved in base-pairs or considered part of loop structures in tRNA. We estimated rates using the General Time Reversible (GTR + I) substitution model (Lanave et al. 1984; Tavaré 1986; Rodrıǵuez et al. 1990) with Invariant sites and the Doublet(GTR) + I model with Invariant sites (Ronquist et al. 2012), allowing stationary state frequencies and all other substitution model parameters to be independent across site-partitions. Substitution rate multipliers for partitions were scaled to have an average rate of one substitution per site/site-pair over all partitions. Alignment data and MrBayes initialization scripts in NEXUS format with corresponding alignments are provided in supplementary data. The first 25% of parameter calculations were discarded as burn-in for statistical analysis. Parameter posterior probabilities were imported to R (R Core Team 2013). We used the coda package (Plummer et al. 2006) for Markov Chain Monte Carlo simulations diagnostics and the lattice package (Sarkar 2008) for multivariate analysis. All data from MCMC simulations are presented with 95% Bayesian credible intervals.

Detection of Class-Informative Features

For analysis of tRNA CIFs, we removed genes with anticodon shift substitutions, genes of indeterminate function, and pseudogenes and we further filtered genes to have tRNAscan-SE COVE scores (Lowe and Eddy 1997) of at least 50 bits. We identified Class-Informative features (CIFs) for 12 species of Drosophila using tSFM v1.0, available from https://github.com/tlawrence3/tSFM, using the Nemenman–Shafee–Bialek (NSB) Bayesian entropy estimator (Nemenman et al. 2002) for features in two or more sequences and an exact estimator (Schneider et al. 1986) otherwise. To compute the significance of paired-site CIFs in D. melanogaster, we computed permutation p-values (permuting functional class assignments over sequences that contain a CIF) for the total information of paired-sites only, and then computed Benjamini–Hochberg False Discovery Rates (Benjamini and Hochberg 1995) from those p-values.

Supplementary Information

Below is the link to the electronic supplementary material.

Acknowledgements

The authors thank Carolin Frank, Michael Cleary, Michael Colvin, Josh Phillips, David Pollock, David Liberles, and three reviewers for helpful comments and discussions. DHA was supported by a National Science Foundation grant (INSPIRE-1344279), a National Institutes of Health grant (AI127582), and a Julius Kuhn Guest Professorship at Martin Luther University in Halle-Wittenberg, United States. Computational research in this report was performed on the MERCED HPC cluster supported by the National Science Foundation, United States (ACI-1429783).

Author Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by JBP and DA. The first draft of the manuscript was written by JBP and DA. All authors read and approved the final manuscript.

Funding

DHA was supported by a National Science Foundation Grant (INSPIRE-1344279), a National Institutes of Health Grant (AI127582), and a Julius Kuhn Guest Professorship at Martin Luther University in Halle-Wittenberg, United States. Computational research in this report was performed on the MERCED HPC cluster supported by the National Science Foundation, United States (ACI-1429783).

Data Availability

All code and data required to produce results in this work have been deposited to FigShare and freely available at https://doi.org/10.6084/m9.figshare.12713705.v1.

Compliance with Ethical Standards

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article.

References

  1. Amrine KCH, Swingley WD, Ardell DH. tRNA signatures reveal a polyphyletic origin of SAR11 strains among alphaproteobacteria. PLoS Comput Biol. 2014;10:e1003454. doi: 10.1371/journal.pcbi.1003454. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Ardell DH, Andersson SGE. TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase. Nucleic Acids Res. 2006;34:893–904. doi: 10.1093/nar/gkj449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Behlen LS, Sampson JR, DiRenzo AB, Uhlenbeck OC. Lead-catalyzed cleavage of yeast tRNAPhe mutants. Biochemistry. 1990;29:2515–2523. doi: 10.1021/bi00462a013. [DOI] [PubMed] [Google Scholar]
  4. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc. 1995;57:289–300. [Google Scholar]
  5. Bergman C, Ardell DH. Methods for nuclear tRNA gene predictions for 12 species in the genus Drosophila. FigShare: Journal contribution; 2014. [Google Scholar]
  6. Bermudez-Santana C, Attolini CSO, Kirsten T, et al. Genomic organization of eukaryotic tRNAs. BMC Genomics. 2010;11:270. doi: 10.1186/1471-2164-11-270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Brown JR, Doolittle WF. Root of the universal tree of life based on ancient aminoacyl-tRNA synthetase gene duplications. Proc Natl Acad Sci USA. 1995;92:2441–2445. doi: 10.1073/pnas.92.7.2441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, Eddy SR, Gardner PP, Bateman A. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2012;41:D226–D232. doi: 10.1093/nar/gks1005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Cedergren RJ, Sankoff D, LaRue B, Grosjean H. The evolving tRNA molecule. CRC Crit Rev Biochem. 1981;11:35–104. doi: 10.3109/10409238109108699. [DOI] [PubMed] [Google Scholar]
  10. Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 20: Improved detection and functional classification of transfer RNA genes. bioRxiv. 2019 doi: 10.1101/614032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Collins-Hed AI, Ardell DH. Match fitness landscapes for macromolecular interaction networks: Selection for translational accuracy and rate can displace tRNA-binding interfaces of non-cognate aminoacyl-tRNA synthetases. Theor Popul Biol. 2019;129:68–80. doi: 10.1016/j.tpb.2019.03.007. [DOI] [PubMed] [Google Scholar]
  12. Cusack S. Aminoacyl-tRNA synthetases. Curr Opin Struct Biol. 1997;7:881–889. doi: 10.1016/s0959-440x(97)80161-3. [DOI] [PubMed] [Google Scholar]
  13. dos Reis M, Savva R, Wernisch L. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 2004;32:5036–5044. doi: 10.1093/nar/gkh834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Drosophila 12 Genomes Consortium Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–218. doi: 10.1038/nature06341. [DOI] [PubMed] [Google Scholar]
  15. Eigen M, Lindemann BF, Tietze M, Winkler-Oswatitsch R, Dress A, von Haeseler A. How old is the genetic code? Statistical geometry of tRNA provides an answer. Science. 1989;244:673–679. doi: 10.1126/science.2497522. [DOI] [PubMed] [Google Scholar]
  16. Eriani G, Delarue M, Poch O, Gangloff J, Moras D. Partition of tRNA synthetases into two classes based on mutually exclusive sets of sequence motifs. Nature. 1990;347:203–206. doi: 10.1038/347203a0. [DOI] [PubMed] [Google Scholar]
  17. Freyhult E, Moulton V, Ardell DH. Visualizing bacterial tRNA identity determinants and antideterminants using function logos and inverse function logos. Nucleic Acids Res. 2006;34:905–916. doi: 10.1093/nar/gkj478. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Galili T, Gingold H, Shaul S, Benjamini Y. Identifying the ligated amino acid of archaeal tRNAs based on positions outside the anticodon. RNA. 2016;22:1477–1491. doi: 10.1261/rna.053777.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Giegé R, Sissler M, Florentz C. Universal rules and idiosyncratic features in tRNA identity. Nucleic Acids Res. 1998;26:5017. doi: 10.1093/nar/26.22.5017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Giegé R, Touzé E, Lorber B, Théobald-Dietrich A, Sauter C. Crystallogenesis trends of free and liganded aminoacyl-tRNA synthetases. Cryst Growth Des. 2008;8:4297–4306. [Google Scholar]
  21. Gouy M, Guindon S, Gascuel O. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol. 2010;27:221–224. doi: 10.1093/molbev/msp259. [DOI] [PubMed] [Google Scholar]
  22. Hasegawa M, Kishino H, Yano T. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol. 1985;22:160–174. doi: 10.1007/BF02101694. [DOI] [PubMed] [Google Scholar]
  23. Hennig W, et al. Insect Phylogeny. Hoboken: Wiley; 1981. [Google Scholar]
  24. Holbrook SR, Sussman JL, Warrant RW, Church GM, Kim SH. RNA-ligand interactions:(I) magnesium binding sites in yeast tRNAPhe. Nucleic Acids Res. 1977;4:2811–2820. doi: 10.1093/nar/4.8.2811. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Holley RW. Structure of an alanine transfer ribonucleic acid. J Am Med Assoc. 1965;194:868–871. [PubMed] [Google Scholar]
  26. Holley RW, Apgar J, Everett GA, Madison JT, Marquisee M, Merrill SH, Penswick JR, Zamir A. Structure of a ribonucleic acid. Science. 1965;147:1462–1465. doi: 10.1126/science.147.3664.1462. [DOI] [PubMed] [Google Scholar]
  27. Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001;17:754–755. doi: 10.1093/bioinformatics/17.8.754. [DOI] [PubMed] [Google Scholar]
  28. Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph. 1996;14:33-8-27-8. doi: 10.1016/0263-7855(96)00018-5. [DOI] [PubMed] [Google Scholar]
  29. Jack A, Ladner JE, Rhodes D, Brown RS, Klug A. A crystallographic study of metal-binding to yeast phenylalanine transfer RNA. J Mol Biol. 1977;111:315–328. doi: 10.1016/s0022-2836(77)80054-5. [DOI] [PubMed] [Google Scholar]
  30. Jühling F, Mörl M, Hartmann RK, Sprinzl M, Stadler PF, Pütz J. tRNAdb 2009: compilation of tRNA sequences and tRNA genes. Nucleic Acids Res. 2009;37:D159–D162. doi: 10.1093/nar/gkn772. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Kelly P, Hadi-Nezhad F, Liu DY, Lawrence TJ, Linington RG, Ibba M, Ardell DH. Targeting tRNA-synthetase interactions towards novel therapeutic discovery against eukaryotic pathogens. PLoS Negl Trop Dis. 2020;14:e0007983. doi: 10.1371/journal.pntd.0007983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The Human Genome Browser at UCSC. Genome Res. 2002;12:996–1006. doi: 10.1101/gr.229102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Lanave C, Preparata G, Saccone C, Serio G. A new method for calculating evolutionary substitution rates. J Mol Evol. 1984;20:86–93. doi: 10.1007/BF02101990. [DOI] [PubMed] [Google Scholar]
  34. Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32:11–16. doi: 10.1093/nar/gkh152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Lawrence TJ, Amrine KC, Swingley WD, Ardell DH. tRNA functional signatures classify plastids as late-branching cyanobacteria. BMC Evol Biol. 2019;19:224. doi: 10.1186/s12862-019-1552-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–964. doi: 10.1093/nar/25.5.955. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Marck C, Grosjean H. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA. 2002;8(10):1189–1232. doi: 10.1017/s1355838202022021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. McQuilton P, St Pierre SE, Thurmond J, et al. FlyBase 101– the basics of navigating FlyBase. Nucleic Acids Res. 2012;40:D706–D714. doi: 10.1093/nar/gkr1030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: inference of RNA alignments. Bioinformatics. 2009;25:1335–1337. doi: 10.1093/bioinformatics/btp157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Nei M, Rooney AP. Concerted and birth-and-death evolution of multigene families. Annu Rev Genet. 2005;39:121–152. doi: 10.1146/annurev.genet.39.073003.112240. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Nemenman I, Shafee F, Bialek W (2002) Entropy and inference, revisited. English (US). In: Advances in Neural Information Processing Systems 14 - Proceedings of the 2001 Conference, NIPS 2001. Advances in Neural Information Processing Systems. Neural Information Processing Systems Foundation.
  42. Phillips JB (2013) Evolution of eukaryotic transfer ribonucleic acid. Dissertation, University of California, Merced. ProQuest ID: Phillips_ucmerced_1660D_10055. Merritt ID: ark:/13030/m5n600vj. https://escholarship.org/uc/item/8np368qs accessed 2 Jan 2021
  43. Plummer M, Best N, Cowles K, Vines K. CODA: convergence diagnosis and output analysis for MCMC. R News. 2006;6:7–11. [Google Scholar]
  44. R Core Team . R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2013. [Google Scholar]
  45. Rodriguez F, Oliver JL, Marin A, Medina JR. The general stochastic model of nucleotide substitution. J Theor Biol. 1990;142:485–501. doi: 10.1016/s0022-5193(05)80104-3. [DOI] [PubMed] [Google Scholar]
  46. Rogers HH, Griffiths-Jones S. tRNA anticodon shifts in eukaryotic genomes. RNA. 2014;20:269–281. doi: 10.1261/rna.041681.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Rogers HH, Bergman CM, Griffiths-Jones S. The evolution of tRNA genes in Drosophila. Genome Biol Evol. 2010;2:467–477. doi: 10.1093/gbe/evq034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61:539–542. doi: 10.1093/sysbio/sys029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Sabi R, Volvovitch Daniel R, Tuller T. stAIcalc: tRNA adaptation index calculator based on species-specific weights. Bioinformatics. 2017;33:589–591. doi: 10.1093/bioinformatics/btw647. [DOI] [PubMed] [Google Scholar]
  50. Saks ME, Sampson JR, Abelson J. Evolution of a transfer RNA gene through a point mutation in the anticodon. Science. 1998;279:1665–1670. doi: 10.1126/science.279.5357.1665. [DOI] [PubMed] [Google Scholar]
  51. Sarkar D. Lattice: multivariate data visualization with R. New York: Springer; 2008. [Google Scholar]
  52. Schneider TD, Stormo GD, Gold L, Ehrenfeucht A. Information content of binding sites on nucleotide sequences. J Mol Biol. 1986;188:415–431. doi: 10.1016/0022-2836(86)90165-8. [DOI] [PubMed] [Google Scholar]
  53. Scott JG, et al. Genome of the house fly, Musca domestica L., a global vector of diseases with adaptations to a septic environment. Genome Biol. 2014;15:466. doi: 10.1186/s13059-014-0466-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Sethi A, Eargle J, Black AA, Luthey-Schulten Z. Dynamical networks in tRNA:protein complexes. Proc Natl Acad Sci USA. 2009;106:6620–6625. doi: 10.1073/pnas.0810961106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Shi H, Moore PB. The crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution: a classic structure revisited. RNA. 2000;6:1091–1105. doi: 10.1017/s1355838200000364. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 1998;26:148–153. doi: 10.1093/nar/26.1.148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Stajich JE, Hahn MW. Disentangling the effects of demography and selection in human history. Mol Biol Evol. 2005;22:63–73. doi: 10.1093/molbev/msh252. [DOI] [PubMed] [Google Scholar]
  58. Stajich JE, Block D, et al. The BioPerl toolkit: Perl modules for the life sciences. Genome Res. 2002;12:1611–1618. doi: 10.1101/gr.361602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Tamaki S, Tomita M, Suzuki H, Kanai A. Systematic analysis of the binding surfaces between tRNAs and their respective aminoacyl tRNA synthetase based on structural and evolutionary data. Front Genet. 2018;8:227. doi: 10.3389/fgene.2017.00227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Tamura K, Subramanian S, Kumar S. Temporal patterns of fruit fly (Drosophila) evolution revealed by mutation clocks. Mol Biol Evol. 2003;21:36–44. doi: 10.1093/molbev/msg236. [DOI] [PubMed] [Google Scholar]
  61. Tåquist H, Cui Y, Ardell DH. TFAM 1.0: an online tRNA function classifier. Nucleic Acids Res. 2007;35:W350–W353. doi: 10.1093/nar/gkm393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Tavaré S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lectures Math Life Sci. 1986;17:57–86. [Google Scholar]
  63. Tremblay-Savard O, Benzaid B, Lang BF, El-Mabrouk N. Evolution of tRNA repertoires in Bacillus inferred with OrthoAlign. Mol Biol Evol. 2015;32:1–7. doi: 10.1093/molbev/msv029. [DOI] [PubMed] [Google Scholar]
  64. Tweedie S, et al. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Res. 2009;37:D555–D559. doi: 10.1093/nar/gkn788. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Velandia-Huerto CA, Berkemer SJ, Hoffmann A, Retzlaff N, Romero Marroquín LC, Hernández-Rosales M, Stadler PF, Bermúdez-Santana CI. Orthologs, turn-over, and remolding of tRNAs in primates and fruit flies. BMC Genomics. 2016;17:617. doi: 10.1186/s12864-016-2927-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Wiegmann BM, Yeates DK, Thorne JL, Kishino H. Time flies, a new molecular time-scale for brachyceran fly evolution without a clock. Syst Biol. 2003;52:745–756. [PubMed] [Google Scholar]
  67. Wolfson AD, LaRiviere FJ, Pleiss JA, Dale T, Asahara H, Uhlenbeck OC (2001) tRNA Conformity. In: The ribosome. CSHL Press, pp 185–193 [DOI] [PubMed]
  68. Zamudio GS, José MV. Identity elements of tRNA as derived from information analysis. Orig Life Evol Biosph. 2018;48(1):73–81. doi: 10.1007/s11084-017-9541-6. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Data Availability Statement

All code and data required to produce results in this work have been deposited to FigShare and freely available at https://doi.org/10.6084/m9.figshare.12713705.v1.


Articles from Journal of Molecular Evolution are provided here courtesy of Springer

RESOURCES