RNA Biology. 2020 Apr 22;17(7):1001–1008. doi: 10.1080/15476286.2020.1748922

The rare lncRNA GOLLD is widespread and structurally conserved among Mycobacterium tRNA arrays

Sergio Morgado a,✉,*, Deborah Antunes b,*, Ernesto Caffarena c, Ana Carolina Vicente a
PMCID: PMC7549688  PMID: 32275844

ABSTRACT

Noncoding RNA (ncRNA) genes produce transcripts involved in a wide range of functions, including catalytic and regulatory functions. Some of these transcripts have highly complex structures that may affect their activities. Among the largest bacterial ncRNAs is the rare GOLLD RNA, which is associated with tRNA genes and is thought to be chromosome- and phage-encoded in specialized groups of bacteria, including those from the Lactobacillales and Actinomycetales orders. The only available GOLLD structure was inferred from a variety of sequences, including many from marine metagenomes. To explore GOLLD RNA in bacterial genomes, we mined the GOLLD gene in thousands of Mycobacterium and virus genomes using Infernal software. We identified this gene in 350 mycobacteria, including megaplasmids, and 39 bacteriophages, mainly in the genomic context of tRNA arrays. Mycobacterium GOLLD genes presented high diversity and were distributed in three phylogenetic groups: (i) Mycobacterium exclusive; (ii) Mycobacterium and mycobacteriophages; and (iii) mycobacteriophage exclusive. We also determined the GOLLD secondary structure of each group using R2R software, based on GOLLD alignments generated by Infernal. All GOLLD groups displayed a conserved 3ʹ half structure, including all E-loop and pseudoknot substructures, which is also shared by non-Mycobacterium GOLLD, whereas the 5ʹ half motif differed among the groups. Here, we showed that the lncRNA GOLLD is widespread in Mycobacterium within tRNA arrays and corroborated the previously predicted GOLLD secondary structure.

KEYWORDS: Mycobacterium, HNH endonuclease, lncRNA GOLLD, tRNA array, GOLLD secondary structure

Introduction

In recent years, the association of ncRNAs with adaptation processes in all domains of life has led to an interest in these molecules, with a consequent increase in the surveillance and identification of genes encoding ncRNAs [1–5]. In the Bacteria domain, many ncRNAs are small RNAs (~50–300 nt) that act mainly on post-transcriptional events [6,7]. On the other hand, longer ncRNAs, greater than 200 nt, are rare and involved in information transfer, metabolism, and physiological adaptation [4]. Although some ncRNAs are diverse in nucleotide sequence, they share a highly conserved secondary structure, suggesting that these RNAs are selected for structure rather than for gene sequence [8–10].

The GOLLD RNA (Giant, Ornate, Lake- and Lactobacillales-Derived) is among the largest bacterial ncRNAs (~800 nt) identified so far [11]. It presents a conserved, complex secondary structure characterized by two distinct domains: the highly conserved and prevalent 3ʹ domain and the variable 5ʹ domain. The GOLLD RNA gene was initially identified in bacteria from the Bacteroidetes, Firmicutes, and Verrucomicrobia phyla, as well as in environmental metagenomes. Interestingly, in most of these bacteria, the GOLLD gene is surrounded by tRNA genes [4]. In 2019, we performed searches in the Rfam database [12] and observed that GOLLD sequences had been assigned to six bacterial phyla: Actinobacteria (n = 5), Bacteroidetes (n = 2), Firmicutes (n = 7), Nitrospirae (n = 1), Proteobacteria (n = 3), and Verrucomicrobia (n = 1). Besides bacteria, GOLLD sequences were predicted in bacteriophages, including Streptococcus (n = 5), Mycobacterium (n = 3), Caulobacter (n = 1), and Acinetobacter (n = 1) phages. Although GOLLD has been identified in many bacterial phyla and bacteriophages, several aspects of its functionality still need to be elucidated [4].

Until recently, only one Mycobacterium genome (Mycobacterium mageritense CIP 104973) was assigned in the Rfam database as harbouring the GOLLD gene. However, a 2019 study exploring tRNA and ncRNA genes in Mycobacterium clades revealed GOLLD RNA genes in three other Mycobacterium species [13]. Similarly, our group had identified genes encoding ncRNAs in several Mycobacterium species and mycobacteriophages, some of them annotated as GOLLD RNA [14]. Here, based on these findings and aiming to explore the global scenario of GOLLD RNA in the Mycobacterium genus, we performed a comprehensive search for the GOLLD primary sequence in Mycobacterium and virus genomes available in public databases. We characterized the GOLLD gene distribution and its genetic context in several other Mycobacterium species and established the genetic relationship among the different GOLLD alleles. We also determined the secondary structure of each GOLLD group. Despite the allelic diversity of the GOLLD gene, its overall secondary structure is conserved. Moreover, in vitro experiments showed the expression of the GOLLD RNA gene carried by the Mycobacterium sp. CBMA213 strain.

Results

GOLLD detection

We found GOLLD sequences in 389 genomes belonging to Mycobacterium and viruses, of which 350 were Mycobacterium and 39 were bacteriophage genomes (Supplementary File 1). The cmsearch program revealed an unexpectedly high prevalence of GOLLD sequences in this bacterial genus. The raw cmsearch output is provided as supplemental data (Supplementary File 2). Interestingly, the GOLLD RNA gene was present in distinct bacterial genomic compartments, chromosome and plasmid, and was mainly located (98%) within regions with high tRNA gene density (Supplementary Table 1), termed tRNA arrays [14]. Since most of the GOLLD sequences were identified in tRNA arrays, these regions were tested for possible false-positive GOLLD identifications due to their sequence composition. In this negative control test, the tRNA array regions were extracted from each GOLLD-positive genome, dinucleotide shuffled, and submitted to gene prediction. Infernal did not predict any GOLLD sequence in the shuffled regions, and Prokka did not predict any tRNA gene. These results strengthen the confidence in the predictions of these genes in the original sequences. The predicted size of the GOLLD gene ranged from ~540 to 800 bp, and most genomes had a single copy of this gene, although four genomes had two copies (each GOLLD gene in a distinct tRNA array). Many Mycobacterium species, besides those previously mentioned, were found to carry GOLLD RNA genes: Mycobacterium abscessus complex (including M. abscessus sensu stricto, Mycobacterium massiliense, and Mycobacterium bolletii), Mycobacterium alsense, Mycobacterium chelonae, Mycobacterium conceptionense, Mycobacterium immunogenum, Mycobacterium koreense, and Mycobacterium saopaulense, besides 12 Mycobacterium spp. genomes. The M. abscessus complex and its sister species (M. chelonae, M. immunogenum, and M. saopaulense) accounted for most of the genomes harbouring GOLLD gene sequences, 334/350 genomes.
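As an illustration of what the dinucleotide-shuffled negative control preserves, the sketch below counts overlapping dinucleotides and checks whether two sequences have the same dinucleotide composition. It is not the MEME suite shuffler used in the study; the function names and the toy sequences are hypothetical and serve only to show the property being controlled for.

```python
from collections import Counter


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides (windows of length 2) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def same_dinucleotide_composition(a: str, b: str) -> bool:
    """Return True if both sequences have identical dinucleotide counts."""
    return dinucleotide_counts(a) == dinucleotide_counts(b)


# Toy sequences (hypothetical): all three have the same base composition,
# but only the first two share the same dinucleotide composition.
original = "AGACATA"
dinucleotide_preserving = "ACAGATA"
mononucleotide_only = "AAGCATA"

print(same_dinucleotide_composition(original, dinucleotide_preserving))
print(same_dinucleotide_composition(original, mononucleotide_only))
```

A shuffle that preserves dinucleotide counts keeps local sequence composition while destroying any real gene signal, which is why the absence of hits in such sequences argues against composition-driven false positives.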

GOLLD gene phylogeny

We performed a phylogenetic analysis based on a representative set of the 389 GOLLD gene sequences (sequences with 100% identity were filtered out), in addition to Rfam sequences designated as GOLLD. This analysis revealed three main clusters, two comprising mainly GOLLD sequences from marine metagenomes, and the third presenting subclusters encompassing sequences from the Mycobacterium and Gordonia genera (Actinobacteria phylum) and the Streptococcus genus (Firmicutes phylum) (Fig. 1). Notably, the Mycobacterium GOLLD sequences were grouped into three subclusters: (1) Mycobacterium exclusive, (2) Mycobacterium and mycobacteriophage, and (3) mycobacteriophage exclusive. This phylogenetic analysis showed some deep branches supported by low bootstrap values, reflecting the overall diversity of the GOLLD nucleotide sequence despite its conserved secondary structure, as occurs with other structured ncRNA molecules. GOLLD RNA gene sequences from Gordonia strains were also observed within the Mycobacterium cluster. However, their genomic context is quite distinct: the Gordonia GOLLD genes lie near only a few tRNA genes, whereas tRNA arrays surround most of the Mycobacterium GOLLD genes.

Figure 1.


GOLLD gene phylogenetic tree. Coloured labels represent the genomes with more than one GOLLD gene (red) and with the GOLLD gene within a plasmid (orange). Bootstrap values ≥70 are indicated by the green circles in the branches.

GOLLD genetic context

In most Mycobacterium genomes, the GOLLD gene was in the context of tRNA arrays and close to HNH endonuclease genes, except in a few genomes. Even in these exceptions, it was possible to identify tRNA arrays without the GOLLD gene, as in the genomes of M. mageritense CIP 104973 and M. sp. SWH-M3, in which one or two tRNA genes flanked the GOLLD gene. These latter genomes share a similar GOLLD genetic context, and a transposase gene is predicted to be inserted into the GOLLD variable region of M. mageritense (Fig. 2). Moreover, the M. mageritense tRNA array shares a similar organization with those of many other genomes, including M. sp. CBMA213, in addition to an HNH endonuclease: [… tRNAThr(tgt) – tRNAThr(ggt) – tRNAGly(gcc) – tRNALeu(cag/tag) – tRNAAsn(gtt) – tRNAAla(tgc) – HNH endonuclease …]. However, the tRNA array of M. mageritense lacks the GOLLD RNA gene (Fig. 3). These observations are evidence of insertion/deletion events involving the GOLLD gene. In the M. abscessus-chelonae complex strains, it was possible to identify the GOLLD RNA gene in two distinct genetic contexts based on the tRNA array composition: (i) M. abscessus-chelonae with other mycobacterial species, and (ii) M. abscessus-chelonae with mycobacteriophages (Figs. 4–5).

Figure 2.


Comparison of GOLLD genetic context between M. mageritense and M. sp. SWH-M3. The transposase gene predicted within the GOLLD gene of M. mageritense is located at position 175–321 nt of the gene.

Figure 3.


Comparison of tRNA array segment between M. mageritense and M. sp. CBMA213.

Figure 4.


Comparison of GOLLD genetic context between M. sp. CBMA213 and M. abscessus 1058. The tRNA gene predicted within the GOLLD gene of M. abscessus 1058 is located at position 68–145 nt of the gene.

Figure 5.


Comparison of the GOLLD genetic context between M. chelonae B3 S15 and mycobacteriophage Bongo.

GOLLD in bacteriophage genomes

Thirty-nine bacteriophages harbouring the GOLLD gene were identified, isolated from several bacterial genera: Mycobacterium (n = 18), Caulobacter (n = 10), Streptococcus (n = 8), Acinetobacter (n = 2), and Roseobacter (n = 1). All these phages are members of the Caudovirales order and the Siphoviridae family, except the Acinetobacter phages, which belong to the Myoviridae family. The GOLLD position in these genomes was variable: within tRNA arrays, or adjacent (or not) to a few tRNA genes.

The mycobacteriophages belong to two phylogenetic groups: cluster M (n = 11) and cluster R (n = 7). Interestingly, the phylogeny of the mycobacteriophage GOLLD genes revealed two clusters that correspond to the M and R clusters (Fig. 1). Moreover, the predicted size of their GOLLD gene was ~520–760 bp in cluster M and ~570 bp in cluster R. In the cluster M mycobacteriophages, the GOLLD gene was within tRNA arrays next to an HNH endonuclease gene. In contrast, cluster R mycobacteriophages do not harbour tRNA genes, and a set of protein-encoding genes is close to the GOLLD gene (O-methyltransferase, DNA helicase, exonuclease, AAA-ATPase and DNA polymerase genes, besides hypothetical ones) (Fig. 6). The GOLLD genetic context of the Streptococcus phages is characterized by a region with phage structural genes, such as terminase, portal, and head morphogenesis genes. In addition, the JX01, KYGO9, and phiARI0746 phages harbour a tRNASer(gct) next to the GOLLD gene, while the others do not (Fig. 7).
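The counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative.

```python
# Counts as reported in the text; the dictionary names are illustrative only.
phage_hosts = {
    "Mycobacterium": 18,
    "Caulobacter": 10,
    "Streptococcus": 8,
    "Acinetobacter": 2,
    "Roseobacter": 1,
}
mycobacteriophage_clusters = {"M": 11, "R": 7}
mycobacterium_genomes = 350

total_phages = sum(phage_hosts.values())
total_mycobacteriophages = sum(mycobacteriophage_clusters.values())

print(f"GOLLD-positive phages: {total_phages}")
print(f"Mycobacteriophages in clusters M and R: {total_mycobacteriophages}")
print(f"GOLLD-positive genomes overall: {mycobacterium_genomes + total_phages}")
```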

Figure 6.


Comparison of GOLLD genetic context between mycobacteriophages Bongo (cluster M) and Papyrus (cluster R).

Figure 7.


Comparison of the GOLLD genetic context between Streptococcus phages.

GOLLD secondary structure

We built the secondary structure for the three Mycobacterium GOLLD subclusters, including a consensus for each of them and a general one. The arrangement described for GOLLD RNAs includes many substructures and different 5ʹ and 3ʹ motifs. The 3ʹ half is highly conserved, contains two E-loop and four pseudoknot substructures, and is found throughout GOLLD RNAs, while the 5ʹ half appears to diverge into variants in which some substructures are absent or substituted, including three GNRA tetraloops and one pseudoknot substructure [11]. Consistent with the original report for GOLLD, all subclusters display a conserved 3ʹ half structure, including all E-loop and pseudoknot substructures (Fig. 8 and Supplementary Figures S1-S3). However, the 5ʹ half motif differs among the subclusters. Only in subcluster 2 is the first multi-stem junction of the 5ʹ half motif conserved, containing one GNRA tetraloop and pseudoknot substructures (Supplementary Figure S2). Alignments and consensus secondary structures of Mycobacterium GOLLD RNAs are provided as supplemental data (Supplementary File 3).

Figure 8.


Consensus sequence and secondary structure model for Mycobacterium GOLLD RNAs. The conserved first multi-junction of the 5ʹ half motif (GNRA tetraloop and pseudoknot) was extracted from information given by subcluster 2.

GOLLD RT-PCR

Using real-time RT-PCR on the M. sp. CBMA213 strain, we showed that the GOLLD RNA and HNH endonuclease genes, both embedded in a tRNA array within a plasmid, are significantly expressed at approximately the same level (Supplementary Figure S4).

Discussion

So far, GOLLD is one of the largest ncRNAs identified, having been observed in only two out of a dozen bacterial phyla. Weinberg et al. (2009) explored this lncRNA in bacteria from the Lactobacillales order (Firmicutes phylum) and in metagenomes, determining its secondary structure [11]. Recently, GOLLD was found in three Mycobacterium species (Actinobacteria phylum) [13]. To unravel the current scenario of the GOLLD RNA gene in the Mycobacterium genus, we mined thousands of Mycobacterium and virus genomes. As a result, we showed that this supposedly rare lncRNA is widespread among Mycobacterium species and bacteriophage genomes, presenting a high number of alleles. Despite this diversity, we demonstrated that the overall GOLLD RNA structure displays a common conserved architecture. Here we also introduced an expanded list of organisms from several other bacterial phyla that harbour this type of ncRNA. Currently, in Rfam, GOLLD sequences are assigned to 19 bacteria and ten viruses, besides marine metagenomes. These bacteria belong to 18 species from the Firmicutes, Bacteroidetes, Verrucomicrobia, Actinobacteria, Proteobacteria, and Nitrospirae phyla. The Actinobacteria phylum is represented by only four Gordonia genomes and one Mycobacterium genome. Eight of the ten viruses belong to the Caudovirales order. This study revealed GOLLD sequences in at least 350 Mycobacterium genomes, encompassing 12 Mycobacterium spp. and eight species, most of them from the M. abscessus complex and its sister species. The prevalence of GOLLD in the M. abscessus complex genomes may be biased by the clinical impact of these species and the consequent abundance of their genomes [15]. Despite the large number of mycobacterial genomes carrying the GOLLD gene, it is not possible to state that the gene is overrepresented in the genus, since there has been no extensive search for GOLLD in other organisms. Concerning GOLLD sequences associated with viral genomes, their number increased from 10 to 39, updating the list of bacteriophages from hosts already known to harbour GOLLD, such as Mycobacterium, Caulobacter, Acinetobacter, and Streptococcus phages, and adding a new one, a Roseobacter phage.

Here, for the first time, the GOLLD gene was identified and characterized in the context of plasmids, which, in addition to bacteriophages, may be vectors spreading this gene among organisms. It has also been observed that the GOLLD gene is often close to tRNA genes, as already reported [4,13]. Furthermore, in this study, considering the data set used, we found that in most Mycobacterium genomes the GOLLD gene is located within tRNA arrays carried on chromosomes, plasmids, and mycobacteriophages. Previously, it has been shown that tRNA arrays are prevalent in the Mycobacterium genus [14], and their dispersion is associated with horizontal gene transfer [14,16]. Besides being vectors of tRNA arrays, bacteriophages and plasmids are also involved in the dissemination of the GOLLD gene within Mycobacterium. Therefore, this gene may have a wider prevalence than previously assumed [11]. Some tRNA arrays carrying the GOLLD gene presented a tRNA isotype synteny similar to that of tRNA arrays lacking the GOLLD gene, which is further evidence of GOLLD gene mobility. GOLLD RNA could itself be a selfish genetic element, as already suggested [4]. tRNA arrays may be organized as operons, and in this case, the presence of the GOLLD gene in this structure would favour its expression. The operon organization of a tRNA array was shown to influence tRNA expression at high rates [17]. GOLLD RNA expression has already been reported in a Lactobacillus brevis phage [11] and in Mycobacterium aubagnense [13], and has also been demonstrated here in the Mycobacterium sp. CBMA213 strain. Interestingly, the HNH endonuclease gene, located near the GOLLD gene, is transcribed at the same level as GOLLD in M. sp. CBMA213, supporting the operon organization of this structure. However, this GOLLD expression cannot be considered indicative of a functional role; further studies are required to determine it.

tRNA and HNH endonuclease genes are occasionally found embedded in some large RNA genes, such as GOLLD and HEARO, respectively [11,13]. Here we showed a tRNA gene within the GOLLD gene of some M. abscessus genomes and a transposase gene inside the GOLLD variable region of M. mageritense. Although we did not observe predicted HNH endonuclease genes within GOLLD, as occurs in the HEARO ncRNA, they were often observed close to GOLLD genes in tRNA arrays. Curiously, HNH endonuclease and transposase genes play a role in horizontal transfer events [18]. As evidence of their independent mobility, the transposase gene embedded within the M. mageritense GOLLD has an orientation opposite to that of the GOLLD gene. Therefore, the occasional presence of these genes within large RNA genes could be an archaeological trait showing that insertion/deletion events used to occur in this genomic context.

Among viruses, the GOLLD RNA gene was identified in bacteriophages infecting bacteria of five genera and three phyla (Actinobacteria, Firmicutes, and Proteobacteria). As GOLLD has already been identified in several environmental metagenomes, it seems unlikely that GOLLD RNA is limited to this range of viruses. As this study was based on an in silico approach using covariance models constructed with already identified GOLLD sequences, it was expected that only similar GOLLD sequences would be identified in the data set analysed. Therefore, there may be other GOLLD variants to be revealed in virus as well as in Mycobacterium genomes. This hypothesis is supported by the GOLLD gene tree (Fig. 1), which has some deep branches supported by low bootstrap values. So far, all viruses that have GOLLD genes belong to the Caudovirales order. Interestingly, some of these viruses carry tRNA genes and tRNA arrays [19]. In fact, the cluster M mycobacteriophages are thought to have disseminated the tRNA arrays, and consequently the GOLLD gene located within these arrays, in a set of M. abscessus complex genomes [14]. Therefore, the association of tRNA arrays with the GOLLD gene within Mycobacterium may simply be the effect of their dissemination by mycobacteriophages. In the original study describing GOLLD RNA, it was found to be co-expressed with phage particles and presumably crucial to the phage lytic process [11]. Here, we observed that the Streptococcus phages presented the GOLLD gene close to their structural genes. The proximity to highly expressed genes could favour high levels of GOLLD transcripts.

ncRNAs belonging to the same class share sequence and structural features that have been conserved throughout various evolutionary processes [20]. For GOLLD, Weinberg et al. (2009) described that three sequences, including that of Lactobacillus brevis, share the first multistem junction of the 5ʹ half and the complete 3ʹ half motifs [11]. Here, we corroborate these outcomes and expand the data set with 49 new Mycobacterium sequences sharing the same structural architecture. Additionally, all Mycobacterium sequences reported here present a complete conserved 3ʹ half motif despite the allelic diversity. These findings reinforce that these structures may be directly involved in tertiary RNA contacts and provide robustness to a probable GOLLD RNA functionality, opening an avenue for GOLLD research in other organisms.

Material and methods

Sequences

GOLLD sequences were sought in 7670 complete and draft genomes of the Mycobacterium genus and 14358 virus sequences retrieved from the National Centre for Biotechnology Information (NCBI) ftp site (Dec-2017). Additionally, we investigated 2910 sequences from Actinobacteriophages (https://phagesdb.org/) (Feb-2019).

GOLLD identification and phylogeny

GOLLD sequences were identified using the Infernal v1.1.2 package [9]; the GOLLD alignment, in Stockholm format, was obtained from Weinberg et al. (2009) [11]. The Infernal cmbuild and cmcalibrate tools were used to build and calibrate the covariance model (CM), respectively. This CM was used in the cmsearch tool, with default parameters, to search for GOLLD genes in the genome dataset, and sequences above the inclusion threshold were selected. We decided to generate an in-house CM because the one provided by Rfam contains only the conserved 3ʹ domain (consensus length of 438 bp) instead of the complete sequence (with 5ʹ and 3ʹ domains) provided by Weinberg et al. (2009). Thus, we could perform analyses with longer GOLLD sequences (consensus length of 613 bp). The false-positive rate of the gene prediction was assessed by dinucleotide shuffling of the tRNA array regions using the fasta-dinucleotide-shuffle program from the MEME suite [21].
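The selection of hits above the inclusion threshold can be sketched as follows, assuming the search results were written in Infernal's tabular output (the cmsearch --tblout option). The column positions used here follow our reading of the Infernal 1.1 default table layout, in which the "inc" column holds "!" for hits that meet the inclusion threshold; they should be checked against the Infernal user guide. The function name, record type, and example rows are hypothetical and are not data from this study.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    target: str
    seq_from: int
    seq_to: int
    strand: str
    score: float
    evalue: float
    included: bool


def parse_tblout(text: str) -> List[Hit]:
    """Parse cmsearch tabular output (assumed default 18-column layout)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # The last column (description) may contain spaces, so split at most 17 times.
        f = line.split(None, 17)
        hits.append(
            Hit(
                target=f[0],
                seq_from=int(f[7]),
                seq_to=int(f[8]),
                strand=f[9],
                score=float(f[14]),
                evalue=float(f[15]),
                included=(f[16] == "!"),  # "!" marks hits above the inclusion threshold
            )
        )
    return hits


# Hypothetical rows for illustration only.
example = """\
#target name accession query name accession mdl mdl from mdl to seq from seq to strand trunc pass gc bias score E-value inc description of target
contig_1 - GOLLD - cm 1 613 1200 1790 + no 1 0.55 0.0 310.5 1.2e-90 ! hypothetical contig
contig_2 - GOLLD - cm 40 300 5000 4750 - no 1 0.60 0.1 25.3 0.5 ? hypothetical contig
"""

selected = [h for h in parse_tblout(example) if h.included]
for h in selected:
    length = abs(h.seq_to - h.seq_from) + 1
    print(h.target, h.strand, length, h.evalue)
```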

A representative set of sequences was used to generate the gene tree. This set was made by filtering out the redundant sequences (100% identity) with CD-HIT-EST v4.7 [22]. After this filtering, sequences were aligned using MAFFT v7.271 [23] and submitted to Seaview v4.7 [24] to generate the maximum likelihood tree using the GTR substitution model and 100 bootstrap replications. All processes were carried out using default parameters. The GOLLD gene tree was drawn using iTOL [25], and its genetic context was annotated using Prokka v1.12 [26] with the bacterial genetic code and the --rfam option. EasyFig [27] and Inkscape (http://www.inkscape.org) were used to create figures for the genetic context.
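The idea behind the redundancy filter can be illustrated with a simplified stand-in that keeps the first occurrence of each exactly identical sequence. This is not CD-HIT-EST's clustering algorithm, which can also handle sequences of different lengths; the function name and the toy records are hypothetical.

```python
from typing import Dict


def drop_identical(records: Dict[str, str]) -> Dict[str, str]:
    """Keep the first record of each exactly identical sequence (case-insensitive)."""
    seen = set()
    kept = {}
    for name, seq in records.items():
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept[name] = seq
    return kept


# Toy records (hypothetical): g2 duplicates g1 and is dropped.
records = {
    "g1": "ACGTTGCA",
    "g2": "acgttgca",
    "g3": "ACGTTGCC",
}
representatives = drop_identical(records)
print(list(representatives))
```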

Secondary structure analysis

A secondary structure was constructed for each Mycobacterium GOLLD group identified in the phylogenetic analysis. The Mycobacterium GOLLD alignment and secondary structure patterns were built with Infernal cmalign, with default parameters, and refined manually. Secondary structure models were drawn using R2R software v1.0.6 [2] and Inkscape (http://www.inkscape.org).

Bacterial strain and growing conditions

The Mycobacterium sp. CBMA213 strain employed in this study for in vitro analyses was isolated from Atlantic Forest soil and deposited in the Bacteria Collection of Environment and Health (CBAS, Fiocruz Institute-Brazil). This strain was grown in Tryptic Soy Broth (TSB) medium for five days at 25°C.

GOLLD transcription analysis

The total RNA from M. sp. CBMA213 strain was isolated using NucleoSpin RNA II kit (Macherey-Nagel) following the manufacturer’s instructions. The isolated RNA was treated with Turbo DNA-free reagent (Ambion, Applied Biosystems) to eliminate contaminant genomic DNA. RNA samples were quantified using a NanoDrop ND 1000 spectrophotometer and submitted to cDNA synthesis with SuperScript III RNase H-reverse transcriptase (Invitrogen). The produced cDNA was used as a template in real-time RT-PCR assay using Power-SYBR Green PCR Master Mix (Applied Biosystems). The cycling was performed at 95°C for 10 min; followed by 40 cycles of denaturation at 95°C for 15 s and annealing/extension at 60°C for 60 s. The GOLLD primers aimed at the conserved GOLLD 3ʹ domain, being: TGAAGCATTCTGTGGCTCAA (forward) and CTTGGTAGCACGTCGGATTT (reverse); while the HNH endonuclease primers were: TCGCTGATGCTGGTGAGATA (forward) and GCCGACGAACTGACATTGA (reverse).
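As a quick sanity check on the primers listed above, the sketch below reports the length and GC fraction of each oligonucleotide. The primer sequences are copied from the text; the helper function and dictionary keys are illustrative, and no melting temperature is estimated because that would require assumptions about the reaction conditions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as given in the text; the keys are illustrative labels.
primers = {
    "GOLLD_forward": "TGAAGCATTCTGTGGCTCAA",
    "GOLLD_reverse": "CTTGGTAGCACGTCGGATTT",
    "HNH_forward": "TCGCTGATGCTGGTGAGATA",
    "HNH_reverse": "GCCGACGAACTGACATTGA",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")
```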


Acknowledgments

We are particularly grateful to Dr. Érica Fonseca for supporting the in vitro assays.

Funding Statement

This work was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -Brasil (CAPES) - Finance Code 001, and Oswaldo Cruz Institute grants.

Disclosure of Potential Conflicts of Interest

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed here.

References

  • [1] Yusuf D, Marz M, Stadler PF, et al. Bcheck: a wrapper tool for detecting RNase P RNA genes. BMC Genomics. 2010;11:432.
  • [2] Weinberg Z, Breaker RR. R2R–software to speed the depiction of aesthetic consensus RNA secondary structures. BMC Bioinformatics. 2011;12:3.
  • [3] Washietl S, Will S, Hendrix DA, et al. Computational analysis of noncoding RNAs. Wiley Interdiscip Rev RNA. 2012;3(6):759–778.
  • [4] Harris KA, Breaker RR. Large noncoding RNAs in bacteria. Microbiol Spectr. 2018;6(4). DOI: 10.1128/microbiolspec.RWR-0005-2017
  • [5] Gelsinger DR, DiRuggiero J. The noncoding regulatory RNA revolution in archaea. Genes (Basel). 2018;9(3):141.
  • [6] Storz G, Vogel J, Wassarman KM. Regulation by small RNAs in bacteria: expanding frontiers. Mol Cell. 2011;43(6):880–891.
  • [7] Buskila AA, Kannaiah S, Amster-Choder O. RNA localization in bacteria. RNA Biol. 2018;11(8):1051–1060.
  • [8] Saito Y, Sato K, Sakakibara Y. Fast and accurate clustering of noncoding RNAs using ensembles of sequence alignments and secondary structures. BMC Bioinformatics. 2011;12(Suppl 1):S48.
  • [9] Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29(22):2933–2935.
  • [10] Vandivier LE, Anderson SJ, Foley SW, et al. The conservation and function of RNA secondary structure in plants. Annu Rev Plant Biol. 2016;67:463–488.
  • [11] Weinberg Z, Perreault J, Meyer MM, et al. Exceptional structured noncoding RNAs revealed by bacterial metagenome analysis. Nature. 2009;462(7273):656–659.
  • [12] Kalvari I, Nawrocki EP, Argasinska J, et al. Noncoding RNA analysis using the Rfam database. Curr Protoc Bioinformatics. 2018;62(1):e51.
  • [13] Behra PRK, Pettersson BMF, Das S, et al. Comparative genomics of Mycobacterium mucogenicum and Mycobacterium neoaurum clade members emphasizing tRNA and noncoding RNA. BMC Evol Biol. 2019;19(1):124.
  • [14] Morgado SM, Vicente ACP. Beyond the limits: tRNA array units in Mycobacterium genomes. Front Microbiol. 2018;9:1042.
  • [15] Lee MR, Sheng WH, Hung CC, et al. Mycobacterium abscessus complex infections in humans. Emerg Infect Dis. 2015;21(9):1638–1646.
  • [16] Tran TT, Belahbib H, Bonnefoy V, et al. A comprehensive tRNA genomic survey unravels the evolutionary history of tRNA arrays in prokaryotes. Genome Biol Evol. 2015;8(1):282–295.
  • [17] Ardell DH, Kirsebom LA. The genomic pattern of tDNA operon expression in E. coli. PLoS Comput Biol. 2005;1(1):e12.
  • [18] Stoddard BL. Homing endonucleases from mobile group I introns: discovery to genome engineering. Mob DNA. 2014;5(1):7.
  • [19] Morgado S, Vicente AC. Global in-silico scenario of tRNA genes and their organization in virus genomes. Viruses. 2019;11(2):180.
  • [20] Qu Z, Adelson DL. Evolutionary conservation and functional roles of ncRNA. Front Genet. 2012;3:205.
  • [21] Bailey TL, Johnson J, Grant CE, et al. The MEME Suite. Nucleic Acids Res. 2015;43(W1):W39–W49.
  • [22] Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–1659.
  • [23] Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–780.
  • [24] Gouy M, Guindon S, Gascuel O. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol. 2010;27(2):221–224.
  • [25] Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016;44(W1):W242–W245.
  • [26] Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30(14):2068–2069.
  • [27] Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics. 2011;27(7):1009–1010.


Articles from RNA Biology are provided here courtesy of Taylor & Francis
