G3: Genes | Genomes | Genetics. 2025 Nov 22;16(2):jkaf283. doi: 10.1093/g3journal/jkaf283

Brain transcriptome analysis reveals novel long non-coding RNAs in Dryophytes arenicolor (Canyon Treefrog)

Héctor Herrera-Orozco 1,2, Carolina Rodríguez-Ibarra 3, Cristian Ivan Hernández-Herrera 4,5, Clara Estela Díaz-Velásquez 6,7, Felipe Vaca-Paniagua 8,9, Hibraim Adán Pérez-Mendoza 10,✉,2
Editor: A LaBella
PMCID: PMC12869070  PMID: 41271559

Abstract

Anurans possess complex genomes, making molecular research challenging, particularly in non-model species. Transcriptomics offers a powerful tool for uncovering genomic responses to environmental changes through distinct transcriptional patterns. A sizeable portion of the transcriptome consists of non-coding RNAs, with long non-coding RNAs (lncRNAs) playing key roles in gene regulation. Here, we performed de novo transcriptome assembly from brain samples of Dryophytes arenicolor across 3 life stages: Pre-metamorphic, Metamorphic Climax, and Adults. By aligning our transcriptomes to LNCipedia, we identified 4,557 previously annotated lncRNAs with potential roles in gene regulation, macromolecule biosynthesis, and chromatin organization. To detect novel lncRNAs, we implemented a bioinformatic pipeline to filter out known mRNAs, small ncRNAs, sequences with coding potential, conserved protein domains, and previously identified D. arenicolor mRNAs, identifying 4,836 putative novel lncRNAs. To explore their functional roles, we performed Weighted Gene Co-expression Network Analysis (WGCNA) and gene enrichment analysis, revealing that these lncRNAs may be involved in protein heterodimerization, nucleosome assembly, and post-transcriptional regulation of gene expression. To our knowledge, this is the first study to characterize lncRNAs across multiple life stages in D. arenicolor, highlighting their potential regulatory functions.

Keywords: anuran, lncRNAs, transcriptomics, gene regulation, Dryophytes arenicolor

Introduction

Amphibians have the greatest variation in genome size among terrestrial tetrapods, with some salamanders (Necturus) reaching over 100 Gb, whereas anurans, such as Platyplectrum ornatum, possess genomes as small as ∼1 Gb (Lamichhaney et al. 2021). Among anurans, the average genome size is estimated at around 3 Gb, though this is based on a limited number of sequenced species and can reach up to 10 to 12 Gb in the genus Bombina (Kosch et al. 2024; Gregory 2025). Such differences in size and the high percentage of repeated content (up to 82%) make their assembly, annotation, and analysis challenging, particularly for non-model species (Kosch et al. 2025).

For organisms with large, complex genomes, transcriptomics has become a go-to tool, striking a balance between cost-effectiveness and the ability to capture dynamic genomic responses (Birol et al. 2015; Ceschin et al. 2020). RNAseq has revealed intricate expression profiles for diverse biological contexts like sex development in Hoplobatrachus rugulosus (Tang et al. 2021), resistance to body freezing during winter in Dryophytes chrysoscelis (do Amaral et al. 2020), identification of antimicrobial peptides of Boana pugnax (Liscano Martinez et al. 2020) and even the dispersal behavior of Hyla sarda (Libro et al. 2022). Although anuran transcriptomics is an active research field, little is known about the non-coding side of their transcriptional programs.

ENCODE data suggests that around 80% of the genome is actively transcribed into non-coding RNAs (Djebali et al. 2012; Stamatoyannopoulos et al. 2012). Non-coding RNAs (ncRNAs) include a diverse group of transcripts that are not translated into proteins and represent a higher level of gene regulation, as ncRNAs may interact with multiple effectors downstream (Toden et al. 2021). Based on their function, they are broadly divided into housekeeping ncRNAs—such as rRNAs and tRNAs involved in constitutive processes like protein synthesis or RNA modification (Zhang et al. 2022)—and regulatory ncRNAs, which modulate gene expression and are implicated in multiple biological processes (Nemeth et al. 2024) and diseases, including cancer (Yan and Bu 2021).

Regulatory ncRNAs can be further subdivided by size into small ncRNAs (<200 nt) and long ncRNAs (>200 nt). Long ncRNAs (lncRNAs) resemble mRNAs structurally: they are transcribed by RNA polymerase II, capped, and often polyadenylated and spliced (Choudhuri 2023). LncRNAs function as key regulators of multiple biological processes such as nuclear organization, chromatin remodeling, transcriptional regulation, splicing, and mRNA stability (Jafari-Raddani et al. 2022).

The lncRNA repertoire reported for anurans varies by roughly an order of magnitude, from ∼6,000 loci (Hammond et al. 2017) to >50,000 transcripts in Rhinella arenarum (Ceschin et al. 2020) and Pelobates cultripes (Liedtke et al. 2022). In anurans, lncRNAs have been shown to be highly involved in processes such as embryonic development (Forouzmand et al. 2017) and to exhibit deep conservation (Weghorst et al. 2024), and alterations to their expression patterns lead to disrupted signaling pathways (Sai et al. 2019; Qi et al. 2021). However, little is known about the role of lncRNAs in other stages of development during their complex life cycles. We therefore hypothesize that they could be involved in multiple biological processes throughout development, such as cell structure organization, transcription regulation, or metabolism. In this work, we aim to identify both previously annotated and putative novel lncRNAs in various stages of the life cycle of Dryophytes arenicolor and evaluate their biological roles.

Methods

Study species

D. arenicolor is a treefrog measuring 32 to 57 mm in snout-vent length, with a wide distribution that ranges from southern Utah in the United States to the northern regions of Oaxaca, Mexico (Croter 2005). The species inhabits varying climates along its distribution, from pine and deciduous forests to deserts (Hernández-Herrera et al. 2024). It presents morphological variations such as a rough skin that allows it to resist desiccation (Preest et al. 1992), and its tadpoles are resistant to elevated temperatures of up to 32 °C (Zweifel 1926–1968). Its wide distribution and varied phenotypes make this organism a great model to evaluate gene expression patterns, as broad distributions have been correlated with a species' capacity to respond to different environmental stressors, i.e. phenotypic plasticity (Eriksson and Rafajlović 2022).

Study area

Fieldwork was conducted along the intermittent stream located in the Tepotzotlán Sierra (19°46′36.4″N, 99°15′7.6″W) in northern Estado de México. The climate is Temperate Sub-Humid (Cw0) with a summer rain season (García 2004). Altitude ranges between 2,350 and 2,980 m ASL. Local flora consists of grasslands and xeric scrubs, with remnants of oak forests (Hernández-Herrera and Pérez-Mendoza 2021).

Biological material

To shed light on the regulatory roles of lncRNAs in our study group, we selected 3 life cycle stages (Gosner 1960) as comparison points for the expression patterns of lncRNAs: (i) Pre-metamorphic (G25; Pre), (ii) Metamorphic Climax (G42; Climax), and (iii) Adults as the final point for the comparison (Supplementary Fig. 1). The Pre and Climax stages were selected because of the different transcriptional programs reported previously in tail reabsorption (Wang et al. 2019) and brain development (Raj et al. 2023), and the Adult stage was selected because it is the final stage of the anuran life cycle.

Tissue extraction and RNA purification

To reduce potential bias associated with individual samples, we took 3 samples per stage (9 samples in total); each sample was a pool of 3 randomly selected organisms. The organisms were submerged in a 10% lidocaine solution to induce cardiac arrest. Once the organisms were euthanized, we obtained brain tissue from individuals of each life cycle stage and immediately preserved it in RNAlater (ThermoFisher) at −80 °C to stabilize the RNA and avoid degradation.

We placed the brains of the 3 individuals of each pool on a sterile Petri dish, homogenized them with a sterile scalpel, and transferred the lysate to the column of the RNeasy Micro Kit (Qiagen); we isolated the RNA following the manufacturer's instructions. We quantified total RNA concentration using an N80 NanoPhotometer (Implen) and evaluated RNA integrity with 0.8% agarose gels (80 V, 40 min).

Library preparation and sequencing

NOVOGENE performed quality control on all samples including re-quantification, agarose integrity gels (Supplementary Fig. 2), and electropherograms to estimate RNA Integrity Numbers (Supplementary Fig. 3) and detect potential contaminants in the samples. The sequencing protocol consisted of (i) rRNA depletion, (ii) RNA fragmentation by enzymatic digestion, (iii) reverse transcription, (iv) forward library construction, (v) dUTP protocol to obtain reverse strand library, (vi) end repair and adaptor ligation, and (vii) Illumina Paired-End 150 bp sequencing using a NovaSeq 6000 sequencer.

Sequencing quality control

To evaluate the quality of the sequencing step, we used FastQC-v0.12.1 (Babraham Bioinformatics 2019), and we used Trimmomatic-v0.39 (Bolger et al. 2014) to eliminate adapters, low-quality bases, and unpaired reads. The filtering criteria were Phred >25 per nucleotide, mean Phred >25 in a sliding window of 4 nt, and a final minimum length of 40 nt. After the cleaning step, we used FastQC to generate the quality graphs per sample and MultiQC-v1.25.1 (Ewels et al. 2016) to merge them into 1 graph.
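The sliding-window and minimum-length criteria can be illustrated with a minimal Python sketch. This is a simplified illustration and not Trimmomatic's actual implementation: the function names are hypothetical, and the sketch simply cuts the read at the start of the first 4-nt window whose mean Phred score falls below the threshold.

```python
def sliding_window_trim(quals, window=4, min_mean=25):
    """Return how many leading bases are kept.

    The read is cut at the start of the first window whose mean
    Phred score is below min_mean (simplified; hypothetical helper).
    """
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_mean:
            return start
    return len(quals)


def passes_min_length(kept_bases, min_len=40):
    """Apply the final minimum-length criterion (40 nt in the text)."""
    return kept_bases >= min_len


# A read with 50 good bases followed by 10 poor ones.
read_quals = [30] * 50 + [10] * 10
kept = sliding_window_trim(read_quals)
print(kept, passes_min_length(kept))
```

A read that degrades early (for example, 20 good bases followed by low-quality calls) would be cut below 40 nt and discarded by the length filter.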

De novo transcriptome assembly

We assembled the clean reads into a transcriptome per sample using Trinity-v2.15.1 (Haas et al. 2013). We concatenated all the individual transcriptomes per stage to create a reference transcriptome and then concatenated all the reference transcriptomes to obtain a global reference transcriptome. To remove redundancies and maintain all unique transcripts in our reference transcriptomes we used CD-HIT-v4.8.1 (Fu et al. 2012) with a 95% identity threshold to merge sequences, minimizing information loss.

Transcriptome quality control

To assess the quality of both the individual transcriptomes and the reference assemblies, we used the Trinity auxiliary toolset to quantify a variety of assembly statistics. These included transcriptome size, N50, average and median contig lengths, and Ex90N50 values. To evaluate how well the original reads were represented in each assembled transcriptome, we aligned the raw reads back to their respective de novo assemblies using Bowtie2 v2.5.4 (Langmead and Salzberg 2012). Additionally, we assessed transcriptome completeness by quantifying the presence of Benchmark Universal Single-Copy Orthologs (BUSCOs) using BUSCO v5.8.1 (Manni et al. 2021) with the Eukaryota_odb10, Vertebrata_odb10, and Tetrapoda_odb10 lineage datasets. For statistical analyses, we used the car package v3.1.3 (Fox and Weisberg 2018) to check homogeneity of variances with a Levene test and found that the assumption was met (Fig. 1a, genes: F = 0.2396, P = 0.7941; transcripts: F = 0.2755, P = 0.7863; Fig. 1c, contig average: F = 0.0661, P = 0.9368; median length: F = 0.7632, P = 0.5066; N50: F = 0.1926, P = 0.8267). We then used R-v4.3.1 (R Core Team 2023) to perform 1-factor ANOVAs and the Tidyverse-v2.0.0 package (Wickham et al. 2019) for data wrangling and visualization.
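For readers unfamiliar with the contiguity statistics mentioned above, the following minimal Python sketch shows how N50, mean, and median contig length are defined. It is an illustration only and not the Trinity auxiliary script; the contig lengths are made-up values.

```python
from statistics import mean, median


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths (bp), for illustration only.
contigs = [200, 300, 400, 500, 600, 700, 800, 900, 1000]
print(n50(contigs), mean(contigs), median(contigs))
```

Because N50 weights contigs by their length, it is usually larger than the median when an assembly contains many short contigs.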

Fig. 1.


De novo transcriptome size and quality statistics. a) Boxplot of per-sample transcriptome sizes by genes and transcripts. No significant differences were found between stages by genes (F = 1.488, P = 0.299) or transcripts (F = 1.554, P = 0.286). b) Bar plot of reference transcriptome sizes by genes and transcripts. The Climax stage showed the smallest reference transcriptome both by genes (700 K) and transcripts (1.5 M). c) Boxplots of quality stats of individual transcriptomes by mean contig length, median contig length, and N50. No significant differences were found between stages by mean contig length (F = 2.785, P = 0.139), median contig length (F = 0.082, P = 0.923), or N50 (F = 2.883, P = 0.133). d) Bar plot of quality stats of reference transcriptomes. The Climax stage had the lowest mean and median contig length and N50.

Known lncRNAs annotation

To annotate known lncRNAs, we aligned our transcriptomes to the LNCipedia-v5.2 dataset (Volders et al. 2019) using MMseqs2-v15-6f452+ds-2 (Steinegger and Söding 2017). Alignments were filtered to retain only hits with >80% identity (Deng et al. 2018) and an E-value <0.001, as previously reported (Weisman et al. 2020). We used the eulerr package-v7.0.4 (Larsson and Gustafsson 2018) to generate a Venn diagram showing the overlap of annotated lncRNAs among our reference transcriptomes. To investigate the biological roles of these known lncRNAs, we conducted a functional enrichment analysis using g:Profiler2-v0.2.3 (Kolberg et al. 2020). Results with a false discovery rate (FDR) < 0.05 were considered significant.
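The hit-filtering step can be sketched in Python as follows. The record layout, identifiers, and function name are hypothetical and chosen only for illustration; in particular, the sketch assumes identity is expressed as a percentage, which depends on how the alignment output is configured.

```python
from collections import namedtuple

# Hypothetical, simplified alignment record (not a real MMseqs2 output schema).
Hit = namedtuple("Hit", ["query", "target", "pct_identity", "evalue"])


def filter_hits(hits, min_identity=80.0, max_evalue=1e-3):
    """Keep hits with identity > 80% and E-value < 0.001, as in the text."""
    return [h for h in hits
            if h.pct_identity > min_identity and h.evalue < max_evalue]


# Made-up example hits.
hits = [
    Hit("transcript_1", "lncRNA_A", 92.5, 1e-30),
    Hit("transcript_2", "lncRNA_B", 78.0, 1e-12),  # fails identity
    Hit("transcript_3", "lncRNA_C", 85.0, 0.5),    # fails E-value
]
annotated = {h.query for h in filter_hits(hits)}
print(sorted(annotated))
```

Collecting the surviving query identifiers into a set per stage is one simple way to obtain the inputs for a Venn diagram of shared and stage-specific annotations.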

De novo lncRNA identification

To annotate de novo lncRNAs we followed the methodology described previously (Kashyap et al. 2020; Motheramgari et al. 2020). Using the same alignment approach as above, we aligned our transcriptomes and removed protein-coding genes using the UniProt database (release October 2024), small non-coding RNAs (sncRNAs), and the list of known lncRNAs identified above (identity >80%, E-value <0.001).

To eliminate transcripts with coding potential we used CPAT-v3.0.0 (Wang et al. 2013). Using the CPAT toolbox we constructed a hexamer frequency table and a logistic regression model with coding probability as the response variable. Both were built from coding and non-coding sequences obtained from the Xenopus laevis v-10.1 genomic assembly (GCF_017654675.1, [International Xenopus Sequencing Consortium 2021, April 12]). With the statsmodels-v0.14.4 package (Perktold et al. 2010) we used the coding probabilities to calculate an FDR and filter those entries with an FDR < 0.0001.
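As an illustration of what a hexamer frequency table captures, the following Python sketch counts 6-mers in coding and non-coding training sequences and scores a query by the mean log-ratio of the two frequencies. This is a simplified sketch under stated assumptions, not CPAT's code: the function names are hypothetical, the step sizes (in-frame for coding, every position for non-coding) are an assumption, and hexamers absent from either table are simply skipped.

```python
import math
from collections import Counter


def hexamer_freqs(seqs, step):
    """Relative frequency of each 6-mer, sampled every `step` bases."""
    counts = Counter()
    for seq in seqs:
        for i in range(0, len(seq) - 5, step):
            counts[seq[i:i + 6]] += 1
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def hexamer_score(seq, coding, noncoding):
    """Mean log-ratio of coding vs non-coding frequency over in-frame hexamers."""
    logs = []
    for i in range(0, len(seq) - 5, 3):
        word = seq[i:i + 6]
        c, n = coding.get(word, 0.0), noncoding.get(word, 0.0)
        if c > 0 and n > 0:  # simplification: ignore unseen hexamers
            logs.append(math.log(c / n))
    return sum(logs) / len(logs) if logs else 0.0


# Toy training sequences, for illustration only.
coding_table = hexamer_freqs(["ATGAAATTT"], step=3)
noncoding_table = hexamer_freqs(["ATGAAATTTCCC"], step=1)
print(hexamer_score("ATGAAA", coding_table, noncoding_table))
```

A positive score indicates hexamer usage closer to the coding training set, and a negative score indicates usage closer to the non-coding set.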

To eliminate potentially conserved protein domains, we translated the RNA sequences using the EMBOSS-v6.5.7 (Rice et al. 2000) package and aligned them to the Pfam-v38 (Mistry et al. 2021) repository using Hmmer-v3.4 (Eddy 2011) eliminating significant hits (e-value < 0.01).

We then aligned our sequences to a protein-coding reference transcriptome of D. arenicolor brain constructed in our workgroup (Hernández-Herrera and Pérez-Mendoza, in preparation, available at TSA under BioProject number PRJNA1356325), using MMseqs2 as explained previously. We eliminated all sequences with >95% identity and e-value < 0.01.

WGCNA

To identify potential biological roles of our de novo lncRNAs we first obtained a transcripts per million (TPM)-normalized global expression matrix across our samples using Salmon-v10.0.1 (Patro et al. 2017). We eliminated genes with low expression across all samples, retaining those with a mean TPM (µ) > 1, and selected the most variable genes (top 75% of genes by variance).
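The expression filter can be sketched in Python as follows. The sketch assumes that the criterion µ > 1 refers to the mean TPM of the genes that are kept; the function name and the toy expression values are hypothetical.

```python
from statistics import mean, variance


def filter_expression(tpm, min_mean=1.0, keep_fraction=0.75):
    """Keep genes with mean TPM > min_mean, then the top fraction by variance.

    `tpm` maps a gene identifier to its TPM values across samples.
    """
    expressed = {g: v for g, v in tpm.items() if mean(v) > min_mean}
    ranked = sorted(expressed, key=lambda g: variance(expressed[g]), reverse=True)
    n_keep = int(len(ranked) * keep_fraction)
    return ranked[:n_keep]


# Toy matrix with 3 samples per gene, for illustration only.
toy_tpm = {
    "g1": [10, 20, 30],
    "g2": [5, 5, 5],
    "g3": [2, 4, 6],
    "g4": [100, 100, 101],
    "g5": [0.1, 0.2, 0.3],  # low expression, removed first
}
print(filter_expression(toy_tpm))
```

Filtering on variance before network construction removes genes whose expression barely changes across samples and therefore carries little information about co-expression.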

We fed the expression matrix to the WGCNA-v1.73 (Langfelder and Horvath 2008) to identify potential expression modules. We assigned colors to the matrix and plotted the cluster dendrogram of highly correlated gene modules. For subsequent analyses, we selected only the modules that included any of our de novo lncRNAs. We took the UniProt identifier of the co-expressed protein-coding genes and fed them to the BioMart-v2.58.2 (Smedley et al. 2009) to obtain their respective ENTREZ IDs.

Finally, we used the lists of ENTREZ IDs of each module with clusterProfiler-v4.10.1 (Yu et al. 2012) to obtain the known functions of said proteins in Gene Ontology. We used a Bonferroni correction and considered results with an adj. P-value <0.01 as significant. For a comprehensive description of the bioinformatic workflow see Supplementary Fig. 4.

Results

Sequencing and transcriptome assembly

Following quality trimming and filtering (Supplementary Figs. 5 and 6), a total of approximately 412 million clean reads were retained, representing 87.1% of the original dataset. GC percentages across samples were around 45%, with curves that resemble a normal distribution. Mean quality scores by sample and by read exceeded Phred 35 in all reads. Adapter sequences, low-quality regions, and uncalled bases were effectively removed. Duplication and overrepresentation levels were comparable to those observed in the raw data.

Using the clean reads, we performed de novo transcriptome assembly for each pool, generating stage-specific reference assemblies and a global reference transcriptome for the complete dataset (Fig. 1a). In the individual assemblies no statistically significant differences were found among stages in number of genes (F = 1.488, P = 0.299) or transcripts (F = 1.554, P = 0.286). This trend persisted in the stage-specific reference transcriptomes (Fig. 1b), where the Climax stage showed the fewest genes (∼700 K) and transcripts (∼1.25 M), followed by the Adult stage with ∼1.4 M genes (∼2.2 M transcripts) and finally the Pre-metamorphic stage with around 1.6 M genes and ∼2.5 M transcripts. The global reference transcriptome had ∼2.2 M unique genes and close to 5 M total transcripts.

We evaluated the fragmentation levels of the individual transcriptomes (Fig. 1c) and reference transcriptomes (Fig. 1d). We tested for differences among the individual transcriptomes in mean contig size (F = 2.785, P = 0.139), contig median (F = 0.082, P = 0.923), and N50 (F = 2.883, P = 0.133) and found no statistically significant results. For the reference transcriptomes we found that the Climax stage had the highest contig mean (474.38 bp), median (324 bp), and N50 (497 bp); the Adult and Pre stages showed very similar values, and the Reference transcriptome had a mean contig size of 446 bp, median contig size of 329 bp and N50 of 441.25 bp.
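The reported F and P values can be cross-checked with a short Python sketch. It assumes the design described in the Methods (3 stages with 3 pools each), so that each 1-factor ANOVA has 2 and 6 degrees of freedom. When the numerator has 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form P = (1 + 2F/d2)^(−d2/2), where d2 is the denominator degrees of freedom. The function name is hypothetical.

```python
def anova_p_two_numerator_df(f_stat, df_within=6):
    """Upper-tail P value of an F statistic with 2 and df_within degrees of freedom."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# (F, reported P) pairs taken from the text above.
reported = {
    "genes": (1.488, 0.299),
    "transcripts": (1.554, 0.286),
    "mean contig": (2.785, 0.139),
    "median contig": (0.082, 0.923),
    "N50": (2.883, 0.133),
}
for name, (f_stat, p_reported) in reported.items():
    p_computed = anova_p_two_numerator_df(f_stat)
    print(f"{name}: computed P = {p_computed:.3f}, reported P = {p_reported}")
```

Under this assumption, each computed value agrees with the reported P value to within rounding.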

We measured the percentage of representation of the original reads in the assembled transcriptomes (Supplementary Fig. 7). All samples had at least 75% read representation in the final transcriptomes, with the lowest 2 values being the P1 and A1 samples, both with 75.62%; the rest of the samples showed representation levels of ∼90%. We tested for differences in the levels of read representation and found no significant result between stages (F = 1.12, P = 0.386).

We evaluated the Ex90N50 values (Supplementary Fig. 8) to assess transcript contiguity among highly expressed transcripts. The Climax, Adult, and global reference transcriptomes exhibited identical Ex90N50 curves. In contrast, the curve for the Pre-metamorphic stage showed greater variability among lowly expressed contigs. However, starting at the 40th percentile of expressed transcripts, its curve closely aligned with those of the other stages.

We assessed the completeness of our de novo transcriptomes using the Tetrapoda BUSCO dataset (Table 1). All transcriptomes showed high completeness, with at least 80% of BUSCO genes detected as complete. The global Reference transcriptome had the highest completeness at 86.7%. In the stage-specific assemblies, we observed consistent levels of single-copy genes (∼36%) and duplicated genes (∼45%), while the proportions of fragmented (<7%) and missing (∼13%) BUSCOs were low, with the global reference exhibiting the fewest missing genes.

Table 1.

BUSCO representation for de novo transcriptomes and the global reference transcriptome across Eukaryota, Vertebrata, and Tetrapoda datasets.

BUSCO dataset | Stage | Complete | Single | Duplicated | Fragmented | Missing
Eukaryota (N = 255) | Pre | 254 (99.6%) | 109 (42.7%) | 145 (56.9%) | 1 (0.4%) | 0
Eukaryota (N = 255) | Climax | 253 (99.2%) | 127 (49.8%) | 126 (49.4%) | 1 (0.4%) | 1 (0.4%)
Eukaryota (N = 255) | Adult | 253 (99.2%) | 127 (49.8%) | 126 (49.4%) | 1 (0.4%) | 1 (0.4%)
Eukaryota (N = 255) | Reference | 255 (100%) | 39 (15.3%) | 216 (84.7%) | 0 | 0
Vertebrata (N = 3354) | Pre | 2,916 (87%) | 1,240 (37%) | 1,676 (50%) | 236 (7%) | 202 (6%)
Vertebrata (N = 3354) | Climax | 2,959 (88.2%) | 1,402 (41.8%) | 1,557 (46.4%) | 184 (5.5%) | 211 (6.3%)
Vertebrata (N = 3354) | Adult | 2,920 (87.1%) | 1,251 (37.3%) | 1,669 (49.8%) | 216 (6.4%) | 218 (6.5%)
Vertebrata (N = 3354) | Reference | 3,076 (91.7%) | 562 (16.8%) | 2,514 (75%) | 142 (4.2%) | 136 (4.1%)
Tetrapoda (N = 5310) | Pre | 4,276 (80.5%) | 1,860 (35%) | 2,416 (45.5%) | 355 (6.7%) | 679 (12.8%)
Tetrapoda (N = 5310) | Climax | 4,327 (81.5%) | 2,011 (37.9%) | 2,316 (43.6%) | 289 (5.4%) | 694 (13.1%)
Tetrapoda (N = 5310) | Adult | 4,248 (80%) | 1,836 (34.6%) | 2,412 (45.4%) | 336 (6.3%) | 726 (13.7%)
Tetrapoda (N = 5310) | Reference | 4,603 (86.7%) | 936 (17.6%) | 3,667 (69.1%) | 223 (4.2%) | 484 (9.1%)

The table shows stage-specific representation of Complete, Single-Copy, Duplicated, Fragmented, and Missing BUSCO genes, reported as both total counts and percentages.
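The relationship between the BUSCO categories in Table 1 can be made explicit with a small Python sketch: Complete is the sum of Single and Duplicated, and Complete, Fragmented, and Missing together account for all N orthologs in the dataset. The counts below are the Tetrapoda rows of Table 1; the function name is hypothetical.

```python
TETRAPODA_N = 5310

# (single, duplicated, fragmented, missing) counts from Table 1, Tetrapoda rows.
tetrapoda = {
    "Pre": (1860, 2416, 355, 679),
    "Climax": (2011, 2316, 289, 694),
    "Adult": (1836, 2412, 336, 726),
    "Reference": (936, 3667, 223, 484),
}


def busco_complete(single, duplicated, fragmented, missing, n_total):
    """Return (complete count, complete %) after checking the categories sum to N."""
    if single + duplicated + fragmented + missing != n_total:
        raise ValueError("BUSCO categories do not sum to the dataset size")
    complete = single + duplicated
    return complete, round(100 * complete / n_total, 1)


for stage, counts in tetrapoda.items():
    print(stage, busco_complete(*counts, TETRAPODA_N))
```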

When assessed with the Vertebrata BUSCO dataset, completeness scores increased across all assemblies, with complete BUSCOs around 90%, reaching 91.7% in the global Reference. These results also reflected higher single-copy gene representation (∼40%) and consistently low levels of fragmented and missing genes (<7%). Using the Eukaryota dataset, all transcriptomes showed nearly complete BUSCO recovery, with over 99% completeness (100% in the Reference transcriptome).

Detection and distribution of annotated lncRNAs

Using LNCipedia, we identified a total of 4,557 known lncRNAs across all de novo assemblies (Fig. 2a). Of these, 1,329 were unique to the Pre-metamorphic stage, 392 were exclusive to the Climax stage, and 1,087 were specific to the Adult stage. Additionally, 628 lncRNAs were common to all 3 stages, while 708 were shared between the Pre and Adult stages, 208 between the Pre and Climax stages, and 205 between the Climax and Adult stages.

Fig. 2.


Annotated lncRNA identification in reference transcriptomes. a) Venn diagram of detected known lncRNAs in our dataset. We found 1,329 unique lncRNAs in the Pre stage, 392 in the Climax stage, and 1,087 in the Adult stage. Among the stages, we found 208 common lncRNAs between the Pre and Climax stages, 205 between the Climax and Adult stages, 708 between the Pre and Adult stages, and 628 across all stages. b) Gene enrichment analysis of lncRNAs in the Pre-metamorphic stage. We found significantly enriched processes such as macromolecule synthesis, metabolic involvement, and gene silencing. c) Gene enrichment analysis of lncRNAs in the Climax stage. We found significantly enriched post-transcriptional processes and heterochromatin formation. d) Gene enrichment analysis of lncRNAs in the Adult stage. We detected significantly enriched processes associated with ribonucleoprotein complexes and gene silencing mediated by the RISC complex.

We used g:Profiler to search for the biological roles where our lncRNAs could be involved. For the Pre-stage (Fig. 2b), we found lncRNAs associated with relevant biological roles, such as macromolecule synthesis, metabolic roles, and gene silencing. In the Climax stage (Fig. 2c), we obtained genes that regulate biosynthesis, post-transcriptional regulation of gene expression, heterochromatin formation, and processes associated with sexual differentiation. The Adult stage (Fig. 2d) showed enriched processes such as signal recognition, ribonucleoprotein complexes, and gene expression regulation via the RNA-induced silencing complex (RISC).

Discovery of novel lncRNAs

After sequential filtering (Supplementary Fig. 9), we identified 4,836 putative de novo lncRNAs (Fig. 3). Of these, 319 genes were unique to the Pre-metamorphic stage, 1,929 were specific to the Climax stage, and 400 to the Adult stage. We found 317 shared lncRNAs between the Pre and Climax stages, 631 common lncRNAs between the Climax and Adult stages, 92 lncRNAs in both the Pre and Adult stages, and 1,148 lncRNAs in all stages.
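The Venn counts reported for known and novel lncRNAs can be tallied with a short Python sketch, which also gives the total number of lncRNAs detected in each stage (unique plus shared). The region counts are taken from the text; the data layout and function names are hypothetical.

```python
# Each key is the set of stages sharing that region of the Venn diagram.
known = {
    frozenset({"Pre"}): 1329,
    frozenset({"Climax"}): 392,
    frozenset({"Adult"}): 1087,
    frozenset({"Pre", "Climax"}): 208,
    frozenset({"Climax", "Adult"}): 205,
    frozenset({"Pre", "Adult"}): 708,
    frozenset({"Pre", "Climax", "Adult"}): 628,
}
novel = {
    frozenset({"Pre"}): 319,
    frozenset({"Climax"}): 1929,
    frozenset({"Adult"}): 400,
    frozenset({"Pre", "Climax"}): 317,
    frozenset({"Climax", "Adult"}): 631,
    frozenset({"Pre", "Adult"}): 92,
    frozenset({"Pre", "Climax", "Adult"}): 1148,
}


def union_size(regions):
    """Total number of distinct lncRNAs across all Venn regions."""
    return sum(regions.values())


def stage_total(regions, stage):
    """Number of lncRNAs detected in one stage, unique or shared."""
    return sum(n for stages, n in regions.items() if stage in stages)


print(union_size(known), union_size(novel))
for stage in ("Pre", "Climax", "Adult"):
    print(stage, stage_total(known, stage), stage_total(novel, stage))
```

The region counts add up to the reported totals of 4,557 known and 4,836 putative novel lncRNAs.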

Fig. 3.


Putative lncRNAs identified de novo. Venn diagram of the unique and shared putative lncRNAs among stages. We found 319 unique lncRNAs in the Pre-metamorphic stage, 1,929 in the Climax stage, and 400 in the Adult stage. Shared among our groups, we found 317 between the Pre and Climax stages, 631 between the Climax and Adult stages, and 92 between the Pre and Adult stages. Among all stages, we found 1,148 common lncRNAs.

Co-expression network analysis and functional inference

To identify potential biological roles of these lncRNAs, we performed a Weighted Gene Co-expression Network Analysis (WGCNA) and obtained a total of 45,845 genes distributed across 122 modules (Supplementary Fig. 10). We retained only those modules that included new putative lncRNAs to perform gene enrichment analysis in Gene Ontology. We obtained a total of 32 lncRNAs across 18 modules.

We performed Gene Ontology enrichment analysis (Fig. 4) on the genes within each module. The green module was enriched for processes related to protein heterodimerization, protein-DNA complex assembly, nucleosome and telomere organization, and structural components of chromatin. The tan module showed enrichment for epigenetic regulation, while the turquoise module was associated with GTP binding.

Fig. 4.


Gene enrichment analysis of lncRNA-containing co-expression modules. Top 25 enriched GO terms by module. We found enriched processes such as protein heterodimerization, structural organization and composition of the chromatin, organization of telomeres and chromosomes, epigenetic regulation, and GTPase activity.

Discussion

We assembled de novo transcriptomes for 3 different life cycle stages of D. arenicolor. Our analysis included the detection of previously annotated lncRNAs, identification of novel putative lncRNAs, and coexpression network analyses to infer their potential biological roles. We identified putative de novo lncRNAs by eliminating sequences with coding potential, known protein domains, or identified as potentially coding in this species.

We quantified the size of both individual sample transcriptomes and the references we constructed per stage (Fig. 1). There is an apparent tendency for the Climax stage to have the smallest transcriptomes in both total genes and total transcripts; nevertheless, we found no significant differences between stages (Fig. 1, a and b). Similar sizes were reported for the brains of Bombina pachypus (Chiocchio et al. 2022) and Xenopus andrei (Pownall et al. 2018). Although our results suggest stable transcriptome sizes across stages, expanding sample sizes could reveal subtle differences across life cycle stages of D. arenicolor. To our knowledge, there are no reports of differences in transcriptome size among multiple stages of the life cycle in anurans. Instead, there are reports of genes differentially expressed between said stages that determine sex development (Tang et al. 2021), the development of the hindlimb muscle (Shu et al. 2021), and even tail reabsorption (Wang et al. 2022), to name a few.

Assembly quality metrics, including contig size and N50 values, showed no significant differences among stages. We evaluated different statistics from the resulting Trinity assemblies and the reference transcriptomes (Fig. 1, c and d). Although there is an apparent difference between stages in contig size and N50, the results were not statistically significant; equivalent results were obtained previously (Zhao et al. 2014; Smirnov et al. 2022). While absolute N50 values may seem modest, this is expected in de novo transcriptomes from non-model organisms, which often lack high-quality genomic references (O’Neil and Emrich 2013), and may be limited by insert size during library construction due to the use of 150 bp read length (Hara et al. 2015).

When we evaluated BUSCO completeness against the Eukaryota database (Table 1), we found almost full representation; against Vertebrata, representation levels were around 90%; and against Tetrapoda, all our datasets reached the 80% threshold. Considering that our transcriptomes come from a single tissue of a non-model organism and were constructed from a modest sample, we achieved robust assemblies. Comparable results were previously reported against Tetrapoda (Smirnov et al. 2022) and against the higher taxa references (Ospina et al. 2021; Chiocchio et al. 2022; Libro et al. 2022).

Alignment to LNCipedia revealed high numbers of known lncRNAs in the Pre-metamorphic and Adult stages, and a lower number in the Climax stage (Fig. 2). Conversely, most novel lncRNAs were detected during the Climax stage (Fig. 3). The high number of unique transcripts associated with the Climax stage likely reflects the extensive morphological and physiological remodeling occurring during this critical transition period (Miyata and Ose 2012; Paul et al. 2022), such as increased Wnt signaling activity (Qi et al. 2021), purine and pyrimidine metabolism, apoptosis of skin cells, and differentiation of the epidermis (Corrie et al. 2024). This pattern is consistent with findings in other metamorphosing species, including Sarcophaga peregrina (Shang et al. 2023), Drosophila (Chen et al. 2016), Ciona savignyi (Wei and Dong 2018), Bombyx mori (Fu et al. 2022), and the frog Microhyla fissipes (Liu et al. 2023).

Common lncRNAs across stages could be involved in housekeeping roles such as gene regulation, regulation of biosynthetic processes, and the structure of heterochromatin (Grammatikakis and Lal 2022; Herman et al. 2022). Alternatively, lncRNAs detected in only 1 stage could be involved in stage-specific biological processes, such as regulation of eye photoreceptor cell development in the Pre-metamorphic stage (Fig. 2b) and inactivation of the X chromosome in the Climax stage (Fig. 2c). Examples of both have been reported previously, such as lncRNA involvement in the correct development of eye photoreceptors in mouse (Chen et al. 2021; Zhang et al. 2024) and inactivation of the X chromosome through XIST (Lu et al. 2020; Aguilar et al. 2022).

After eliminating transcripts with coding characteristics, we identified just over 4,800 putative novel lncRNAs (Fig. 3), a result consistent with findings in Rana catesbeiana (Hammond et al. 2017). Considerably higher numbers have been reported in R. arenarum (around 50,000 lncRNAs; Ceschin et al. 2020) and Pelobates cultripes (approximately 80,000 lncRNAs; Liedtke et al. 2022); however, these studies were based on whole-genome assembly and whole-organism transcriptomes, respectively (Liedtke et al. 2019; Ceschin et al. 2020; Liedtke et al. 2022).

Notably, our results suggest that the majority of functionally relevant lncRNAs during this crucial developmental window remain uncharacterized, highlighting a key area for future research. Given that lncRNAs exhibit highly tissue-specific expression patterns (Jiang et al. 2016), expanding the range of sampled tissues or performing whole-genome sequencing would likely increase the number of detectable lncRNAs in D. arenicolor.

The co-expression modules that contain putative de novo lncRNAs showed enrichment in molecular functions relevant to lncRNAs (Fig. 4), including the spatial organization of chromatin at multiple levels such as nucleosomes, heterochromatin, telomeres, and entire chromosomes. LncRNAs are increasingly recognized as key modulators of chromatin architecture. For example, Suv39h1as, an antisense lncRNA, modulates its own locus (in cis), regulating pluripotency in mouse embryonic stem cells (Bernard et al. 2022), while HOTAIR alters gene expression during epithelial–mesenchymal transition by sequestering a lysine demethylase (Jarroux et al. 2021). There have also been reports of lncRNAs that regulate telomeric homeostasis, namely TERRA, and alterations in its expression pattern can lead to genomic instability (Oliva-Rico et al. 2022) and even cancer (Xu et al. 2024).

The main limitation of our work was our use of human annotations to identify known lncRNAs and predict their functions, as well as the enriched GO terms of the co-expression modules. This limitation emerges given the absence of detailed lncRNA reference datasets for amphibians and the limited available information about lncRNAs in the nearest phylogenetically related species (Xenopus). As currently there is not much evidence to validate their roles in non-model species, these results should be further investigated to elucidate their function in Anurans. This is the first de novo identification of lncRNAs for D. arenicolor. We provide evidence that both known and novel lncRNAs are expressed during brain development and metamorphosis and may participate in a range of essential regulatory functions.

Supplementary Material

jkaf283_Supplementary_Data

Acknowledgments

All authors have revised and approved the final version of the manuscript. This paper fulfils the requirements for a Doctoral degree at Posgrado en Ciencias Biológicas, UNAM, for H.H.-O. The authors would like to thank María Fernanda Vela Corona for her assistance in designing the bioinformatic workflow diagram. We thank the Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México, for support through grant IN219925 to H.A.P.-M., and DGAPA-UNAM-PASPA for supporting H.A.P.-M. We also thank the Secretaría de Ciencias, Humanidades, Tecnología e Innovación for supporting H.H.-O. during his PhD studies.

Contributor Information

Héctor Herrera-Orozco, Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, CDMX 04510, México; Facultad de Estudios Superiores Iztacala, Laboratorio de Ecología Evolutiva y Conservación de Anfibios y Reptiles, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Carolina Rodríguez-Ibarra, Facultad de Estudios Superiores Iztacala, Unidad de Investigación en Biomedicina, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Cristian Ivan Hernández-Herrera, Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, CDMX 04510, México; Facultad de Estudios Superiores Iztacala, Laboratorio de Ecología Evolutiva y Conservación de Anfibios y Reptiles, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Clara Estela Díaz-Velásquez, Facultad de Estudios Superiores Iztacala, Unidad de Investigación en Biomedicina, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México; Facultad de Estudios Superiores Iztacala, Laboratorio Nacional en Salud: Diagnóstico Molecular y Efecto Ambiental en Enfermedades Crónico-Degenerativas, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Felipe Vaca-Paniagua, Facultad de Estudios Superiores Iztacala, Unidad de Investigación en Biomedicina, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México; Facultad de Estudios Superiores Iztacala, Laboratorio Nacional en Salud: Diagnóstico Molecular y Efecto Ambiental en Enfermedades Crónico-Degenerativas, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Hibraim Adán Pérez-Mendoza, Facultad de Estudios Superiores Iztacala, Laboratorio de Ecología Evolutiva y Conservación de Anfibios y Reptiles, Universidad Nacional Autónoma de México, Tlalnepantla de Baz 54090, México.

Data availability

Both raw sequence files (SRA) and transcriptomes of individual samples and the global reference were deposited at NCBI under BioProject ID PRJNA1295574. The code used for this paper and a comprehensive list of software and versions can be found in the GitHub repository: https://github.com/DarkHe007/LncRNAs-in-Dryophytes-arenicolor/tree/main

Supplemental material available at G3 online.

Funding

This work was supported by Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica by Dirección General de Asuntos de Personal Académico, Universidad Nacional Autónoma de México (IN219925), and Programa de Apoyos para la Superación del Personal Académico DGAPA-UNAM, with a grant for HAP-M; Secretaría de Ciencias, Humanidades, Tecnología e Innovación (CVU: 1096935) with a doctoral degree scholarship to HH-O.

Author contributions

Héctor Herrera-Orozco and Hibraim Adán Pérez-Mendoza (Conceptualization, Writing—Review and editing), Héctor Herrera-Orozco, Carolina Rodríguez-Ibarra, and Cristian Ivan Hernández-Herrera (Methodology, Analysis and Investigation), Hibraim Adán Pérez-Mendoza and Felipe Vaca-Paniagua (Funding Acquisition and Resources), and Clara Estela Díaz-Velásquez (Supervision).

Ethics approval

All procedures involving the collection, transportation, and handling of animals were conducted under permits issued by the Dirección General de Vida Silvestre, Secretaría de Medio Ambiente y Recursos Naturales. All experimental procedures conducted in this study were reviewed and approved by the Ethical Committee of the Facultad de Estudios Superiores Iztacala, UNAM (Ce/FESI/122023/1545). Human ethics approval was not applicable.

Literature cited

  1. Aguilar R et al. 2022. Targeting Xist with compounds that disrupt RNA structure and X inactivation. Nature. 604:160–166. 10.1038/s41586-022-04537-z.
  2. Babraham Bioinformatics. 2019. FastQC: a quality control tool for high throughput sequence data. [accessed 2023 Feb 19]. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
  3. Bernard LD et al. 2022. OCT4 activates a Suv39h1-repressive antisense lncRNA to couple histone H3 Lysine 9 methylation to pluripotency. Nucleic Acids Res. 50:7367–7379. 10.1093/nar/gkac550.
  4. Birol I et al. 2015. De novo transcriptome assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis tadpole livers for comparative genomics without reference genomes. PLoS One. 10:e0130720. 10.1371/journal.pone.0130720.
  5. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 30:2114–2120. 10.1093/BIOINFORMATICS/BTU170.
  6. Ceschin DG et al. 2020. The Rhinella arenarum transcriptome: de novo assembly, annotation and gene prediction. Sci Rep. 10:1053. 10.1038/s41598-020-57961-4.
  7. Chen B et al. 2016. Genome-wide identification and developmental expression profiling of long noncoding RNAs during Drosophila metamorphosis. Sci Rep. 6:23330. 10.1038/srep23330.
  8. Chen G et al. 2021. Whole transcriptome sequencing identifies key circRNAs, lncRNAs, and miRNAs regulating neurogenesis in developing mouse retina. BMC Genomics. 22:779. 10.1186/s12864-021-08078-z.
  9. Chiocchio A et al. 2022. Brain de novo transcriptome assembly of a toad species showing polymorphic anti-predatory behavior. Sci Data. 9:619. 10.1038/s41597-022-01724-5.
  10. Choudhuri S. 2023. Long noncoding RNAs: biogenesis, regulation, function, and their emerging significance in toxicology. Toxicol Mech Methods. 33:541–551. 10.1080/15376516.2023.2197489.
  11. Corrie LM, Kuecks-Winger H, Ebrahimikondori H, Birol I, Helbing CC. 2024. Transcriptomic profiling of Rana [Lithobates] catesbeiana back skin during natural and thyroid hormone-induced metamorphosis under different temperature regimes with particular emphasis on innate immune system components. Comp Biochem Physiol Part D Genomics Proteomics. 50:101238. 10.1016/J.CBD.2024.101238.
  12. Croter BI. 2005. Amphibians and reptiles of New Mexico. Environ Conserv. 32:371–371. 10.1017/S037689290621292X.
  13. Deng P, Liu S, Nie X, Weining S, Wu L. 2018. Conservation analysis of long non-coding RNAs in plants. Sci China Life Sci. 61:190–198. 10.1007/s11427-017-9174-9.
  14. Djebali S et al. 2012. Landscape of transcription in human cells. Nature. 489:101–108. 10.1038/nature11233.
  15. do Amaral MCF, Frisbie J, Crum RJ, Goldstein DL, Krane CM. 2020. Hepatic transcriptome of the freeze-tolerant Cope’s gray treefrog, Dryophytes chrysoscelis: responses to cold acclimation and freezing. BMC Genomics. 21:226. 10.1186/s12864-020-6602-4.
  16. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol. 7:e1002195. 10.1371/journal.pcbi.1002195.
  17. Eriksson M, Rafajlović M. 2022. The role of phenotypic plasticity in the establishment of range margins. Philos Trans R Soc B Biol Sci. 377:20210012. 10.1098/rstb.2021.0012.
  18. Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 32:3047–3048. 10.1093/bioinformatics/btw354.
  19. Forouzmand E et al. 2017. Developmentally regulated long non-coding RNAs in Xenopus tropicalis. Dev Biol. 426:401–408. 10.1016/J.YDBIO.2016.06.016.
  20. Fox J, Weisberg S. 2018. An R companion to applied regression. 3rd ed. Sage.
  21. Fu Y et al. 2022. Long noncoding RNA lncR17454 regulates metamorphosis of silkworm through let-7 miRNA cluster. J Insect Sci. 22:12–13. 10.1093/jisesa/ieac028.
  22. Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 28:3150–3152. 10.1093/bioinformatics/bts565.
  23. García E. 2004. Modificaciones al Sistema de Clasificación Climática de Köppen. Universidad Autónoma de México Instituto de Geografia. [accessed 2025 Jan 19]. https://librosoa.unam.mx/handle/123456789/1372.
  24. Gosner KL. 1960. A simplified table for staging anuran embryos and Larvae with notes on identification. Herpetologica. 16:183–190. https://www.jstor.org/stable/3890061.
  25. Grammatikakis I, Lal A. 2022. Significance of lncRNA abundance to function. Mamm Genome. 33:271–280. 10.1007/s00335-021-09901-4.
  26. Gregory T. 2025. Animal Genome Size Database. [accessed 2025 Feb 26]. https://www.genomesize.com/index.php.
  27. Haas BJ et al. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 8:1494–1512. 10.1038/nprot.2013.084.
  28. Hammond SA et al. 2017. The North American bullfrog draft genome provides insight into hormonal regulation of long noncoding RNA. Nat Commun. 8:1433. 10.1038/s41467-017-01316-7.
  29. Hara Y et al. 2015. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation. BMC Genomics. 16:977. 10.1186/s12864-015-2007-1.
  30. Herman AB, Tsitsipatis D, Gorospe M. 2022. Integrated lncRNA function upon genomic and epigenomic regulation. Mol Cell. 82:2252–2266. 10.1016/j.molcel.2022.05.027.
  31. Hernández-Herrera CI, Pérez-Mendoza HA. 2021. Acoustic and morphological variation on two populations of Dryophytes arenicolor in Central México. Bioacoustics. 30:366–377. 10.1080/09524622.2020.1760937.
  32. Hernández-Herrera CI, Pérez-Mendoza HA, Fornoni J. 2024. Geographic variation in developmental plasticity among populations of the canyon treefrog in response to temperature and pond-drying. J Zool. 324:103–117. 10.1111/JZO.13202.
  33. International Xenopus Sequencing Consortium. 2021. NCBI RefSeq assembly (GCF_017654675.1). [accessed 2025 May 29]. https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_017654675.1/.
  34. Jafari-Raddani F et al. 2022. An overview of long noncoding RNAs: biology, functions, therapeutics, analysis methods, and bioinformatics tools. Cell Biochem Funct. 40:800–825. 10.1002/cbf.3748.
  35. Jarroux J et al. 2021. HOTAIR lncRNA promotes epithelial-mesenchymal transition by redistributing LSD1 at regulatory chromatin regions. EMBO Rep. 22:e50193. 10.15252/embr.202050193.
  36. Jiang C et al. 2016. Identifying and functionally characterizing tissue-specific and ubiquitously expressed human lncRNAs. Oncotarget. 7:7120–7133. 10.18632/oncotarget.6859.
  37. Kashyap A et al. 2020. Pan-tissue transcriptome analysis of long noncoding RNAs in the American beaver Castor canadensis. BMC Genomics. 21:153. 10.1186/s12864-019-6432-4.
  38. Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. 2020. Gprofiler2 -- an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res. 9:709. 10.12688/f1000research.24956.2.
  39. Kosch TA et al. 2024. The Amphibian Genomics Consortium: advancing genomic and genetic resources for amphibian research and conservation. BMC Genomics. 25:1025. 10.1186/s12864-024-10899-7.
  40. Kosch TA et al. 2025. Comparative analysis of amphibian genomes: an emerging resource for basic and applied research. Mol Ecol Resour. 25:e14025. 10.1111/1755-0998.14025.
  41. Lamichhaney S et al. 2021. A bird-like genome from a frog: mechanisms of genome size reduction in the ornate burrowing frog, Platyplectrum ornatum. Proc Natl Acad Sci U S A. 118:e2011649118. 10.1073/pnas.2011649118.
  42. Langfelder P, Horvath S. 2008. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 9:559. 10.1186/1471-2105-9-559.
  43. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods. 9:357–359. 10.1038/nmeth.1923.
  44. Larsson J, Gustafsson P. 2018. A case study in fitting area-proportional Euler diagrams with ellipses using eulerr. SetVR Conference 84–91. [accessed 2025 Oct 22].
  45. Libro P et al. 2022. First brain de novo transcriptome of the Tyrrhenian tree frog, Hyla sarda, for the study of dispersal behavior. Front Ecol Evol. 10:947186. 10.3389/fevo.2022.947186.
  46. Liedtke HC et al. 2019. De novo assembly and annotation of the larval transcriptome of two spadefoot toads widely divergent in developmental rate. G3 (Bethesda). 9:2647–2655. 10.1534/g3.119.400389.
  47. Liedtke HC et al. 2022. Chromosome-level assembly, annotation and phylome of Pelobates cultripes, the western spadefoot toad. DNA Res. 29:dsac013. 10.1093/dnares/dsac013.
  48. Liscano Martinez Y, Arenas Gómez CM, Smith J, Delgado JP. 2020. A tree frog (Boana pugnax) dataset of skin transcriptome for the identification of biomolecules with potential antimicrobial activities. Data Brief. 32:106084. 10.1016/j.dib.2020.106084.
  49. Liu L et al. 2023. Identification of thyroid hormone response genes in the remodeling of dorsal muscle during Microhyla fissipes metamorphosis. Front Endocrinol (Lausanne). 14:1099130. 10.3389/fendo.2023.1099130.
  50. Lu Z et al. 2020. Structural modularity of the XIST ribonucleoprotein complex. Nat Commun. 11:6163. 10.1038/s41467-020-20040-3.
  51. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 38:4647–4654. 10.1093/molbev/msab199.
  52. Mistry J et al. 2021. Pfam: the protein families database in 2021. Nucleic Acids Res. 49:D412–D419. 10.1093/NAR/GKAA913.
  53. Miyata K, Ose K. 2012. Thyroid hormone-disrupting effects and the amphibian metamorphosis assay. J Toxicol Pathol. 25:1–9. 10.1293/tox.25.1.
  54. Motheramgari K et al. 2020. Expanding the Chinese hamster ovary cell long noncoding RNA transcriptome using RNASeq. Biotechnol Bioeng. 117:3224–3231. 10.1002/bit.27467.
  55. Nemeth K, Bayraktar R, Ferracin M, Calin GA. 2024. Non-coding RNAs in disease: from mechanisms to therapeutics. Nat Rev Genet. 25:211–232. 10.1038/s41576-023-00662-1.
  56. Oliva-Rico D et al. 2022. Methylation of subtelomeric chromatin modifies the expression of the lncRNA TERRA, disturbing telomere homeostasis. Int J Mol Sci. 23:3271. 10.3390/ijms23063271.
  57. O’Neil ST, Emrich SJ. 2013. Assessing de novo transcriptome assembly metrics for consistency and utility. BMC Genomics. 14:465. 10.1186/1471-2164-14-465.
  58. Ospina OE et al. 2021. Neurogenomic divergence during speciation by reinforcement of mating behaviors in chorus frogs (Pseudacris). BMC Genomics. 22:711. 10.1186/s12864-021-07995-3.
  59. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. 2017. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 14:417–419. 10.1038/NMETH.4197.
  60. Paul B, Sterner ZR, Buchholz DR, Shi Y-B, Sachs LM. 2022. Thyroid and corticosteroid signaling in amphibian metamorphosis. Cells. 11:1595. 10.3390/cells11101595.
  61. Perktold J, Seabold S, Taylor J; statsmodels-developers. 2010. 9th Python in Science Conference. [accessed 2025 Feb 12]. https://www.statsmodels.org/devel/.
  62. Pownall ME, Cutler RR, Saha MS. 2018. Transcriptome of Xenopus andrei, an octoploid frog, during embryonic development. Data Brief. 19:501–505. 10.1016/j.dib.2018.05.017.
  63. Preest MR, Brust DG, Wygoda ML. 1992. Cutaneous water loss and the effects of temperature and hydration state on aerobic metabolism of Canyon treefrogs, Hyla arenicolor. Herpetologica. 48:210–219. https://www.jstor.org/stable/3892674.
  64. Qi X et al. 2021. Comprehensive analysis of differences of N6-methyladenosine of lncRNAs between atrazine-induced and normal Xenopus laevis testis. Genes Environ. 43:49. 10.1186/s41021-021-00223-0.
  65. R Core Team. 2023. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
  66. Raj S, Sifuentes CJ, Kyono Y, Denver RJ. 2023. Metamorphic gene regulation programs in Xenopus tropicalis tadpole brain. PLoS One. 18:e0287858. 10.1371/JOURNAL.PONE.0287858.
  67. Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16:276–277. 10.1016/s0168-9525(00)02024-2.
  68. Sai L et al. 2019. Analysis of long non-coding RNA involved in atrazine-induced testicular degeneration of Xenopus laevis. Environ Toxicol. 34:505–512. 10.1002/tox.22704.
  69. Shang Y et al. 2023. Genome-wide analysis of long noncoding RNAs and their association in regulating the metamorphosis of the Sarcophaga peregrina (Diptera: Sarcophagidae). PLoS Negl Trop Dis. 17:e0011411. 10.1371/journal.pntd.0011411.
  70. Shu Y et al. 2021. Dynamic transcriptome and histomorphology analysis of developmental traits of hindlimb thigh muscle from Odorrana tormota and its adaptability to different life history stages. BMC Genomics. 22:369. 10.1186/s12864-021-07677-0.
  71. Smedley D et al. 2009. BioMart–biological queries made easy. BMC Genomics. 10:22. 10.1186/1471-2164-10-22.
  72. Smirnov DN et al. 2022. De novo assembly and analysis of the transcriptome of the Siberian wood frog Rana amurensis. Vavilovskii Zhurnal Genet Selektsii. 26:109–116. 10.18699/VJGB-22-07.
  73. Stamatoyannopoulos JA et al. 2012. An encyclopedia of mouse DNA elements (Mouse ENCODE). Genome Biol. 13:418. 10.1186/gb-2012-13-8-418.
  74. Steinegger M, Söding J. 2017. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol. 35:1026–1028. 10.1038/nbt.3988.
  75. Tang Y, Chen J-Y, Ding G-H, Lin Z-H. 2021. Analyzing the gonadal transcriptome of the frog Hoplobatrachus rugulosus to identify genes involved in sex development. BMC Genomics. 22:552. 10.1186/s12864-021-07879-6.
  76. Toden S, Zumwalt TJ, Goel A. 2021. Non-coding RNAs and potential therapeutic targeting in cancer. Biochim Biophys Acta Rev Cancer. 1875:188491. 10.1016/j.bbcan.2020.188491.
  77. Volders P-J et al. 2019. LNCipedia 5: towards a reference set of human long non-coding RNAs. Nucleic Acids Res. 47:D135–D139. 10.1093/nar/gky1031.
  78. Wang L et al. 2013. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res. 41:e74–e74. 10.1093/nar/gkt006.
  79. Wang S et al. 2019. Gene expression program underlying tail resorption during thyroid hormone-dependent metamorphosis of the ornamented pygmy frog Microhyla fissipes. Front Endocrinol (Lausanne). 10:11. 10.3389/fendo.2019.00011.
  80. Wang H, Liu Y, Chai L, Hongyuan W. 2022. Morphology and molecular mechanisms of tail resorption during metamorphosis in Rana chensinensis tadpole (Anura: Ranidae). Comp Biochem Physiol Part D Genomics Proteomics. 41:100945. 10.1016/j.cbd.2021.100945.
  81. Weghorst F, Torres Marcén M, Faridi G, Lee YCG, Cramer KS. 2024. Deep conservation and unexpected evolutionary history of neighboring lncRNAs MALAT1 and NEAT1. J Mol Evol. 92:30–41. 10.1007/s00239-023-10151-y.
  82. Wei J, Dong B. 2018. Identification and expression analysis of long noncoding RNAs in embryogenesis and larval metamorphosis of Ciona savignyi. Mar Genomics. 40:64–72. 10.1016/j.margen.2018.05.001.
  83. Weisman CM, Murray AW, Eddy SR. 2020. Many, but not all, lineage-specific genes can be explained by homology detection failure. PLoS Biol. 18:e3000862. 10.1371/journal.pbio.3000862.
  84. Wickham H et al. 2019. Welcome to the Tidyverse. J Open Source Softw. 4:1686. 10.21105/joss.01686.
  85. Xu M et al. 2024. TERRA-LSD1 phase separation promotes R-loop formation for telomere maintenance in ALT cancer cells. Nat Commun. 15:2165. 10.1038/s41467-024-46509-z.
  86. Yan H, Bu P. 2021. Non-coding RNA in cancer. Essays Biochem. 65:625–639. 10.1042/EBC20200032.
  87. Yu G, Wang L-G, Han Y, He Q-Y. 2012. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 16:284–287. 10.1089/omi.2011.0118.
  88. Zhang L et al. 2024. Integrative transcriptomic profiling of ncRNAs and mRNAs in developing mouse lens. Front Genet. 15:1405715. 10.3389/fgene.2024.1405715.
  89. Zhang Y, Yang M, Yang S, Hong F. 2022. Role of noncoding RNAs and untranslated regions in cancer: a review. Medicine (Baltimore). 101:e30045. 10.1097/MD.0000000000030045.
  90. Zhao F et al. 2014. Comprehensive transcriptome profiling and functional analysis of the frog (Bombina maxima) immune system. DNA Res. 21:1–13. 10.1093/dnares/dst035.
  91. Zweifel RG. 1926–1968. Reproductive biology of anurans of the arid Southwest, with emphasis on adaptation of embryos to temperature. Bull Am Mus Nat Hist. 140:1–64.


Articles from G3: Genes | Genomes | Genetics are provided here courtesy of Oxford University Press
