RNA Biology. 2019 Aug 25;17(11):1674–1679. doi: 10.1080/15476286.2019.1657744

tRic: a user-friendly data portal to explore the expression landscape of tRNAs in human cancers

Zhao Zhang a,#, Hang Ruan a,#, Chun-Jie Liu b,#, Youqiong Ye a, Jing Gong a, Lixia Diao c, An-Yuan Guo b, Leng Han a,d
PMCID: PMC7567503  PMID: 31432762

ABSTRACT

Transfer RNAs (tRNAs) play critical roles in human cancer. Currently, no database provides the expression landscape and clinical relevance of tRNAs across a variety of human cancers. Utilizing miRNA-seq data from The Cancer Genome Atlas, we quantified the relative expression of tRNA genes and merged them to the codon level and amino acid level across 31 cancer types. The expression of tRNAs is associated with clinical features, including patient smoking history, overall survival, disease stage, subtype, and grade. We further analysed codon frequency and amino acid frequency for each protein-coding gene and linked alterations of tRNA expression with protein translational efficiency. We include these data resources in a user-friendly data portal, tRic (tRNA in cancer, https://hanlab.uth.edu/tRic/ or http://bioinfo.life.hust.edu.cn/tRic/), which can be of significant interest to the research community.

KEYWORDS: tRNA, codon, amino acid, cancer, codon usage

Introduction

Transfer RNAs (tRNAs) play critical roles in protein translation by delivering amino acids to initiate and elongate peptide chains [1]. Transcription of tRNAs is mediated by RNA polymerase III, and aberrant tRNA expression contributes to disease [2,3]. For example, overexpression of tRNAiMet(CAT), the initiator tRNA that recognizes the methionine start codon, can enhance global protein synthesis and increase endoplasmic reticulum stress to promote the development of diabetes [4]. Decreased expression of tRNAGln(CTG) promotes progression of Huntington’s disease in the early stage by increasing the frequency of translational frameshifting [5]. In human cancers, enhanced tRNA expression drives mRNA translation and cell growth [6]. For example, expression of tRNAArg in breast cancer is positively correlated with codon frequency in oncogenic signatures, suggesting that tRNAArg overexpression may increase the translational efficiency of these oncogenes [7–9]. Up-regulation of tRNAGlu(TTC) optimizes EXOSC2 expression to promote metastatic progression of tumours [10].

The Cancer Genome Atlas (TCGA) project generated multi-omic data for more than 10,000 patient samples, including exome-seq, RNA-seq, miRNA-seq, and DNA methylation data [11]. It also collected clinical features, including disease stage, patient age, and overall survival. These rich data provide valuable opportunities to understand transcriptomic events and oncogenic pathways [12–16]. Several databases have been developed to help the biomedical research community use this large-scale dataset. For example, cBioPortal provides a web resource for exploring, visualizing, and analysing cancer genomic data, especially for protein-coding genes [17,18]. The Cancer Proteome Atlas includes expression data for ~200 proteins in > 8,000 tumour samples [19]. PancanQTL was developed to explore both cis- and trans-expression quantitative trait loci (eQTLs) across 33 cancer types [20]. Several other databases focus on non-coding RNAs. For example, The Atlas of Non-coding RNA In Cancer focuses on the functions and clinical relevance of long non-coding RNAs [21], while SnoRNA In Cancer focuses on the expression landscape and clinical relevance of small nucleolar RNAs [22]. However, there is still no tRNA database in cancer, likely due to the technical difficulty of estimating tRNA expression levels accurately from high-throughput sequencing data [23]. Recent studies used miRNA-seq to quantify the relative expression level of tRNAs in multiple organisms, including E. coli, yeast, and humans [24–32]. In particular, we previously used a similar computational pipeline to quantify the relative expression levels of tRNAs from TCGA [33]. We further built a user-friendly database, tRNA In Cancer (tRic), the first comprehensive database for tRNAs in cancer, which can significantly benefit cancer research.

Results and discussion

Data preparation

We collected clinical information, including stage, grade, subtype, patient survival, and smoking history, from ~10,000 patients across 31 human cancers (Figure 1a). We obtained miRNA-seq files for these samples and quantified their expression profiles at the tRNA, codon, and amino acid levels as described in our previous study (Materials and methods; Figure 1b) [33]. We also calculated the codon frequency and amino acid frequency for each coding gene in the human genome (Figure 1b). These datasets were deposited in our database.

Figure 1.

Data processing and web design of tRic. a. Summary of clinical information across 31 human cancer types in tRic. Full names of the cancer types are listed in Table 1. b. Data collection and processing of the tRic dataset, including miRNA-seq, tRNA annotation, and human coding sequences (CDS). QC denotes quality control. c. Interface and infrastructure of tRic.

Database infrastructure

The web interface is built with HTML, CSS, and JavaScript, together with libraries such as Bootstrap and jQuery. The backend of the data portal is based on R and data manipulation libraries, such as the tidyverse. The Django web framework connects the backend and frontend of the database (Figure 1c). Users can browse or query items of interest on the web pages. We established two mirrored sites for tRic, at https://hanlab.uth.edu/tRic/ and http://bioinfo.life.hust.edu.cn/tRic/. We will continue to support the database for possible updates.

Functional modules and examples

tRic has four functional modules: tRNA level, codon level, amino acid level, and codon usage (Figure 2a). In the ‘tRNA level’ module, users can query the expression level of tRNAs in a specific cancer type and/or subgroup. tRic returns the expression level of tRNAs and, if the cancer type has more than 5 paired samples, the tRNAs differentially expressed between tumour and normal samples. For example, tRNA-His-GTG-1-9 is differentially expressed between tumour and normal samples in LUAD (Figure 2b). Users can also perform comprehensive analyses of tRNAs associated with clinical features. For example, tRNA-Arg-TCG-5-1 is associated with patient survival in KIRC (Figure 2c). Expression at the tRNA level was merged to the codon level and the amino acid level, and the ‘codon level’ and ‘amino acid level’ modules provide query functions similar to those of the ‘tRNA level’ module. For example, at the codon level, tRNAArg(CGT) is differentially expressed among KIRC stages (Figure 2d) and tRNAArg(AGA) is differentially expressed among BRCA subtypes (Figure 2e). At the amino acid level, tRNAGlu is differentially expressed among patients with different smoking histories in LUSC (Figure 2f), and tRNALeu is differentially expressed among LIHC grades (Figure 2g).
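The paired-sample rule for tumour versus normal comparisons can be illustrated with a minimal sketch. It assumes that samples are identified by TCGA barcodes, in which the first three hyphen-separated fields identify the patient and the two-digit sample-type code in the fourth field marks tumour (01–09) or normal (10–19) tissue. The function names and the barcode-based pairing are illustrative assumptions, not a description of how tRic is implemented.

```python
from collections import defaultdict

MIN_PAIRED = 5  # comparisons are reported only above this number of pairs


def count_paired_patients(barcodes):
    """Count patients with at least one tumour and one normal sample."""
    kinds = defaultdict(set)
    for barcode in barcodes:
        fields = barcode.split("-")
        patient = "-".join(fields[:3])   # e.g. TCGA-AA-0001 (hypothetical)
        code = int(fields[3][:2])        # sample-type code, e.g. 01 or 11
        if 1 <= code <= 9:
            kinds[patient].add("tumour")
        elif 10 <= code <= 19:
            kinds[patient].add("normal")
    return sum(1 for k in kinds.values() if k == {"tumour", "normal"})


def can_compare_tumour_normal(barcodes, min_paired=MIN_PAIRED):
    """Apply the 'more than 5 paired samples' rule (strictly greater)."""
    return count_paired_patients(barcodes) > min_paired
```

With this rule, a cancer type with exactly five matched tumour and normal pairs would not receive a differential expression result, whereas one with six would.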

Figure 2.

Overview of the tRic database. a. Four modules in tRic: expression of tRNAs at the tRNA level, codon level, and amino acid level, respectively, as well as codon usage. b. Differentially expressed tRNAs between tumour and normal samples. c. Expression of tRNA associated with patient survival. d. Differentially expressed codons among different stages. e. Differentially expressed codons among different subtypes. f. Differentially expressed amino acids among patients with different smoking histories. g. Differentially expressed amino acids among different tumour grades. h. Amino acid frequency of human SRSF2 gene.

tRNAs play important roles in translation by initiating and elongating peptides [1]. Therefore, alterations in tRNA expression may affect translational efficiency. The ‘codon usage’ module aims to pinpoint potential effects of tRNA expression on protein translation. Users can search a protein-coding gene for its codon frequency and amino acid frequency. For example, the Arg frequency in SRSF2 (23.8%) is much higher than the average genomic level (5.5%), suggesting that tRNAArg overexpression may increase the translation of SRSF2 (Figure 2h). Users can also search for genes with a high frequency of specific codons or amino acids.

Data download

Expression data at the tRNA, codon, and amino acid levels, as well as the codon and amino acid frequencies for all protein-coding genes, are available on the tRic download pages (https://hanlab.uth.edu/tRic/download/ or http://bioinfo.life.hust.edu.cn/tRic/download/).

Conclusion

We have developed the first comprehensive database for tRNA expression in more than 10,000 samples across 31 cancer types. We provide the tRNA expression profile and the differential expression between tumour and normal samples and among different groups of samples (e.g., subtypes, stages) at the tRNA, codon, and amino acid levels. We also provide the codon frequency and amino acid frequency for all protein-coding genes in the human genome, which may unveil potential connections between tRNA expression and codon usage bias in gene translation. Our database will provide the biomedical research community with insights into the functions of tRNAs in cancer.

Materials and methods

Clinical information for TCGA samples

The clinical information of TCGA samples was obtained from the TCGA data portal (https://portal.gdc.cancer.gov/). Clinical information for each cancer type, including stage, grade, subtype, patient survival, and smoking history, is summarized in Figure 1a.

Quantification of tRNAs

We downloaded and processed 16,591 miRNA-seq datasets from the TCGA data portal (https://portal.gdc.cancer.gov/) as we previously described [22]. In brief, we filtered out duplicated samples and low-quality samples, defined as those in which fewer than 50% of reads passed quality control or fewer than 80% of reads were mapped. After quality control, 10,594 samples, comprising 9,931 tumour samples and 663 normal samples, were included in our study (Table 1, Figure 1b, left panel).
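The filtering step can be sketched as follows. This is a minimal illustration under stated assumptions: the record keys ('sample_id', 'qc_pass_fraction', 'mapped_fraction') are hypothetical, and the source does not say how duplicated samples were resolved, so the sketch simply keeps the first record it encounters for each sample.

```python
def filter_samples(samples, min_qc_pass=0.50, min_mapped=0.80):
    """Drop duplicated and low-quality samples.

    `samples` is an iterable of dicts with hypothetical keys:
    'sample_id', 'qc_pass_fraction', and 'mapped_fraction'.
    """
    seen = set()
    kept = []
    for s in samples:
        if s["sample_id"] in seen:
            continue  # assumption: only the first record of a duplicate is considered
        seen.add(s["sample_id"])
        # Low quality: QC-passed reads < 50% or mapped rate < 80%.
        if s["qc_pass_fraction"] < min_qc_pass or s["mapped_fraction"] < min_mapped:
            continue
        kept.append(s)
    return kept
```

Samples sitting exactly on a threshold (50% or 80%) are retained here, because the text describes exclusion with strict "less than" comparisons.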

Table 1.

Summary of tRic data for each cancer type.

Abbreviation | Cancer type | No. of tumour samples | No. of normal samples
ACC | Adrenocortical carcinoma | 80 | 0
BLCA | Bladder urothelial carcinoma | 397 | 16
BRCA | Breast invasive carcinoma | 1077 | 104
CESC | Cervical squamous cell carcinoma and endocervical adenocarcinoma | 295 | 3
CHOL | Cholangiocarcinoma | 36 | 9
COAD | Colon adenocarcinoma | 433 | 1
DLBC | Lymphoid neoplasm diffuse large B-cell lymphoma | 47 | 0
ESCA | Oesophageal carcinoma | 184 | 11
HNSC | Head and neck squamous cell carcinoma | 523 | 44
KICH | Kidney chromophobe | 66 | 25
KIRC | Kidney renal clear cell carcinoma | 516 | 71
KIRP | Kidney renal papillary cell carcinoma | 290 | 34
LGG | Brain lower grade glioma | 512 | 0
LIHC | Liver hepatocellular carcinoma | 372 | 50
LUAD | Lung adenocarcinoma | 513 | 46
LUSC | Lung squamous cell carcinoma | 476 | 45
MESO | Mesothelioma | 87 | 0
OV | Ovarian serous cystadenocarcinoma | 466 | 0
PAAD | Pancreatic adenocarcinoma | 178 | 4
PCPG | Pheochromocytoma and paraganglioma | 179 | 3
READ | Rectum adenocarcinoma | 160 | 0
PRAD | Prostate adenocarcinoma | 483 | 52
SARC | Sarcoma | 246 | 0
SKCM | Skin cutaneous melanoma | 447 | 2
STAD | Stomach adenocarcinoma | 409 | 37
TGCT | Testicular germ cell tumours | 150 | 0
THCA | Thyroid carcinoma | 510 | 71
THYM | Thymoma | 124 | 2
UCEC | Uterine corpus endometrial carcinoma | 538 | 33
UCS | Uterine carcinosarcoma | 57 | 0
UVM | Uveal melanoma | 80 | 0
Total | | 9931 | 663

We quantified tRNA expression levels as previously described [33]. In brief, we downloaded tRNA annotations from the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/) and filtered out those without clear anticodon and amino acid information. In total, we collected 604 tRNAs decoding 52 anticodons (codons) and 21 amino acids. We then mapped TCGA miRNA-seq reads to the tRNA annotations and normalized tRNA expression using the trimmed mean of M values (TMM) method [34,35]. We defined tRNAs with relatively high expression values (average TMM > 1) as detectable tRNAs. These tRNAs were categorized into 52 codon groups and 21 amino acid groups according to their codon and amino acid information (Figure 1b, middle panel).
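The detectability filter and the grouping into codon and amino acid levels can be sketched as follows. The sketch assumes that tRNA names follow the pattern used elsewhere in this article (for example, tRNA-Arg-TCG-5-1, where the second field is the amino acid and the third is the anticodon), that expression values are already TMM-normalized, and that merging means summing expression within a group; the source does not state the aggregation function, so the summation is an assumption, as are all function names.

```python
from collections import defaultdict
from statistics import mean


def parse_trna_name(name):
    """Split a name such as 'tRNA-Arg-TCG-5-1' into (amino_acid, anticodon)."""
    parts = name.split("-")
    return parts[1], parts[2]


def detectable(expr, threshold=1.0):
    """Keep tRNAs whose average TMM across samples exceeds the threshold."""
    return {name: vals for name, vals in expr.items() if mean(vals) > threshold}


def merge_levels(expr):
    """Sum per-sample tRNA expression within anticodon and amino acid groups.

    Assumption: groups are merged by summation.
    """
    if not expr:
        return {}, {}
    n = len(next(iter(expr.values())))  # number of samples
    codon = defaultdict(lambda: [0.0] * n)
    amino = defaultdict(lambda: [0.0] * n)
    for name, vals in expr.items():
        aa, anticodon = parse_trna_name(name)
        for i, v in enumerate(vals):
            codon[(aa, anticodon)][i] += v
            amino[aa][i] += v
    return dict(codon), dict(amino)
```

Here `expr` maps each tRNA name to a list of per-sample values, so `merge_levels(detectable(expr))` yields one expression vector per anticodon group and one per amino acid.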

Estimation of codon frequency and amino acid frequency

The human coding sequences with complete open reading frames were downloaded from the Ensembl database (www.ensembl.org/). For each coding gene, we estimated the frequency of each codon and each amino acid based on the sequence information. At the codon level, we counted the total number of codons in the gene (N) and the number of occurrences of each specific codon (n). The codon frequency is calculated as n divided by N. We used a similar approach to calculate the amino acid frequency (Figure 1b, right panel).
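As a minimal sketch of this calculation, the functions below compute the codon frequency as the count of a specific codon (n) divided by the total number of codons in the coding sequence (N), and the amino acid frequency in the same way after translation with the standard genetic code. Whether the stop codon is counted in N is not stated in the source; this sketch counts it for codon frequencies and excludes it for amino acid frequencies, which is an assumption. The function names are illustrative.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split a coding sequence into complete, in-frame codons."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def codon_frequency(cds):
    """Frequency of each codon: n / N."""
    codons = split_codons(cds)
    total = len(codons)  # N
    return {codon: n / total for codon, n in Counter(codons).items()}


def amino_acid_frequency(cds):
    """Frequency of each amino acid (one-letter code), stop codons excluded."""
    residues = [CODON_TABLE.get(c, "*") for c in split_codons(cds)]
    residues = [r for r in residues if r != "*"]
    total = len(residues)
    return {aa: n / total for aa, n in Counter(residues).items()}
```

For the toy sequence ATGCGTCGTTAA (Met, Arg, Arg, stop), the codon CGT has a frequency of 2/4 and arginine has a frequency of 2/3.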

Statistical analyses

All statistical tests were performed using R. We used Student’s t-test to examine differential expression between tumour and normal samples. Analysis of variance (ANOVA) was used to identify tRNAs differentially expressed among different stages, subtypes, grades, and smoking history groups. A univariate Cox model was used to test whether tRNA expression correlated with patient survival.
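The analyses themselves were run in R; purely to illustrate the quantities behind the first two tests, the sketch below computes the two-sample Student's t statistic and the one-way ANOVA F statistic in plain Python. It assumes the pooled-variance (equal variance) form of the t-test, which the source does not specify, and it returns only the test statistics and degrees of freedom; converting them to p-values requires the t and F distributions from a statistics library, and the Cox model is not covered. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def anova_f(groups):
    """One-way ANOVA F statistic with (between, within) degrees of freedom."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, (df_between, df_within)
```

With exactly two groups, the F statistic equals the square of the pooled-variance t statistic, which gives a quick consistency check between the two functions.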

Acknowledgments

This work was supported by the Cancer Prevention & Research Institute of Texas (RR150085) to CPRIT Scholar in Cancer Research (L.H.); UTHealth Innovation for Cancer Prevention Research Training Program Post-doctoral Fellowship (Cancer Prevention and Research Institute of Texas, RP160015); China Postdoctoral Science Foundation (2019M652623 to C-J. Liu); National Natural Science Foundation of China (31822030 and 31771458 to A-Y. Guo). We gratefully acknowledge contributions from TCGA Research Network. We thank LeeAnn Chastain for editorial assistance.

Funding Statement

This work was supported by the Cancer Prevention & Research Institute of Texas (RR150085) to CPRIT Scholar in Cancer Research.

Authors’ contributions

L.H. conceived and supervised the project. Z.Z., Y.Y., C-J.L., H.R., J.G., L.D., A-Y.G., and L.H. performed the analyses. Z.Z., H.R., C-J.L., and A-Y.G. developed the database. Z.Z., H.R., L.D., and L.H. wrote the manuscript with input from all other authors.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed here

Articles from RNA Biology are provided here courtesy of Taylor & Francis
