RNA Biology. 2021 Jul 21;18(Suppl 1):430–438. doi: 10.1080/15476286.2021.1952759

Hairpin sequence and structure is associated with features of isomiR biogenesis

Anton Zhiyanov 1, Stepan Nersisyan 1,✉, Alexander Tonevitsky 1
PMCID: PMC8677047  PMID: 34286662

ABSTRACT

MiRNA isoforms (isomiRs) are single-stranded small RNAs originating from the same pri-miRNA hairpin as a result of cleavage by the Drosha and Dicer enzymes. Variations at the 5ʹ-end of a miRNA alter the seed region of the molecule, thus affecting the targetome of the miRNA. In this manuscript, we analysed the distribution of miRNA cleavage positions across 31 different cancers using miRNA sequencing data from the TCGA project. We found that the processing positions are not tissue specific and that all miRNAs could be classified as exhibiting either homogeneous or heterogeneous cleavage at each of the four cleavage sites. In 42% of cases (42 out of 100 miRNAs), we observed imprecise 5ʹ-end Dicer cleavage, while this fraction was only 14% for Drosha (14 out of 99). In contrast, almost all cleavage sites at 3ʹ-ends (either Drosha or Dicer) were heterogeneous. Using only four nucleotides surrounding a 5ʹ-end Dicer cleavage position, we built a model which allowed us to distinguish between homogeneous and heterogeneous cleavage with reliable quality (ROC AUC = 0.68). Finally, we showed a possible application of the study by analysing two 5ʹ-end isoforms originating from the same exogenous shRNA hairpin. It turned out that the less expressed shRNA variant was functionally active, which led to increased off-targeting. Thus, the obtained results could be applied to the design of shRNAs whose processing results in a single 5ʹ-variant.

KEYWORDS: isomiR, miRNA, shRNA, biogenesis, Drosha, Dicer, TCGA

Introduction

A microRNA (miRNA) is a short non-coding RNA consisting of 22 nucleotides on average. MiRNAs play an important role in the regulation of gene expression via the mechanism of RNA interference. Namely, miRNA molecules bind to mRNA targets, which leads to mRNA degradation or translation repression [1]. Alterations in the functional activity of miRNAs make major contributions to the pathogenesis of multiple diseases including, for example, cancer [2,3] and viral infections [4,5].

Recent research has shown that miRNAs are present in cells in several forms called isomiRs, which differ from each other by 1–3 nucleotides at the 5ʹ- and 3ʹ-ends of the molecule [6]. One of the key points is that different isomiRs of the same miRNA may have completely different target genes if the seed region of the molecule (nucleotides 2–7 counting from the 5ʹ-end) is affected [7–9]. Thus, 5ʹ-isomiRs could be considered as distinct miRNAs with their own sets of targets [10]. It is also important to note that in a large number of cases the canonical miRNA form is not the most expressed isomiR [11].

Mature isomiRs are produced from pri-miRNA hairpins as a result of Drosha and Dicer enzymatic cleavage [12]: Drosha cleaves out the flanking segments of a stem loop in the cell nucleus, while Dicer cuts off the hairpin’s loop in the cytoplasm (Fig. 1). Uncertainty in the position of the cut leads to diversity of miRNA isoforms. It is still unclear which mechanisms and rules are responsible for precise or imprecise cleavage. For example, Starega-Roslan et al. showed that some sequence features near the Drosha and Dicer cleavage positions affect whether a particular hairpin will have a heterogeneous cleavage profile [13]. In another manuscript it was shown that the loop position of a pre-miRNA affects the accuracy of Dicer cleavage [14]. However, to the best of our knowledge, the tissue specificity of the proposed models has not been assessed.

Figure 1.

The scheme of miRNA processing. Drosha cleaves out the flanking segments of the hairpin: 5ʹ-end of 5ʹ arm (A) and 3ʹ-end of 3ʹ arm (C); Dicer cleaves out the loop: 3ʹ-end of 5ʹ arm (B) and 5ʹ-end of 3ʹ arm (D)

In this work we performed a pan-cancer bioinformatics analysis to determine whether the main Drosha/Dicer cleavage position is tissue specific, and whether the repertoire of isomiRs of a particular miRNA depends on cancer type. We also developed a statistical model which allows one to distinguish between cleavage sites which lead to only one mature miRNA (homogeneous cleavage) and those which lead to multiple ones (heterogeneous cleavage). To assess whether a particular shRNA could exhibit heterogeneous Dicer cleavage, we performed miRNA sequencing of MDA-MB-231 cell lines transduced with two different shRNAs. Possible miRNA-like off-target effects were studied using bioinformatics predictions and microarray transcriptomics analysis.

Materials and methods

Pan-cancer isomiR expression data processing

Processed miRNA-seq data (read count tables after mapping and miRNA annotation) of primary tumours from 31 TCGA projects were downloaded from the GDC Data Portal (https://portal.gdc.cancer.gov) in the form of *.isoforms.quantification.txt files. For each cancer type, we used the edgeR v3.30.3 package [15] to convert raw read count tables to TMM-normalized reads per million mapped reads (TMM-RPM) matrices; the default noise filtering procedure was used.

For each TCGA project we extracted the top 20% most expressed isomiRs according to their median expression levels; only these isomiRs were used for the downstream analysis. Then, the most expressed 5ʹ- and 3ʹ-isomiRs (the main isomiRs) were identified for each miRNA in each cancer type. The shift of the main isomiR from the canonical miRBase form was calculated in the 5ʹ- to 3ʹ-direction for both miRNA arms (either −5p or −3p). Aside from determining shifts, the fraction of reads matching the main isomiR was also calculated.
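The identification of a main isomiR can be illustrated with a minimal sketch. It assumes that read counts for one end of one miRNA have already been aggregated by shift relative to the miRBase form; the function name `main_isomir` and the counts are hypothetical, and the sketch does not reproduce the per-sample median used in the actual pipeline.

```python
def main_isomir(reads_by_shift):
    """Return (shift, fraction) for the most expressed variant.

    reads_by_shift maps a cleavage shift (0 = canonical miRBase position)
    to the number of reads supporting that shift.
    """
    total = sum(reads_by_shift.values())
    shift = max(reads_by_shift, key=reads_by_shift.get)
    return shift, reads_by_shift[shift] / total


# Hypothetical counts for the 5'-end of a single miRNA
example_counts = {-1: 50, 0: 900, 1: 50}
shift, fraction = main_isomir(example_counts)
print(shift, fraction)
```

Here the main isomiR is the canonical one (shift 0), and it covers 90% of the reads mapped to the miRNA.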

Motif analysis

Sequences of pri-miRNA hairpins and mature miRNAs were extracted from miRBase v21 [16] (TCGA-matched version). Statistical comparisons of motif frequencies between classes of homogeneous and heterogeneous cleavage were done with Fisher’s exact test. To account for multiple hypothesis testing we adjusted p values using the Benjamini-Hochberg method [17]. Finally, to depict the differences graphically we used the logomaker Python module [18].
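The multiple testing adjustment can be sketched in a few lines. The following is a plain-Python illustration of the Benjamini-Hochberg step-up procedure, not the code used in the study; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of p-values sorted in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four motif comparisons
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each adjusted value is the smallest false discovery rate at which the corresponding motif would still be called significant.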

Principal component analysis

Nucleotide sequences near the main 5ʹ-end Dicer cleavage sites were embedded into numerical space using one-hot encoding. Then, principal component analysis (PCA) was applied to reduce dimensionality of the data. Number of principal components (PCs) was determined in such a way that each PC explained at least 5% of data variance. Mann-Whitney’s U-test and area under the receiver operating characteristic curve (ROC AUC) were used to measure how well the obtained PCs differentiate between homogeneous and heterogeneous cleavage. To find out statistical significance of the resulting AUCs, we estimated p-values using N = 10⁴ random label permutations.

ShRNA transduction and miRNA sequencing

MDA-MB-231 cells were previously transduced with shRNAs against ELOVL5 (shELOVL5) and luciferase (shLuc) genes [19]. Mature shRNA sequences were encoded in 3ʹ arms of the following hairpins:

shELOVL5: 5ʹ GATCCGCGGAAGGATTGAAGTCAATTCAAGAGATTGACTTCAATCCTTCCGCTTTTTTACGCGTG 3ʹ,

shLuc: 5ʹ GATCCGTGCGTTGCTAGTACCAACTTCAAGAGAGTTGGTACTAGCAACGCACTTTTTTACGCGTG 3ʹ.

Libraries for miRNA sequencing were prepared from total RNA samples using NEBNext Multiplex Small RNA Library Prep Kit for Illumina. Each sample was sequenced on the Illumina NextSeq 550 to generate single-end 50 nucleotide reads. Quality of FASTQ files was assessed with FastQC v0.11.9 (Babraham Bioinformatics, Cambridge, UK). Adapters were trimmed with cutadapt v2.10 [20]. MiRNA count matrix was generated by miRDeep2 v2.0.1.2 [21] with the use of bowtie v1.1.1 [22] read mapper and GENCODE release 34 human genome [23].

ShRNAs off-target prediction

MiRNA-like off-targets of shRNAs were predicted using miRDB v6.0 [24]. Target predictions were filtered according to their Target Scores; the threshold value was set to 80 as recommended by the authors of the tool. Downregulation of predicted target genes (according to the microarray transcriptomics data) was detected with Student’s t test as described in the previous publication [19].

Data and code availability

All source codes have been made available on GitHub (https://github.com/s-a-nersisyan/isomiR_biogenesis). Raw FASTQ files have been deposited in the Gene Expression Omnibus (GEO) under GSE173709 accession. Other relevant data are available within the manuscript and its supplemental material files.

Results

Drosha/Dicer main cleavage sites are stable across cancer types

We explored the landscape of isomiR expression in 31 cancer types available in TCGA using miRNA sequencing data. For each cancer type we composed a list of highly expressed miRNAs, and for each highly expressed miRNA we identified 5ʹ- and 3ʹ-shifts from the canonical miRBase form corresponding to the most expressed 5ʹ-isomiR and 3ʹ-isomiR (main 5ʹ- and 3ʹ-isomiRs). Additionally, for each miRNA we calculated the median fraction of reads corresponding to the main isomiR from the set of all reads mapped to the miRNA (median was calculated over all samples from a particular TCGA project). Thus, the main Drosha/Dicer cleavage positions and corresponding expression fractions were identified for all miRNAs expressed in various cancers.

First, we asked whether a particular miRNA can have different main cleavage positions in two different cancers. It turned out that generally the answer is no, especially for 5ʹ-ends (Fig. 2, Supplementary Table 1). Namely, 196 out of 199 miRNAs had the same main cleavage position at the 5ʹ-end in all TCGA projects. The exceptions were hsa-miR-140-3p, hsa-miR-183-5p and hsa-miR-192-5p: each of them had matching shifts in all except one cancer type. Moreover, in the majority of cases (187 miRNAs, 94%) the main cleavage position matched the canonical one from miRBase. Similarly, the most frequent 3ʹ-cleavage position was highly stable among different cancers; however, this position did not match the miRBase canonical annotation for 59 miRNAs (30%).

Figure 2.

Distribution of main isomiR’s cleavage position shifts from the canonical miRBase form across different cancers. The value in a cell is equal to x if the specified cleavage shift is observed in a fraction x of the TCGA projects in which the corresponding miRNA is expressed. As in Fig. 1, (A) stands for 5ʹ-end of 5ʹ arm, (B) for 3ʹ-end of 5ʹ arm, (C) for 3ʹ-end of 3ʹ arm and (D) for 5ʹ-end of 3ʹ arm

Information about homogeneity or heterogeneity of miRNA cleavage patterns is independent of cancer type

Analysing the fraction of expression covered by the main isomiR of a miRNA, we separated miRNAs into two classes: if more than 95% of expression is concentrated in the main isomiR, we say that the corresponding miRNA exhibits homogeneous cleavage at the particular terminus (5ʹ- or 3ʹ-) in the particular cancer; otherwise, the cleavage profile is said to be heterogeneous. Interestingly, homogeneity or heterogeneity of cleavage profiles was independent of cancer type (Fig. 3, Supplementary Table 2). In other words, if miRNA cleavage was homogeneous in one TCGA project, it was highly likely to be homogeneous in all projects where the miRNA was expressed (and vice versa). Given that fact, we hypothesized that homogeneity or heterogeneity of cleavage patterns could be due to recognition of specific nucleotide and structural features in pri-/pre-miRNA hairpins by Drosha and Dicer.
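The 95% rule described above can be written down as a short sketch. The function names and the per-project fractions are hypothetical illustrations; only the threshold and the class labels come from the text.

```python
THRESHOLD = 0.95  # fraction of expression covered by the main isomiR


def cleavage_class(main_fraction, threshold=THRESHOLD):
    """Label one cleavage site in one cancer type."""
    return "homogeneous" if main_fraction > threshold else "heterogeneous"


def same_class_everywhere(fraction_by_project):
    """True if the site gets the same label in every project."""
    labels = {cleavage_class(f) for f in fraction_by_project.values()}
    return len(labels) == 1


# Hypothetical main-isomiR fractions for one miRNA end in three projects
fractions = {"TCGA-BRCA": 0.98, "TCGA-LUAD": 0.97, "TCGA-COAD": 0.99}
print({p: cleavage_class(f) for p, f in fractions.items()})
print(same_class_everywhere(fractions))
```

A site whose fractions straddle the threshold across projects would make `same_class_everywhere` return `False`; the text reports that such cases were rare.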

Figure 3.

Distribution of expression fractions covered by the main isomiRs. Empty cells correspond to miRNAs which are not expressed in the corresponding TCGA project (i.e. whose median expression levels are below the top-20% cut-off value). As in Fig. 1, (A) stands for 5ʹ-end of 5ʹ arm, (B) for 3ʹ-end of 5ʹ arm, (C) for 3ʹ-end of 3ʹ arm and (D) for 5ʹ-end of 3ʹ arm

Interestingly, the number of miRNAs having two or more highly expressed 5ʹ-isomiRs (i.e. ones with heterogeneous 5ʹ-cleavage patterns) was significantly higher among 3ʹ arms: 42 out of 100 versus 14 out of 99 for 5ʹ arms (Fisher’s exact test p = 1.58 × 10⁻⁵). Thus, imprecise 5ʹ-cleavage is more typical for Dicer than for Drosha. For 3ʹ-ends, the situation was the opposite: almost all miRNAs had heterogeneous 3ʹ-cleavage (94/100 for 3ʹ arms and 94/99 for 5ʹ arms). Since the seed region of a miRNA is located at the 5ʹ-end of the molecule, heterogeneous 3ʹ-cleavage should not significantly affect the miRNA targetome.
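The comparison of 42 out of 100 with 14 out of 99 can be reproduced approximately with a plain-Python two-sided Fisher's exact test. This is an illustrative sketch rather than the code used in the study; the function name is hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is a common convention assumed here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = pmf(a)
    # Sum probabilities of all tables at least as extreme as the observed one
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))


# Heterogeneous vs homogeneous 5'-cleavage: 3' arms (Dicer) and 5' arms (Drosha)
p = fisher_exact_two_sided(42, 100 - 42, 14, 99 - 14)
print(f"p = {p:.2e}")
```

The rows of the table are the two arms and the columns are the heterogeneous and homogeneous counts.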

Local hairpin sequence features and motifs affect homogeneous and heterogeneous Drosha/Dicer cleavage

As the next step of our analysis, we tested whether homogeneous or heterogeneous Drosha/Dicer cleavage depends on the local miRNA hairpin sequence. For that, we extracted 8 nucleotides surrounding the 5ʹ-/3ʹ- main cleavage sites of each miRNA (4 nucleotides to the left and 4 to the right of the cleavage position), and performed comparative frequency analysis of homogeneous/heterogeneous cleavage patterns for each of the four possible cleavage positions. For this step, mature miRNAs arising from more than one hairpin were discarded, since in the case of heterogeneous cleavage we do not know which exact hairpin led to imprecise processing by Drosha/Dicer.

Significant differences in nucleotide frequencies were observed when comparing miRNAs with homogeneous and heterogeneous Dicer cleavage (Fig. 4). For the 3ʹ-end of the 5ʹ arm a clear pattern was identified: all miRNAs with homogeneous Dicer cleavage (hsa-miR-20a-5p, hsa-miR-20b-5p, hsa-miR-93-5p and hsa-miR-144-5p) had the AG**U pattern near the cleavage position (* stands for one arbitrary nucleotide), while only 2 out of 72 entries (hsa-miR-151a-5p, hsa-miR-424-5p) were found in the case of heterogeneous cleavage (Fisher’s exact test adjusted p = 0.0233). For the second Dicer cleavage position (5ʹ-end of 3ʹ arm), the frequency of uracil preceding a cleavage site was near zero in homogeneous cleavage (2 out of 48 miRNAs), while it accounted for about a third of miRNAs with heterogeneous Dicer cleavage (10 out of 36). This difference was statistically significant according to Fisher’s exact test (p = 3.42 × 10⁻³), though the significance was lost after multiple testing correction (adjusted p = 0.109).

Figure 4.

Sequence motifs near the main Drosha/Dicer cleavage sites. Four nucleotides surrounding the main cleavage position are shown. In each subfigure (A, B, C, D), the upper sequence logo corresponds to homogeneous cleavage and the bottom one to heterogeneous cleavage. As in Fig. 1, (A) stands for 5ʹ-end of 5ʹ arm, (B) for 3ʹ-end of 5ʹ arm, (C) for 3ʹ-end of 3ʹ arm and (D) for 5ʹ-end of 3ʹ arm

To account for pri-miRNA secondary structure, experiments were performed with gaps introduced into sequences. Analysis of presence of specific motifs revealed that adenine followed by a single bulge was uncharacteristic for homogeneous 5ʹ-end Dicer cleavage: not a single homogeneous site contained this motif as opposed to 7 out of 36 heterogeneous miRNAs (adjusted p=0.0461).

Only one feature was detected for Drosha cleavage sites (3ʹ-end of 3ʹ arm). Namely, the CU*U pattern was present in 5 out of 6 homogeneous miRNAs and only in 3 out of 68 heterogeneous ones (adjusted p = 0.0302). The analysis performed did not allow us to find statistically significant differences between homogeneous/heterogeneous Drosha cleavage sites at the 5ʹ-end of the 5ʹ arm.

Analysis of local hairpin sequence allows one to distinguish between homogeneous and heterogeneous 5ʹ-end Dicer cleavage with reliable quality

Dicer cleavage of 5ʹ-ends of 3ʹ miRNA arms was the only case for which we observed many examples of both heterogeneous (36 miRNAs) and homogeneous (48 miRNAs) cleavage patterns: the majority of Drosha cleavage sites at 5ʹ-ends of 5ʹ arms were homogeneous, and almost all cleavage sites at 3ʹ-ends (either Drosha or Dicer) were heterogeneous. While in the previous section we presented some local sequence features specific for heterogeneous cleavage, these results were insufficient to build a classifier which will distinguish between heterogeneous and homogeneous 5ʹ-end Dicer cleavage with a reliable quality.

To tackle this problem, we mapped four nucleotides surrounding 5ʹ-end main cleavage site of 3ʹ arm miRNAs into numerical space using one-hot encoding. Specifically, adenine, uracil, guanine, cytosine and gap were mapped to vectors (1, 0, 0, 0, 0), (0, 1, 0, 0, 0), (0, 0, 1, 0, 0), (0, 0, 0, 1, 0) and (0, 0, 0, 0, 1), respectively. For example, sequence ‘UAG-’ was mapped to the vector (0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1). Then, principal component analysis (PCA) was used to reduce dimensionality of the data.
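The encoding described above is easy to check with a short sketch. The alphabet order follows the text (adenine, uracil, guanine, cytosine, gap); the function name `one_hot` is a hypothetical choice.

```python
ALPHABET = "AUGC-"  # order of the unit vectors as listed in the text


def one_hot(sequence):
    """Concatenate one 5-dimensional unit vector per character."""
    vector = []
    for char in sequence:
        unit = [0] * len(ALPHABET)
        unit[ALPHABET.index(char)] = 1
        vector.extend(unit)
    return vector


print(one_hot("UAG-"))
```

A four-character window therefore becomes a 20-dimensional binary vector, which is the input to PCA.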

It turned out that one of the principal components (PC4) had significantly higher values in the group of 3ʹ miRNA arms with heterogeneous cleavage at 5ʹ-ends compared to homogeneous ones (Mann-Whitney’s U-test p = 2.14 × 10⁻³, Fig. 5A). Moreover, the obtained score separated two groups with the area under the receiver operating characteristic curve (ROC AUC) equal to 0.68 (permutation test p = 3.80 × 10⁻³, Fig. 5B). Thus, unsupervised PCA analysis captured differences between homogeneous and heterogeneous cleavage sites using only four nucleotides surrounding the main cleavage position. Contribution of different nucleotides at each position to the PC4 is illustrated in Fig. 5C.
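How a single score can be evaluated with ROC AUC and a label permutation test is sketched below in plain Python. The scores and labels are hypothetical, the function names are illustrative, and the one-sided p-value estimate with a +1 correction is an assumption; the exact estimator used in the study is not stated in the text.

```python
import random


def roc_auc(scores, labels):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_perm=10_000, seed=0):
    """Estimate how often shuffled labels reach the observed AUC."""
    rng = random.Random(seed)
    observed = roc_auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if roc_auc(scores, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical PC scores; label 1 = heterogeneous, 0 = homogeneous
scores = [0.9, 0.4, 0.7, 0.2, 0.1, 0.5, 0.3, 0.8]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
auc, p = permutation_p_value(scores, labels, n_perm=2000)
print(auc, p)
```

The pairwise AUC definition used here is equivalent to the normalized Mann-Whitney U statistic, which is why the two measures in the text tell a consistent story.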

Figure 5.

Principal component analysis allows one to distinguish between homogeneous and heterogeneous 5ʹ-end Dicer cleavage. (A) Distribution of PC4 scores in classes of homogeneous and heterogeneous cleavage. (B) Receiver operating characteristic (ROC) curve of PC4 score. (C) Contribution of different features to PC4. Positive weights are associated with homogeneous cleavage

Heterogeneous 5ʹ-end Dicer cleavage of shRNAs

Short hairpin RNAs, which are commonly used for specific gene silencing, follow the same Dicer processing as endogenous miRNAs. Given that, we hypothesized that heterogeneous Dicer cleavage could lead to multiple 5ʹ-shRNA isoforms. While the existence of such variants should not substantially affect perfect pairing with the target gene, it could lead to increased diversity of miRNA-like off-targeting. In order to study the possibility of such events, we performed miRNA sequencing of MDA-MB-231 cells transduced with shRNAs against ELOVL5 (shELOVL5) and luciferase (shLuc) genes, which we previously studied [19].

It turned out that two 5ʹ-variants were present for shELOVL5: they accounted for 84% (variant 1) and 16% (variant 2) of all sequencing reads (6644 RPM in total); variant 1 matched the pre-designed mature sequence (Fig. 6). In contrast, there was only one 5ʹ-variant of shLuc, accounting for about 99.6% of reads (total expression 17,098 RPM), and this variant did not match the desired mature sequence (a single guanine was missing at the 5ʹ-end, Fig. 6). While the loop sequences of both hairpins were exactly the same, the nucleotide sequences near the main cleavage positions were different: GAUU for shELOVL5 and AGUU for shLuc. According to Fig. 4C, GA motif preceding the main cleavage site matched a homogeneous cleavage pattern, which agreed with the observed miRNA-seq data.
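The effect of a one-nucleotide 5ʹ-shift on the seed region can be illustrated with the shLuc hairpin from the Methods. The sketch below rests on our reading of that hairpin: the loop is taken to be TTCAAGAGA and the 3ʹ arm to be the 19 nucleotides that follow it. Both values, as well as the function names, are assumptions made for illustration.

```python
LOOP = "TTCAAGAGA"  # loop sequence as read from the hairpin (assumption)
ARM_LENGTH = 19     # length of the 3' arm under that reading (assumption)
SHLUC_HAIRPIN = (
    "GATCCGTGCGTTGCTAGTACCAACTTCAAGAGAGTTGGTACTAGCAACGCACTTTTTTACGCGTG"
)


def three_prime_arm(hairpin_dna):
    """Extract the 3' arm following the loop and transcribe it to RNA."""
    start = hairpin_dna.index(LOOP) + len(LOOP)
    return hairpin_dna[start:start + ARM_LENGTH].replace("T", "U")


def seed(mature_rna):
    """Seed region: nucleotides 2-7 counting from the 5'-end."""
    return mature_rna[1:7]


designed = three_prime_arm(SHLUC_HAIRPIN)
observed = designed[1:]  # the 5'-terminal guanine is missing
print(designed, seed(designed))
print(observed, seed(observed))
```

Under these assumptions the designed and observed molecules have different seeds even though they differ by a single nucleotide, which is why 5ʹ-variants can have different miRNA-like targets.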

Figure 6.

Cleavage of shELOVL5 and shLuc hairpins. Green colour stands for pre-designed mature sequences (reverse-complements of the desired mRNA target regions), blue colour stands for seed regions

Since the seed regions of the two shELOVL5 variants were different, we hypothesized that they could have different miRNA-like targetomes. In order to validate this hypothesis, we predicted the target genes of the two shELOVL5 variants using the miRDB tool. The sets of 157 and 190 targets shared only 44 common entries (Fig. 7A). Then, using microarray gene expression data, we assessed how many of the predicted target mRNAs were downregulated upon shELOVL5 transduction. Surprisingly, the expression levels of seven targets of shELOVL5 variant 2 were decreased, which was a statistically significant fraction of the predictions (hypergeometric test p = 1.37 × 10⁻³). In contrast, only one of the predicted targets of shELOVL5 variant 1 was downregulated (Fig. 7B), which was not statistically significant (p = 0.491). These observations highlight the possible off-target effects related to miRNA-like activity of one of the shRNA isoforms.
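The hypergeometric enrichment test used here can be sketched as an upper-tail probability. Only the numbers 190 (predicted targets) and 7 (downregulated targets) come from the text; the background size and the total number of downregulated genes are not given in this section, so the values below are hypothetical placeholders, and the resulting p-value is not expected to match the reported one.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


N_GENES = 10_000      # hypothetical number of genes on the microarray
N_DOWN = 100          # hypothetical number of downregulated genes
N_PREDICTED = 190     # predicted targets of variant 2 (from the text)
N_OVERLAP = 7         # downregulated predicted targets (from the text)

p = hypergeom_upper_tail(N_GENES, N_DOWN, N_PREDICTED, N_OVERLAP)
print(f"p = {p:.2e}")
```

The test asks how surprising it is to find at least seven downregulated genes among the predicted targets if those targets were a random sample of the background.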

Figure 7.

MiRNA-like off-targets of two shELOVL5 isoforms. (A) Numbers of predicted targets of two shELOVL5 variants. (B) Downregulation of shELOVL5 variant 1 and variant 2 target genes on mRNA level. LRRC40 is the only downregulated target of variant 1, seven other genes are targets of variant 2

Discussion

In this manuscript we studied the distribution of isomiRs in multiple TCGA miRNA-seq datasets. For each miRNA arm the main (i.e. the most expressed) isomiR was identified. Based on pan-cancer analysis we concluded that the main Drosha/Dicer cleavage sites were generally not tissue specific, especially for 5ʹ-ends. Moreover, the expression fraction covered by the main isomiR was also stable across different cancer types. Based on these observations, we divided the four processing positions of each pri-miRNA into two groups: those whose cleavage results in a single major isomiR (homogeneous cleavage) and those whose cleavage results in at least two highly expressed variants (heterogeneous cleavage). The overwhelming majority of miRNAs gave rise to various 3ʹ-isomiRs, while only 28% of miRNAs had variability at 5ʹ-ends. These results are consistent with the existing literature [11,25,26], though the percentage of 5ʹ-variants varies from study to study due to different counting and filtering strategies. For example, Loher et al. reported that only 9% of isomiRs had modified 5ʹ-ends compared to the reference miRBase entry [11], while this number was 5–15% according to Tan et al. [25]. Note that this quantity differs from the one (28%) we reported (the number of miRNAs with heterogeneous 5ʹ-end cleavage). According to our data, 9–17% of all isomiRs had 5ʹ-end modifications, which fully agrees with the literature. Interestingly, Dicer cleavage of 5ʹ-ends of 3ʹ arms led to a significantly higher number of variants compared to the more accurate Drosha 5ʹ-end cleavage. This phenomenon was already mentioned by Starega-Roslan et al. [13].

In a previous study, Telonis et al. analysed the distribution of isomiR expression in the same TCGA samples [27]. Based on information about the presence or absence of isomiRs in a sample, the authors built a machine learning model which discriminated different cancer types with very high accuracy. Differential isomiR expression was also found when comparing primary tumours with adjacent normal tissues. While these results indicate high tissue/disease specificity of isomiR expression, our results suggest a more conserved expression pattern among isomiRs originating from the same miRNA arm, including the main variant and cleavage homogeneity/heterogeneity. Telonis et al. implicitly demonstrated this effect: the quality of the cancer type classification remained high when binarized miRNA expression was used instead of isomiR data (i.e. an isomiR could be substituted with its parental miRNA without an essential loss of information).

Comparison of sequences and structures of pri-miRNA hairpins from the classes of homogeneous and heterogeneous cleavage revealed several differentially present motifs at sites of Drosha and Dicer processing. While the same strategy was already employed by Starega-Roslan et al. [13], they studied miRNAs expressed in only one cell line (HEK293T). In contrast, we used miRNA expression data collected from 31 TCGA projects representing different human tissues. We also used much more stringent thresholds for selecting the most abundant isomiRs to overcome possible issues caused by noise associated with lowly expressed transcripts [28]. Further PCA analysis allowed us to formally quantify the difference between the classes of homogeneous and heterogeneous Dicer cleavage at the 5ʹ-end of 3ʹ miRNA arms (ROC AUC = 0.68). While application of supervised machine learning instead of an unsupervised approach could improve the obtained quality, the existing sample size is insufficient to split the list of miRNAs into training and validation sets. Moreover, miRNA-specific mechanisms of maturation and processing also exist. For example, it was recently shown that each miRNA has a spectrum of associated RNA-binding proteins which could regulate its biogenesis [29,30]. Another example of miRNA-specific biogenesis regulation was given by Kim et al.: uridylation of pre-miR-324 results in alternative Dicer cleavage and consequent arm switching [31].

Besides their theoretical value, the obtained results could be applied to the biogenesis of exogenous shRNAs [32,33], which are still widely used for gene silencing. Usually, sequences of shRNAs are specifically designed so that no highly complementary off-target genes exist [34–36]. However, multiple reports suggest that seed region miRNA-like base pairing is a major driver of off-target effects [37–41]. Given that, we hypothesized that commonly used shRNAs with 3ʹ guide arms [42] could undergo heterogeneous Dicer cleavage at the 5ʹ-end. With the use of miRNA sequencing, we showed an example of such an shRNA targeting the ELOVL5 gene. Nucleotide sequences of this hairpin and the homogeneous control shRNA agreed with the previously described PCA model. Next, we performed bioinformatics prediction of possible miRNA-like off-targets of the two shRNA variants followed by microarray transcriptomics data analysis. Despite the high false positive rate associated with computational miRNA target prediction [43–45], a statistically significant fraction of predicted targets of the less expressed shRNA variant was downregulated upon shRNA transduction (7 out of 190 targets, p = 1.37 × 10⁻³). Though each of the presented interactions should be independently validated (e.g. with luciferase reporter assays), the observed data suggest that the unexpected shRNA isoform contributed to miRNA-like off-targeting. Further research into such effects is warranted, since off-target effects could have severe implications in both fundamental studies and clinical applications. For example, shRNA-mediated knockdowns followed by high-throughput transcriptomics/proteomics analyses are widely used to study functions and pathways associated with a particular gene, and off-targets could add much noise to the data and activate pathways which are not related to the gene of interest [46]. Off-target effects could also lead to unwanted toxicity, which is one of the main challenges in the clinical applications of RNA interference [39,47].

Acknowledgements

The research was performed within the framework of the Laboratory of Molecular Physiology at HSE University. The authors thank Dr Maxim Shkurnikov for valuable comments and discussions.

Funding Statement

This research was performed within the framework of the Laboratory of Molecular Physiology at HSE University.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed here.

References

  • [1].Nilsen TW. Mechanisms of microRNA-mediated gene regulation in animal cells. Trends Genet. 2007;23(5):243–249. [DOI] [PubMed] [Google Scholar]
  • [2].Visone R, Croce CM. MiRNAs and cancer. Am J Pathol. 2009;174(4):1131–1138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].Nersisyan S, Galatenko A, Galatenko V, et al. miRGTF-net: integrative miRNA-gene-TF network analysis reveals key drivers of breast cancer recurrence. PLoS One. 2021;16(4):e0249424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [4].Shkurnikov MY, Nersisyan SA, Osepyan AS, et al. Differences in the drosha and dicer cleavage profiles in colorectal cancer and normal colon tissue samples. Dokl Biochem Biophys. 2020;493(1):208–210. [DOI] [PubMed] [Google Scholar]
  • [5].Nersisyan S, Engibaryan N, Gorbonos A, et al. Potential role of cellular miRNAs in coronavirus-host interplay. PeerJ. 2020;8:e9994. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [6].Morin RD, O’Connor MD, Griffith M, et al. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 2008;18(4):610–621. . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].van der Kwast RVCT, Woudenberg T, Quax PHA, et al. MicroRNA-411 and Its 5′-IsomiR have distinct targets and functions and are differentially regulated in the vasculature under ischemia. Mol Ther [Internet]. 2020; 28(1):157–170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Mercey O, Popa A, Cavard A, et al. Characterizing isomiR variants within the micro RNA −34/449 family. FEBS Lett. 2017;591(5):693–705. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [9].Salem O, Erdem N, Jung J, et al. The highly expressed 5ʹisomiR of hsa-miR-140-3p contributes to the tumor-suppressive effects of miR-140 by reducing breast cancer proliferation and migration. BMC Genomics. 2016;17:566. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].Telonis AG, Loher P, Jing Y, et al. Beyond the one-locus-one-miRNA paradigm: microRNA isoforms enable deeper insights into breast cancer heterogeneity. Nucleic Acids Res. 2015;43(19):9158–9175. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Loher P, Londin ER, Rigoutsos I. IsomiR expression profiles in human lymphoblastoid cell lines exhibit population and gender dependencies. Oncotarget. 2014;5(18):8790–8802. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [12].Kim YK, Kim B, Kim VN. Re-evaluation of the roles of DROSHA,Exportin 5, and DICER in microRNA biogenesis. Proc Natl Acad Sci U S A. 2016;113(13):E1881–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [13].Starega-Roslan J, Witkos TM, Galka-Marciniak P, et al. Sequence features of drosha and dicer cleavage sites affect the complexity of IsomiRs. Int J Mol Sci. 2015;16(12):8110–8127. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [14].Gu S, Jin L, Zhang Y, et al. The Loop Position of shRNAs and Pre-miRNAs Is Critical for the Accuracy of Dicer Processing In Vivo. Cell. 2012;151(4):900–911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [15].Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2009;26(1):139–140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [16].Kozomara A, Birgaoanu M, Griffiths-Jones S. MiRBase: from microRNA sequences to function. Nucleic Acids Res. 2019;47(D1):D155–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [17].Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B [Internet]. 1995; 57:289–300. [Google Scholar]
  • [18].Tareen A, Kinney JB. Logomaker: beautiful sequence logos in Python. Bioinformatics. 2020;36(7):2272–2274.
  • [19].Nikulin S, Zakharova G, Poloznikov A, et al. Effect of the expression of ELOVL5 and IGFBP6 genes on the metastatic potential of breast cancer cells. Front Genet. 2021;12:662843. https://www.frontiersin.org/articles/10.3389/fgene.2021.662843/full
  • [20].Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17(1):10.
  • [21].Friedländer MR, MacKowiak SD, Li N, et al. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Res. 2012;40(1):37–52.
  • [22].Langmead B, Trapnell C, Pop M, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25.
  • [23].Frankish A, Diekhans M, Ferreira AM, et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 2019;47(D1):D766–73.
  • [24].Chen Y, Wang X. miRDB: an online database for prediction of functional microRNA targets. Nucleic Acids Res. 2020;48(D1):D127–31.
  • [25].Tan GC, Chan E, Molnar A, et al. 5′ isomiR variation is of functional and evolutionary importance. Nucleic Acids Res. 2014;42(14):9424–9435.
  • [26].Guo L, Liang T. MicroRNAs and their variants in an RNA world: implications for complex interactions and diverse roles in an RNA regulatory network. Brief Bioinform. 2016;19(2):245–253. https://academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbw124
  • [27].Telonis AG, Magee R, Loher P, et al. Knowledge about the presence or absence of miRNA isoforms (isomiRs) can successfully discriminate amongst 32 TCGA cancer types. Nucleic Acids Res. 2017;45(6):2973–2985.
  • [28].Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
  • [29].Treiber T, Treiber N, Plessmann U, et al. A Compendium of RNA-Binding Proteins that Regulate MicroRNA Biogenesis. Mol Cell. 2017;66(2):270–284.e13.
  • [30].Nussbacher JK, Yeo GW. Systematic Discovery of RNA Binding Proteins that Regulate MicroRNA Levels. Mol Cell. 2018;69(6):1005–1016.e7.
  • [31].Kim H, Kim J, Yu S, et al. A mechanism for microRNA arm switching regulated by uridylation. Mol Cell. 2020;78(6):1224–1236.e5.
  • [32].Herold MJ, Van Den Brandt J, Seibler J, et al. Inducible and reversible gene silencing by stable integration of an shRNA-encoding lentivirus in transgenic rats. Proc Natl Acad Sci. 2008;105(47):18507–18512.
  • [33].Sliva K, Schnierle BS. Selective gene silencing by viral delivery of short hairpin RNA. Virol J. 2010;7(1):248.
  • [34].Gong W, Ren Y, Zhou H, et al. siDRM: an effective and generally applicable online siRNA design tool. Bioinformatics. 2008;24(20):2405–2406.
  • [35].Jackson AL, Bartz SR, Schelter J, et al. Expression profiling reveals off-target gene regulation by RNAi. Nat Biotechnol. 2003;21(6):635–637.
  • [36].Yamada T, Morishita S. Accelerated off-target search algorithm for siRNA. Bioinformatics. 2005;21(8):1316–1324.
  • [37].Birmingham A, Anderson EM, Reynolds A, et al. 3′ UTR seed matches, but not overall identity, are associated with RNAi off-targets. Nat Methods. 2006;3(3):199–204.
  • [38].Jackson AL. Widespread siRNA “off-target” transcript silencing mediated by seed region sequence complementarity. RNA. 2006;12(7):1179–1187.
  • [39].Jackson AL, Linsley PS. Recognizing and avoiding siRNA off-target effects for target identification and therapeutic application. Nat Rev Drug Discov. 2010;9(1):57–67.
  • [40].Naito Y, Yoshimura J, Morishita S, et al. siDirect 2.0: updated software for designing functional siRNA with reduced seed-dependent off-target effect. BMC Bioinformatics. 2009;10(1):392.
  • [41].Naito Y, Ui-Tei K. siRNA design software for a target gene-specific RNA interference. Front Genet. 2012;3:102. http://journal.frontiersin.org/article/10.3389/fgene.2012.00102/abstract
  • [42].Bofill-De Ros X, Gu S. Guidelines for the optimal design of miRNA-based shRNAs. Methods. 2016;103:157–166.
  • [43].Pinzón N, Li B, Martinez L, et al. microRNA target prediction programs predict many false positives. Genome Res. 2017;27(2):234–245.
  • [44].Seitz H. Issues in current microRNA target identification methods. RNA Biol. 2017;14(7):831–834.
  • [45].Fridrich A, Hazan Y, Moran Y. Too many false targets for microRNAs: challenges and pitfalls in prediction of miRNA targets and their gene ontology in model and non-model organisms. BioEssays. 2019;41(4):1800169.
  • [46].Lin X. siRNA-mediated off-target gene silencing triggered by a 7 nt complementation. Nucleic Acids Res. 2005;33(14):4527–4535.
  • [47].Fedorov Y. Off-target effects by siRNA can induce toxic phenotype. RNA. 2006;12(7):1188–1196.

Data Availability Statement

All source code is available on GitHub (https://github.com/s-a-nersisyan/isomiR_biogenesis). Raw FASTQ files have been deposited in the Gene Expression Omnibus (GEO) under accession GSE173709. Other relevant data are available within the manuscript and its supplemental material files.


Articles from RNA Biology are provided here courtesy of Taylor & Francis
