iScience. 2021 Oct 16;24(11):103298. doi: 10.1016/j.isci.2021.103298

Prediction of RNA subcellular localization: Learning from heterogeneous data sources

Anca Flavia Savulescu 1,∗, Emmanuel Bouilhol 2,3, Nicolas Beaume 4, Macha Nikolski 2,3,∗∗
PMCID: PMC8571491  PMID: 34765919

Summary

RNA subcellular localization has recently emerged as a widespread phenomenon, which may apply to the majority of RNAs. The two main sources of data for characterization of RNA localization are sequence features and microscopy images, such as those obtained from single-molecule fluorescent in situ hybridization-based techniques. Although such imaging data are ideal for characterization of RNA distribution, these techniques remain costly, time-consuming, and technically challenging. Given these limitations, imaging data exist only for a limited number of RNAs. We argue that the field of RNA localization would greatly benefit from complementary techniques able to characterize the location of RNA. Here we discuss the importance of RNA localization and the current methodology in the field, followed by an introduction to the prediction of the location of molecules. We then suggest a machine learning approach based on the integration of imaging localization data and sequence-based data to assist in characterization of RNA localization on a transcriptome level.

Subject areas: Cell biology, Transcriptomics, Machine learning


The importance of RNA subcellular localization

Gene expression in eukaryotes is regulated at various stages of the life cycle of RNA molecules, including transcription initiation, RNA processing, stability, subcellular localization, and translation into protein in the case of mRNAs, stages which are often linked. Subcellular localization of RNA transcripts leads to restriction of translation in a spatial and temporal manner (Bashirullah et al., 1998; Besse and Ephrussi, 2008; Jansen, 2001; Kloc et al., 2002; Martin and Ephrussi, 2009; Mardakheh et al., 2015; Zappulo et al., 2017; Savulescu et al., 2021), as well as serving to avoid toxicity of protein products and to corroborate rapid cellular responses (Shahbabian and Chartrand, 2012). Early studies in the field of RNA subcellular localization investigated the subcellular location of one and up to a few mRNA transcripts in developmental models, such as the Drosophila embryo and Xenopus oocyte, determining morphogen gradients and cellular fates (Macdonald and Struhl, 1998; Tautz and Pfeifle, 1989), as well as polarized cells such as migrating fibroblasts, budding yeast, and neuronal cells (Mili et al., 2008; Martin and Ephrussi 2009; Batish et al., 2012; Buxbaum et al., 2014; Tzingounis and Nicoll, 2006; Yasuda et al., 2017). However, recent years have seen an increasing number of studies in various organisms/models of RNA subcellular localization, resulting in the consensus that RNA subcellular localization is indeed not limited to a handful of RNAs in a small number of systems but rather a widespread phenomenon, which may apply to the majority of cellular RNAs, including non-coding RNAs (Bouvrette et al., 2017; La Manno et al., 2018; Lécuyer et al., 2007; Moor et al., 2017; Sharp et al., 2011; Weis et al., 2013; Cabili et al., 2015; Zappulo et al., 2017).

The subcellular location of the RNA can influence which proteins will bind the transcript, contributing to the RNA's fate, such as directing it toward degradation, increasing/decreasing its rate of translation, and determining molecular interactions. Consequently, the spatial distribution of the RNA can potentially influence the cellular concentration and location of its protein product (Brangwynne et al., 2009; Katz et al., 2012, 2016; Moor et al., 2018; Savulescu et al., 2021), which can, in turn, affect the cell's function and capacity to interact with adjacent cells or react to various environmental cues. Similarly, association of transcripts with specific cellular structures (Hughes and Simmonds, 2019; Khong et al., 2017; Padrón et al., 2019; Savulescu et al., 2021; Suter, 2018; Wilbertz et al., 2019 and others) may indicate a functional correlation between the transcript in the cell where this association occurs and broad cellular processes in the same cells, such as polarization, differentiation, etc. Taken together, this indicates that in addition to expression levels of RNA transcripts, subcellular spatial distribution of these transcripts may contribute to cell state and type (for example, Moor et al., 2018). When only single-cell genomics approaches are used, as is standard in cell state characterization studies, spatial information on RNA transcripts might be lost. For example, two neighboring cells in a tissue, which possess similar concentrations of the same RNA transcripts, would be classified by single-cell genomics approaches as the same cell type/state; however, the RNAs in these cells may exhibit markedly different patterns of subcellular distribution (Savulescu et al., 2020) (Figure 1, Panel B). This again emphasizes the significance of characterizing subcellular distribution alongside expression levels of RNA transcripts for a more thorough classification of cell subtype and/or state.

Figure 1. Subcellular RNA localization and visualization

(A) Typical smFISH images. On the left, a raw image showing Arhgdia mRNA smFISH spots; on the right, a superimposed and denoised image containing the following stains: DAPI for DNA in blue, anti-tubulin antibody in green, Arhgdia mRNA smFISH spots in red. Organelles of interest, such as the microtubule-organizing center (MTOC) and cell contour are extrapolated from z-slices of the anti-tubulin stain. Scale bar 10 μm.

(B) A variety of subcellular spatial distribution features of the RNA can be observed and subsequently analyzed (RNA spots are depicted in red): specific enrichment of RNA in various subcellular locations, random versus clustered distribution of RNA and correlation of RNA with cellular markers, such as the MTOC or ER.

In addition to contributions in basic research, knowledge of RNA subcellular spatial localization can be beneficial in biomedical applications. For instance, Didiot et al. (2018) show that htt mRNA, which encodes the protein responsible for Huntington's disease, is located both in the nucleus and cytoplasm (50/50) in neurons, while being purely cytoplasmic in non-neuronal cells. This has a therapeutic impact as nuclear htt mRNA is more stable and more resistant to oligonucleotide therapy than the cytoplasmic htt. Consequently, new therapeutic strategies can be envisioned, such as modifying the mRNA encoded by the abnormal allele of huntingtin to redirect it to the cytoplasm where oligonucleotide therapy is more efficient. Taken together, given the functional relevance of subcellular spatial distribution of RNA in both basic and clinical contexts, tools that are able to efficiently detect subcellular localizations at a transcriptome-wide level are urgently needed.

Visualization of RNA using single-molecule fluorescent in situ hybridization (smFISH) or MS2-system-based techniques has been the gold standard for studying RNA subcellular localization. In recent years, smFISH-based techniques and downstream in silico procedures for localization of mRNAs have been developed to cope with a large number of RNA transcripts. However, these methods can be technically challenging and expensive, and they often require sophisticated equipment. Machine learning, which has been successfully applied in various biological fields, including prediction of subcellular locations of proteins and of mRNAs, may be a suitable complementary approach to predict the precise subcellular localization of a large number of RNA species in a variety of systems on a transcriptome-wide level. In the perspective below, we provide a brief overview of techniques to study subcellular RNA localization and then discuss how machine learning can be harnessed to predict precise RNA subcellular localization. Finally, we discuss potential synergies between experimental and computational approaches to subcellular localization.

Current methods to study subcellular localization of RNA

To date, the most popular approach to study subcellular localization of RNA is image based. A large number of studies make use of smFISH-based techniques followed by epifluorescent or confocal microscopy to visualize and quantify intracellular mRNAs (typical smFISH images in Figure 1, Panel A). Alternatively, an MS2 tagging system followed by live cell imaging is applied (for example, Bertrand et al., 1998; Hocine et al., 2013 and others). smFISH makes use of multiple single-stranded DNA oligonucleotides, each labeled with a single fluorophore, which tile a specific RNA target (Raj et al., 2008). The combined signal from these fluorophores is sufficiently bright to be seen as a spot on an epifluorescent or confocal microscope and can easily be quantified (Raj et al., 2008). In recent years, a variety of smFISH-based techniques to increase the capacity of the method have emerged. These include the use of multiple rounds of labeling and imaging of the same sample to label a large number of RNAs. osmFISH is one such smFISH-based technique, in which the cellular organization of the mouse somatosensory cortex was mapped by labeling 33 RNAs over the course of 13 rounds. The number of labeled RNAs can be further increased by applying FISH probe barcoding and sequential labeling approaches, such as multiplexed error-robust FISH (MERFISH) (Moffitt and Zhuang, 2016) and sequential FISH (seqFISH and seqFISH+) (Lubeck et al., 2014; Eng et al., 2019). These methods possess the capacity to label hundreds to 1000 RNAs, nearly approaching full-transcriptome imaging. Additionally, Wang et al. (2018) targeted hundreds of genes in various tissues with a method termed spatially resolved transcript amplicon readout mapping (STARmap), which combines in situ amplification of RNA-specific probe barcode regions with decoding by 3D sequencing within samples converted to a hydrogel matrix. Branched DNA (bDNA)-based techniques rely on signal amplification using a series of non-isotopic DNA probes hybridized in a sequential manner to detect and quantify RNA (Player et al., 2001; Wang et al., 2012; Battich et al., 2013).

Following imaging of smFISH samples, data analysis is carried out. The analysis comprises several steps, including segmentation, image processing to reduce background noise, readout of the barcodes associated with the different transcripts in the case of barcoding-based smFISH methods, as well as precise localization and quantification of RNA FISH spots and, in some cases, association of detected spots with cellular landmarks. A wide range of computational methods exist to extract quantified data for RNAs from smFISH images, and a number of widely used microscopy image analysis software programs are available for mRNA spot detection, such as the ICY spot detector (de Chaumont et al., 2012), ImageJ spot detection, FISH-Quant (Mueller et al., 2013), its more recent version FISH-quant v2 (Imbert et al., 2021), and CellProfiler (McQuin et al., 2018), among others. Many tools also provide cell and nucleus segmentation to apply to the nuclear and cellular marker stainings that are often acquired with smFISH images. The extracted vectorized features can take different forms, such as RNA spot coordinates, or statistical features such as RNA counts and/or densities per cell or in different compartments, as well as their positioning relative to cellular landmarks. See Imbert et al. (2021) and Samacoits et al. (2018) for examples of possible features, and Savulescu et al. (2021) for their representation in relation to cellular landmarks.
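To make the notion of vectorized features concrete, the following minimal sketch derives a few per-cell statistics from detected spot coordinates. It is an illustration only: the function name `spot_features`, the circular approximation of the nucleus, and the toy coordinates are assumptions made here, whereas real pipelines such as those cited above work with segmentation masks.

```python
import math

def spot_features(spots, nucleus_center, nucleus_radius, cell_area_um2, landmark):
    """Summarize 2D RNA spot coordinates (in micrometres) for one cell.

    Assumption for illustration: the nucleus is approximated by a circle;
    a real analysis would test spots against a segmentation mask instead.
    """
    if not spots:
        raise ValueError("no spots detected for this cell")
    # Spots falling inside the (approximate) nucleus
    nuclear = [p for p in spots if math.dist(p, nucleus_center) <= nucleus_radius]
    # Distance of every spot to a cellular landmark, e.g. the MTOC
    dists = [math.dist(p, landmark) for p in spots]
    return {
        "count": len(spots),
        "density_per_um2": len(spots) / cell_area_um2,
        "nuclear_fraction": len(nuclear) / len(spots),
        "mean_dist_to_landmark_um": sum(dists) / len(dists),
    }

# Toy coordinates, not measured data
spots = [(0.0, 0.0), (1.0, 0.0), (6.0, 8.0), (3.0, 4.0)]
features = spot_features(spots, nucleus_center=(0.0, 0.0), nucleus_radius=2.0,
                         cell_area_um2=200.0, landmark=(3.0, 4.0))
print(features)
```

Each cell then contributes one such feature dictionary, which can be flattened into a numeric vector for downstream statistical analysis or machine learning.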

In addition to image-based approaches, subcellular localization of RNA can be studied using methods such as biochemical cellular fractionation or physical separation of cellular compartments, followed by RNA detection using RNA sequencing or microarrays (for example, Mili et al., 2008; Cajigas et al., 2012; Bigler et al., 2017). Although these image-based and biochemical approaches have provided high-resolution data regarding the subcellular localization of RNA in cells and tissues, each is limited in its spatial resolution, in its throughput, and, for image-based methods, by the reliance on effort-heavy computational analysis of the generated images to detect mRNA localizations.

To summarize, given the increasing number of studies that characterize spatial and temporal subcellular localization patterns of RNAs, as well as the functional relevance of variation in spatial distribution of RNA molecules, there is an increasing need to develop appropriate tools and technologies to capture fine-grained variations in spatial distribution of RNAs. These tools would need to be quantitative, to analyze data at the single-cell level, and to be suited to high-throughput data.

RNA localization prediction

The rise in studies emphasizing the importance of RNA spatial and temporal subcellular localization in biological processes calls for increased access to such information. Although wet-lab experiments are clearly the most straightforward way to characterize subcellular localization of biomolecules, these experiments are typically time-consuming and costly. In contrast, with sufficient data and powerful in silico methods, predictions of subcellular localization of biomolecules would be far less expensive and intrinsically high throughput. We argue that this makes room for computational prediction of RNA localizations.

Prediction of subcellular locations based on sequence

The task of predicting the subcellular location of biomolecules is not new. In the previous decade, a significant number of methods have been developed to predict subcellular locations of proteins (Emanuelsson et al., 2007; Imai and Nakai, 2010; Wan and Mak, 2015). These methods are based on certain features extracted from protein sequences, such as specific motifs, and aim to predict a rough location in terms of cell regions/compartments such as those defined by the UniProt controlled vocabulary for subcellular locations. Often these methods use the guilt-by-association approach, relying on homology with proteins whose location has been experimentally confirmed; the main experimental methods to establish protein localization are mass spectrometry and immunofluorescence.

Some mRNA transcripts contain within their sequences distinct motifs, which on their own, or by forming distinct secondary structures, have been shown to determine the RNA subcellular location (reviewed in Martin and Ephrussi, 2009; Shahbabian and Chartrand, 2012 and others). Hence, several methods for prediction of RNA subcellular location have started to be developed based on mRNA/cDNA sequence composition (Yan et al., 2019; Garg et al., 2020), as well as on the predicted secondary structure (Yan et al., 2016). Interestingly, both of these very recent approaches rely on machine learning: deep neural networks (combining convolutional neural network [CNN], long short-term memory [LSTM], and attention layers) for the former and a more classical support vector machine (SVM)-based methodology for the latter. As in the protein world, these methods require training data sets providing mRNA subcellular locations for each annotated human protein-coding gene, such as cytosol, nuclear, endoplasmic reticulum (ER), membrane, etc. However, in contrast to the protein world, where the annotated data essential for the development of supervised machine learning typically exist, RNA data generally lack annotation. This lack of annotation is reflected in the very recent introduction of such methods for mRNA, as well as in the fact that only a few methods exist. Consequently, their adoption by the wider scientific community remains to be seen.

Predicting the subcellular location of non-coding RNAs (ncRNA), such as long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and others is generally a more difficult task than for mRNAs. This is due to several factors, including the complexity of intra-molecular organization that ncRNAs can exhibit (Yan et al., 2016), as well as their typically short lengths and fewer known localization-correlated motifs than mRNAs (Ross et al., 2021; Constanty and Shkumatava, 2021). However, some associations of sequence motifs in lncRNAs with their subcellular localization have been identified (Zhang et al., 2014), and databases containing information on cellular compartments to which the lncRNAs localize have been developed, including LncLocate (Mas-Ponte et al., 2017) and RNALocate (Zhang et al., 2016). Given these recent developments, several methods have been proposed to predict the subcellular location of lncRNAs. For example, Su et al. (2018) combined PseKNC features with an SVM to predict subcellular location of lncRNAs to the ribosome and exosome, Gudenas and Wang (2018) developed a deep neural network to predict whether lncRNAs are nuclear or cytosolic based on their sequence, and Cao et al. (2018) used k-mer and high-level abstraction features generated by unsupervised deep models to construct four classifiers and predict five subcellular localizations of lncRNAs. Similarly, a number of tools exist to predict the location of miRNAs, such as an approach based on gene ontology (GO)-based functional similarity (Yang et al., 2018) and an SVM-based predictor (Meher et al., 2020), both relying on the miRNA locations from the RNALocate database.
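As an illustration of the simplest sequence-derived representation mentioned above, k-mer composition, the sketch below converts an RNA sequence into a fixed-length frequency vector. The function name `kmer_frequencies` and the toy sequence are assumptions made for this example; it does not reproduce the feature extraction of any of the cited tools.

```python
from itertools import product

def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return the relative frequency of every possible k-mer, in a fixed order."""
    seq = sequence.upper().replace("T", "U")  # accept cDNA input as well
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]

# Toy sequence, for illustration only
vec = kmer_frequencies("AUGGCUAUGGC", k=2)
print(len(vec))  # 4**2 = 16 features
```

Because the vector has the same length (4^k) for every transcript, sequences of different lengths can be fed to the same classifier.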

Prediction of subcellular location from microscopy data

More recently, a plethora of tools have been developed to predict the locations of molecules based on microscopy imaging data rather than sequence, in particular for protein subcellular localization. For example, Newberg and Murphy (2008) developed a framework for image-based protein subcellular location prediction, which has been successfully applied to the Human Protein Atlas database. This has paved the way for a whole body of work on microscopy-image-based subcellular localization prediction of proteins, which includes methods relying on k-NN classifiers, SVMs, artificial neural networks, decision trees, and deep CNNs (Xijie Lu and Moses, 2016; Jiang et al., 2019). Recently, a citizen science crowdsourcing effort attracted participants to annotate the subcellular locations of proteins in images and resulted in a novel deep learning method based on transfer learning (Sullivan et al., 2018), capable of predicting distributions of proteins to major organelles. Not only did the authors achieve high prediction accuracy, but they were also able to construct a fully annotated data set (Sullivan et al., 2018). In the next section, we will discuss how sequence data could be combined with smFISH data to improve the predictive accuracy.

Perspective on predicting RNA subcellular localization from heterogeneous data

As discussed earlier, the two main sources of data for characterization of subcellular localization of RNA are RNA sequence features and smFISH images. In addition, as subcellular localization is considered as a means to restrict translation, among other functions, information regarding the subcellular localization of the encoded protein product, as well as cell-type-specific information may aid in subcellular localization characterization of RNA. However, in some cases, including embryonic development models, subcellular localization of proteins might not be correlated with the subcellular localization of the encoding mRNAs (Knaut et al., 2000; Little et al., 2015; Mardakheh et al., 2015) and, as such, might be misleading if considered as a sole parameter for training the model. Additional information regarding the cellular model should be considered to account for variations in subcellular distribution of an mRNA and its encoded protein.

Sequence features are easy to obtain; however, they do not provide straightforward information regarding subcellular localization. Moreover, current methods for predicting the location of molecules remain limited in their predictive power. For example, performance can vary widely, as reported in the study by Garg et al. (2020), where the area under the curve lies between 0.7 and 0.98 for different compartments. Importantly, no method with good performance for most cellular compartments is available. On the other hand, fluorescent microscopy imaging data are the gold standard for determining RNA subcellular localization; however, they are typically costly, time-consuming, and complex to acquire for all known RNAs. In this section, we discuss how imaging data may be combined with sequence features to accurately predict RNA subcellular location.
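For readers less familiar with the area under the (ROC) curve quoted above, the following sketch computes it per compartment in a one-vs-rest fashion, using its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The labels and scores are invented toy values, not results from any cited study.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairwise wins; ties count as half a win
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy one-vs-rest evaluation for two compartments (hypothetical values)
y_true = {"nucleus": [1, 1, 0, 0, 1, 0], "ER": [0, 1, 0, 1, 0, 0]}
y_score = {"nucleus": [0.9, 0.8, 0.3, 0.2, 0.7, 0.1],
           "ER": [0.6, 0.7, 0.5, 0.4, 0.2, 0.3]}
auc = {c: roc_auc(y_true[c], y_score[c]) for c in y_true}
print(auc)
```

Reporting one value per compartment, as done here, exposes the kind of between-compartment variation that a single pooled score would hide.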

Beyond the improvement of both observation techniques and predictive algorithms, we would like to suggest that a synergy between the existing two approaches — sequence and image-based — may assist in deciphering RNA localization (Figure 2). To illustrate our argument, consider the following manual steps to be followed when characterizing the subcellular localization of an RNA transcript that has not yet been visualized using smFISH or MS2-system based techniques.

  (1) The first step would be to extract from the transcript sequence specific features that would aid in characterization of its role and location. These include conserved motifs, regulatory sequences, domains that bind specific RNA binding proteins (RBPs), secondary structures, etc.

  (2) Second, one would have to collect the information on the role and subcellular localization of the protein product in the case of mRNAs and the role of the RNA, if known, in the case of long ncRNAs, as well as additional cell-type- and model-specific localization data.

  (3) Third, identify RNAs with similar sequences and collect the information from databases on RNA localization, protein localization, smFISH and immunofluorescence (IF) images, etc., for these RNAs.
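The motif-extraction part of the first step can be sketched as a simple sequence scan. The function name `find_motif`, the reduced IUPAC table, and the query motif are placeholders chosen for this example; a realistic analysis would rely on curated motif collections and, for structural elements, on secondary structure prediction.

```python
import re

# Reduced IUPAC table for RNA; extend as needed (assumption for illustration)
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) motif matches."""
    seq = sequence.upper().replace("T", "U")
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets re.finditer report overlapping occurrences
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]

# Placeholder sequence and motif, for illustration only
positions = find_motif("GGACACCCUUACACCCA", "ACACCC")
print(positions)
```

The number and positions of such matches can then be appended to the feature vector describing the transcript.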

Figure 2. An illustration of how deep learning could use smFISH data and RNA sequence features to build a model, followed by prediction of the subcellular localization of an RNA using the model

(A) Training: RNAs for which both smFISH images and sequence features are available are collected to form a training set. For sequences, the relevant features are extracted (A, top), followed by training network 1 with images and network 2 with sequence features (A, bottom). The output of these networks is used as input to train network 3, which makes the prediction (A, bottom). During the training process, errors are back propagated to improve all networks (A, bottom).

(B) Prediction: the localization of a new RNA, for which only sequence features are available, needs to be predicted. The sequence is processed by network 2, followed by network 3, which outputs the prediction.

We argue that prediction of the subcellular localization of the RNA could be made by integrating these collected heterogeneous data on the RNA (sequence-based features of the RNA, role and subcellular localization of the protein product/RNA, etc.) as well as imaging data collected from databases of RNAs (Figure 2). A machine learning approach (e.g., using multimodal learning) to this objective would consist of the following steps:

  1) Gather a database of RNAs for which both sequence features and imaging information are known (Figure 2, Panel A).

  2) Extract features from sequence and images as detailed in the “RNA localization prediction” section, and vectorize them.

  3) Train a model based on sequences, represented by their feature vectors. Figure 2 offers an example of how to train a model with heterogeneous data (Figure 2, Panel A).

  4) Use this model to predict localization of the RNA of interest (see the following part for details) (Figure 2, Panel B).

The critical part of this process is the use of a model created from RNAs for which both sequences and imaging data are available, to predict the localization of RNAs for which only sequences are available. Recent developments in machine learning and deep learning (siamese networks, adversarial training) may help to address this challenge. For instance, a network from the output of two networks, each based on one data type, could be trained. Once trained, this “aggregate network” could be fed by one type of data and still retain its predicting power. These approaches would create models “boosted” by the availability of smFISH in the training phase but able to predict with sequence features only.
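The data flow of such an aggregate network can be sketched as follows. This is a forward pass only, with random untrained weights, written to show how a model with two input branches can still produce a prediction from sequence features alone. The class name `FusionModel`, the single-layer branches, and the choice of a zero vector to stand in for the missing image branch are all assumptions made for this illustration; the text above does not prescribe them, and training (for example by backpropagation, possibly dropping the image branch at random so that the final layers learn to cope with its absence) is not shown.

```python
import math
import random

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def init(n_out, n_in, rng):
    """Random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

class FusionModel:
    def __init__(self, n_img, n_seq, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.img_net = init(n_hidden, n_img, rng)        # "network 1": image features
        self.seq_net = init(n_hidden, n_seq, rng)        # "network 2": sequence features
        self.head = init(n_classes, 2 * n_hidden, rng)   # "network 3": combines both

    def predict_proba(self, seq_features, img_features=None):
        seq_emb = relu(linear(*self.seq_net, seq_features))
        if img_features is None:
            # Assumption: a zero vector stands in for the missing image branch
            img_emb = [0.0] * self.n_hidden
        else:
            img_emb = relu(linear(*self.img_net, img_features))
        return softmax(linear(*self.head, img_emb + seq_emb))

model = FusionModel(n_img=4, n_seq=3, n_hidden=5, n_classes=4)
p_train_like = model.predict_proba([0.2, 0.1, 0.7], img_features=[1.0, 0.0, 0.5, 0.3])
p_sequence_only = model.predict_proba([0.2, 0.1, 0.7])
```

In both calls the output is a probability distribution over the same compartments, which is what allows a model trained on paired data to be queried with sequence features alone.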

Limitations in the data that might challenge prediction

Prediction greatly depends on the data used to train the model. First, a large amount of data might be required to train a model properly, especially for methods such as deep learning or a combination of classifiers, which can be data hungry. Thus, increasing the number of experimentally observed locations would help to train better models. Second, our knowledge of RNA subcellular localization is clearly not uniform across RNA species: some classes of RNA, e.g., lncRNA and miRNA, are underrepresented in our knowledge map of RNA subcellular localization, potentially leading to bias in the models. Targeted experiments toward the “blind spots” of the RNA localization transcriptome map would help to correct these biases.

One of the avenues to identify the important missing data that have to be acquired is to look at the most distinguishing features, that is, the features in the model that contribute the most to the prediction. Examining the importance of different features in prediction can help to decipher the biological processes behind RNA subcellular localization; most importantly, however, by considering the distinguishing features, it is possible to design experiments to acquire missing data for the subcellular localization of some transcripts. These data in turn increase the impact of distinguishing features and improve prediction accuracy. Among features that are already known to have a strong impact on prediction, we can cite k-mer composition, RNA-protein binding motifs, and other genomic sequence features as identified in the study by Gudenas and Wang (2018), who built a model to classify lncRNAs as cytosolic or nuclear and analyzed feature importance. In this work, the authors found that the k-mer composition accounted for 90% of the decision. Because it is assumed that such distinguishing features are linked to the biological processes underlying the subcellular localization, guiding acquisition of new data by the principle of enriching the features that contribute to prediction accuracy appears to be a promising avenue to increase the impact of machine learning approaches in the field.
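One generic way to rank features by their contribution is permutation importance: shuffle one feature column at a time and record how much the accuracy drops. The sketch below is a minimal, model-agnostic version; it is not the analysis performed by Gudenas and Wang (2018), and the function names, the toy classifier, and the data are hypothetical.

```python
import random

def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data set with only column j permuted
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical classifier: uses feature 0 only and ignores feature 1
def toy_model(x):
    return "nuclear" if x[0] > 0.5 else "cytosolic"

X = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.9], [0.1, 0.3], [0.7, 0.5], [0.3, 0.6]]
y = [toy_model(x) for x in X]
imp = permutation_importance(toy_model, X, y)
print(imp)
```

Features with an importance near zero contribute little to the decision, whereas highly ranked features point to the sequence or imaging properties for which additional experimental data would be most informative.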

An additional limitation concerns the difficulties in segmentation of specialized cells, such as neurons, and the related RNA quantification in relevant compartments. Still, a number of local morphological descriptors such as dendritic tree, radial extension, soma area, and branching complexity have been computed to date (Shefi et al., 2002). Additionally, invariant measures such as Hu's moments can be included (Bhaskar et al., 2019). Such morphological features can be used by a downstream machine learning pipeline. Segmentation, although not always perfect, can often be performed well, and although the annotation of neurons is not yet fully automated, significant progress has been made on this topic with very promising results (Li et al., 2019; Lin and Zheng, 2019; Schubert et al., 2019). Normalization of RNA quantities in different compartments can then be done by quantization methods, such as DypFISH (Savulescu et al., 2021) and others. Naturally, if the identification of certain compartments is imperfect, the corresponding normalized RNA quantities will be skewed, which will inevitably impact predictions that use these data. In conclusion, any machine learning localization prediction method is only as good as the data that it is built upon.
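To illustrate what an invariant shape measure looks like in practice, the sketch below computes the first two of Hu's moment invariants for a binary cell mask, using the standard definitions based on normalized central moments. The function name and the toy L-shaped mask are assumptions for this example; image analysis libraries provide complete implementations of all seven invariants.

```python
def hu_moments_1_2(mask):
    """First two Hu invariants of a binary mask given as a list of rows of 0/1."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = len(pixels)
    if m00 == 0:
        raise ValueError("empty mask")
    # Centroid of the shape
    cx = sum(x for x, _ in pixels) / m00
    cy = sum(y for _, y in pixels) / m00

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00 ** (1 + (p + q) / 2)
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2

# Toy L-shaped mask and the same shape shifted by one row and two columns
shape = [[1, 1, 1, 0],
         [1, 0, 0, 0],
         [1, 0, 0, 0]]
shifted = [[0] * 6] + [[0, 0] + row for row in shape]
print(hu_moments_1_2(shape))
print(hu_moments_1_2(shifted))
```

Both calls return the same pair of values up to floating-point error, which is the property that makes such descriptors useful for comparing cells imaged at different positions and orientations.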

Finally, regarding localization elements that may involve two-dimensional (2D) or three-dimensional (3D) structure, such as zip codes, machine learning approaches have been shown to be able to infer complex features such as 2D and 3D structures from sequences (Singh et al., 2019; Jumper et al., 2021; Sweeney et al., 2021), and thus, machine learning methods may be able to take the zip code elements into account, at least indirectly.

Addressing the challenge of variability in subcellular locations

One of the major limitations in our understanding of RNA localization is the variability and dynamics of the subcellular localization. In many cases, a given RNA may be addressed to different subcellular compartments, depending on the circumstance (cell type, cell state, environmental conditions, various treatments, as well as temporally). This aspect remains poorly understood for the majority of cellular RNAs. Additionally, a given RNA's subcellular localization patterns might not be clearly pronounced or their localized enrichment high enough for detection. This typically occurs in developmental systems, such as the early Drosophila embryo, where only up to 4% of a particular mRNA localizes to germ granules, while the remaining fraction disperses through the embryo (Bergsten and Gavis, 1999; Jambor et al., 2015; Trcek et al., 2015). The current practice is to consider only the localization with the highest probability to be the right one. Using highly efficient models, localizations with enough confidence could all be considered as correct, being the sign of a multi-localized RNA. The level of confidence could ultimately give an estimate of the tendency of this RNA to be addressed at different localizations.
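The difference between keeping only the top-scoring localization and accepting every sufficiently confident one can be sketched as follows. The function name `call_localizations`, the threshold of 0.3, and the predicted probabilities are arbitrary illustrative assumptions; in practice the threshold would have to be calibrated on held-out data.

```python
def call_localizations(probabilities, threshold=0.3):
    """Keep every compartment whose predicted probability reaches the threshold.

    Falls back to the single best compartment when none reaches it.
    """
    kept = {c: p for c, p in probabilities.items() if p >= threshold}
    if not kept:
        best = max(probabilities, key=probabilities.get)
        kept = {best: probabilities[best]}
    # Most confident compartment first
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))

# Hypothetical model output for one RNA
pred = {"cytosol": 0.48, "nucleus": 0.41, "ER": 0.08, "membrane": 0.03}
calls = call_localizations(pred)
print(calls)
```

For this toy input, a top-1 rule would report the cytosol only, whereas the thresholded rule also retains the nucleus, and the retained probabilities give a rough indication of how the RNA may be distributed between the two.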

Variability in subcellular location of a given RNA can also be addressed by performing smFISH on cells grown on microfabricated patterns (Savulescu et al., 2021). Micropatterning of cells reduces cell-to-cell variability and allows for a higher resolution, quantitative characterization of subcellular localization of RNAs (Savulescu et al., 2021). Comparison of the spatial localization of a given RNA in the same cell type on micropatterns under different conditions or in different micropatterned cell types should allow for a thorough characterization of variability in subcellular localization of the given RNAs in these conditions/cell types. This should, in turn, assist in accounting for variability when predicting subcellular localization of RNAs.

To summarize, the variability of RNA location across cell types, conditions, cellular compartments, etc., is important to account for and should be integrated as parameters in the model. Although current models are limited in this respect, we would like to suggest that with the growth of available annotated data, prediction models should gradually improve.

Discussion

High-precision methods for determination and quantification of subcellular localization of RNAs, such as smFISH, are now well established; however, these techniques remain time-consuming and costly. To drive better understanding of cellular processes, there is a need to develop methods that cover the broad landscape of RNA subcellular localization, for a large range of RNAs and conditions, ideally at the whole-transcriptome level. We would like to advance the argument that observation and prediction of RNA subcellular localization are two complementary approaches that can be leveraged together. Although they remain largely disconnected, linking them has the potential to greatly increase knowledge in the field. As mentioned earlier, the quality of the prediction is directly linked to both the quantity and the quality of the available datasets. Thus, building robust models for the prediction of RNA location requires growth of the available and well-annotated data. This increase in relevant data can be driven by biological questions, as is currently the case, and by the development of relevant data repositories, but also, in a complementary fashion, by the requirement to fill the gaps in the existing predictive features that are used to populate machine learning models.

As previously discussed, any bias in annotations and/or errors in the upstream analyses are inevitably propagated into predictions. A possible avenue to circumvent these biases would be an unsupervised machine learning approach that makes its own inferences about the structures it finds in the data instead of relying on its vectorized representation. However, unsupervised learning requires even more data than its supervised counterparts. An unsupervised approach could therefore circumvent annotation biases and errors, but is likely to become a practical solution only once the field is mature enough and more data are available.

We argue that cost, technical, and time considerations can be alleviated by designing robust predictive methods that take advantage of heterogeneous data, where RNA location prediction is based on both imaging and sequence data. Continued growth of available data sets that contain both the data and reliable annotations, and that cover the diversity of RNA species in various contexts, provides hope that constructing robust models based on heterogeneous data (both imaging and sequence) for prediction of subcellular RNA localization is realistically feasible in the near future.

Acknowledgments

We thank Jean-Baptiste Sibarita and Dana M. Savulescu for fruitful discussions and critical reading of the manuscript.

Author contributions

Conceptualization, A.F.S., M.N., N.B.; writing - original draft, A.F.S., E.B., M.N., N.B.; machine learning architecture for location prediction, A.F.S., E.B., M.N.; writing - reviewing and editing, A.F.S., E.B., M.N., N.B.

Declaration of interests

The authors declare no competing interests.

Contributor Information

Anca Flavia Savulescu, Email: ankasi100@gmail.com.

Macha Nikolski, Email: macha.nikolski@u-bordeaux.fr.

References

  1. Bashirullah A., Cooperstock R.L., Lipshitz H.D. RNA localization in development. Annu. Rev. Biochem. 1998;67:335–394. doi: 10.1146/annurev.biochem.67.1.335.
  2. Bhaskar D., Lee D., Knútsdóttir H., Tan C., Zhang M.H., Dean P., Roskelley C., Edelstein-Keshet L. A methodology for morphological feature extraction and unsupervised cell classification. BioRxiv. 2019. doi: 10.1101/623793.
  3. Battich N., Stoeger T., Pelkmans L. Image-based transcriptomics in thousands of single human cells at single-molecule resolution. Nat. Methods. 2013;10:1127–1133. doi: 10.1038/nmeth.2657.
  4. Batish M., van den Bogaard P., Kramer F.R., Tyagi S. Neuronal mRNAs travel singly into dendrites. Proc. Natl. Acad. Sci. U S A. 2012;109:4645–4650. doi: 10.1073/pnas.1111226109.
  5. Bergsten S.E., Gavis E.R. Role for mRNA localization in translational activation but not spatial restriction of nanos RNA. Development. 1999;126:659–669. doi: 10.1242/dev.126.4.659.
  6. Bertrand E., Chartrand P., Schaefer M., Shenoy S.M., Singer R.H., Long R.M. Localization of ASH1 mRNA particles in living yeast. Mol. Cell. 1998;2:437–445. doi: 10.1016/s1097-2765(00)80143-4.
  7. Besse F., Ephrussi A. Translational control of localized mRNAs: restricting protein synthesis in space and time. Nat. Rev. Mol. Cell. Biol. 2008;9:971. doi: 10.1038/nrm2548.
  8. Bigler R.L., Kamande J.W., Dumitru R., Niedringhaus M., Taylor A.M. Messenger RNAs localized to distal projections of human stem cell derived neurons. Sci. Rep. 2017;7:611. doi: 10.1038/s41598-017-00676-w.
  9. Bouvrette L.P., Cody N.A.L., Bergalet J., Lefebvre F.A., Diot C., Wang X., Blanchette M., Lécuyer E. CeFra-seq reveals broad asymmetric mRNA and non-coding RNA distribution profiles in Drosophila and human cells. RNA. 2017. doi: 10.1261/rna.063172.117.
  10. Brangwynne C.P., Eckmann C.R., Courson D.S., Rybarska A., Hoege C., Gharakhani J., Jülicher F., Hyman A.A. Germline P granules are liquid droplets that localize by controlled dissolution/condensation. Science. 2009;324:1729–1732. doi: 10.1126/science.1172046.
  11. Buxbaum A., Wu B., Singer R.H. Single β-actin mRNA detection in neurons reveals a mechanism for regulating its translatability. Science. 2014;343:419. doi: 10.1126/science.1242939.
  12. Cabili M.N., Dunagin M.C., McClanahan P.D., Biaesch A., Padovan-Merhar O., Regev A., Raj A. Localization and abundance analysis of human lncRNAs at single-cell and single-molecule resolution. Genome Biol. 2015;16:20. doi: 10.1186/s13059-015-0586-4.
  13. Cajigas I.J., Tushev G., Will T.J., tom Dieck S., Fuerst N., Schuman E.M. The local transcriptome in the synaptic neuropil revealed by deep sequencing and high-resolution imaging. Neuron. 2012;74:453–466. doi: 10.1016/j.neuron.2012.02.036.
  14. Cao Z., Pan X., Yang Y., Huang Y., Shen H.B. The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier. Bioinformatics. 2018;34:2185–2194. doi: 10.1093/bioinformatics/bty085.
  15. Constanty F., Shkumatava A. lncRNAs in development and differentiation: from sequence motifs to functional characterization. Development. 2021;148:dev182741. doi: 10.1242/dev.182741.
  16. de Chaumont F., Dallongeville S., Chenouard N., Hervé N., Pop S., Provoost T., Vannary M.-Y., Pankajakshan P., Lecomte T., Le Montagner Y. Icy: an open bioimage informatics platform for extended reproducible research. Nat. Methods. 2012;9:690–696. doi: 10.1038/nmeth.2075.
  17. Didiot M.C., Ferguson C.M., Ly S., Coles A.H., Smith A.O., Bicknell A.A., Hall L.M., Sapp E., Echeverria D., Pai A.A. Nuclear localization of huntingtin mRNA is specific to cells of neuronal origin. Cell Rep. 2018;24:2553–2560.e5. doi: 10.1016/j.celrep.2018.07.106.
  18. Emanuelsson O., Brunak S., von Heijne G., Nielsen H. Locating proteins in the cell using TargetP, SignalP and related tools. Nat. Protoc. 2007;2:953–971. doi: 10.1038/nprot.2007.131.
  19. Eng C.L., Lawson M., Zhu Q., Dries R., Koulena N., Takei Y., Yun J., Cronin C., Karp C., Yuan G.C., Cai L. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature. 2019;568:235–239. doi: 10.1038/s41586-019-1049-y.
  20. Garg A., Singhal N., Kumar R., Kumar M. mRNALoc: a novel machine-learning based in-silico tool to predict mRNA subcellular localization. Nucleic Acids Res. 2020;48:W239–W243. doi: 10.1093/nar/gkaa385.
  21. Gudenas B.L., Wang L. Prediction of LncRNA subcellular localization with deep learning from sequence features. Sci. Rep. 2018;8:1–10. doi: 10.1038/s41598-018-34708-w.
  22. Hocine S., Raymond P., Zenklusen D., Chao J.A., Singer R.H. Single-molecule analysis of gene expression using two-color RNA labeling in live yeast. Nat. Methods. 2013;10:119–121. doi: 10.1038/nmeth.2305.
  23. Hughes A.C., Simmonds A.J. Drosophila mRNA localization during later development: past, present, and future. Front. Genet. 2019;10:135. doi: 10.3389/fgene.2019.00135.
  24. Imai K., Nakai K. Prediction of subcellular locations of proteins: where to proceed? Proteomics. 2010;10:3970–3983. doi: 10.1002/pmic.201000274.
  25. Imbert A., Ouyang W., Safieddine A., Coleno E., Zimmer C., Bertrand E., Walter T., Mueller F. FISH-quant v2: a scalable and modular analysis tool for smFISH image analysis. BioRxiv. 2021. doi: 10.1101/2021.07.20.453024.
  26. Jambor H., Surendranath V., Kalinka A.T., Mejstrik P., Saalfeld S., Tomancak P. Systematic imaging reveals features and changing localization of mRNAs in Drosophila development. Elife. 2015;4:e05003. doi: 10.7554/eLife.05003.
  27. Jansen R.P. mRNA localization: message on the move. Nat. Rev. Mol. Cell. Biol. 2001;2:247–256. doi: 10.1038/35067016.
  28. Jiang Z., Wang D., Wu P., Chen Y., Shang H., Wang L., Xie H. Predicting subcellular localization of multisite proteins using differently weighted multi-label k-nearest neighbors sets. Technol. Health Care. 2019;27(Suppl 1):185–193. doi: 10.3233/THC-199018.
  29. Katz Z.B., Wells A.L., Park H.Y., Wu B., Shenoy S.M., Singer R.H. β-Actin mRNA compartmentalization enhances focal adhesion stability and directs cell migration. Genes Dev. 2012;26:1885–1890. doi: 10.1101/gad.190413.112.
  30. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
  31. Katz Z.B., English B.P., Lionnet T., Yoon Y.J., Monnier N., Ovryn B., Bathe M., Singer R.H. Mapping translation 'hot-spots' in live cells by tracking single molecules of mRNA and ribosomes. Elife. 2016;5:e10415. doi: 10.7554/eLife.10415.
  32. Khong A., Matheny T., Jain S., Mitchell S.F., Wheeler J.R., Parker R. The stress granule transcriptome reveals principles of mRNA accumulation in stress granules. Mol. Cell. 2017;68:808–820.e5. doi: 10.1016/j.molcel.2017.10.015.
  33. Kloc M., Zearfoss N.R., Etkin L.D. Mechanisms of subcellular mRNA localization. Cell. 2002;108:533–544. doi: 10.1016/s0092-8674(02)00651-7.
  34. Knaut H., Pelegri F., Bohmann K., Schwartz H., Nüsslein-Volhard C. Zebrafish vasa RNA but not its protein is a component of the germ plasm and segregates asymmetrically before germline specification. J. Cell Biol. 2000;149:875–888. doi: 10.1083/jcb.149.4.875.
  35. La Manno G., Soldatov R., Zeisel A., Braun E., Hochgerner H., Petukhov V., Lidschreiber K., Kastriti M.E., Lönnerberg P., Furlan A. RNA velocity of single cells. Nature. 2018;560:494–498. doi: 10.1038/s41586-018-0414-6.
  36. Lécuyer E., Yoshida H., Parthasarathy N., Alm C., Babak T., Cerovina T., Hughes T.R., Tomancak P., Krause H.M. Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell. 2007;131:174–187. doi: 10.1016/j.cell.2007.08.003.
  37. Li R., Zhu M., Li J., Bienkowski M.S., Foster N.N., Xu H., Ard T., Bowman I., Zhou C., Veldman M.B. Precise segmentation of densely interweaving neuron clusters using G-Cut. Nat. Commun. 2019;10:1549. doi: 10.1038/s41467-019-09515-0.
  38. Lin X., Zheng J. A neuronal morphology classification approach based on locally cumulative connected deep neural networks. Appl. Sci. 2019;9:3876.
  39. Little S.C., Sinsimer K.S., Lee J.J., Wieschaus E.F., Gavis E.R. Independent and coordinate trafficking of single Drosophila germ plasm mRNAs. Nat. Cell Biol. 2015;17:558–568. doi: 10.1038/ncb3143.
  40. Lubeck E., Coskun A.F., Zhyentayev T., Ahmad M., Cai L. Single-cell in situ RNA profiling by sequential hybridization. Nat. Methods. 2014;11:360–361. doi: 10.1038/nmeth.2892.
  41. Macdonald P.M., Struhl G. cis-acting sequences responsible for anterior localization of bicoid mRNA in Drosophila embryos. Nature. 1998;336:595–598. doi: 10.1038/336595a0.
  42. Mardakheh F.K., Paul A., Kümper S., Sadok A., Paterson H., McCarthy A., Yuan Y., Marshall C.J. Global analysis of mRNA, translation, and protein localization: local translation is a key regulator of cell protrusions. Dev. Cell. 2015;35:344–357. doi: 10.1016/j.devcel.2015.10.005.
  43. Martin K.C., Ephrussi A. mRNA localization: gene expression in the spatial dimension. Cell. 2009;136:719. doi: 10.1016/j.cell.2009.01.044.
  44. Mas-Ponte D., Carlevaro-Fita J., Palumbo E., Pulido T.H., Guigo R., Johnson R. LncATLAS database for subcellular localization of long noncoding RNAs. RNA. 2017;23:1080–1087. doi: 10.1261/rna.060814.117.
  45. McQuin C., Goodman A., Chernyshev V., Kamentsky L., Cimini B.A., Karhohs K.W., Doan M., Ding L., Rafelski S.M., Thirstrup D. CellProfiler 3.0: Next-generation image processing for biology. PLoS Biol. 2018;16. doi: 10.1371/journal.pbio.2005970.
  46. Meher P.K., Satpathy S., Rao A.R. miRNALoc: predicting miRNA subcellular localizations based on principal component scores of physico-chemical properties and pseudo compositions of di-nucleotides. Sci. Rep. 2020;10:14557. doi: 10.1038/s41598-020-71381-4.
  47. Mili S., Moissoglu K., Macara I.G. Genome-wide screen reveals APC-associated RNAs enriched in cell protrusions. Nature. 2008;453:115. doi: 10.1038/nature06888.
  48. Moffitt J.R., Zhuang X. RNA imaging with multiplexed error-robust fluorescence in situ hybridization (MERFISH). Methods Enzymol. 2016;572:1–49. doi: 10.1016/bs.mie.2016.03.020.
  49. Moor A.E., Golan M., Massasa E.E., Lemze D., Weizman T., Shenhav R., Baydatch S., Mizrahi O., Winkler R., Golani O. Global mRNA polarization regulates translation efficiency in the intestinal epithelium. Science. 2017;357:1299–1303. doi: 10.1126/science.aan2399.
  50. Moor A.E., Harnik Y., Ben-Moshe S., Massasa E.E., Rozenberg M., Eilam R., Bahar Halpern K., Itzkovitz S. Spatial reconstruction of single enterocytes uncovers broad zonation along the intestinal villus axis. Cell. 2018;175:1156–1167.e15. doi: 10.1016/j.cell.2018.08.063.
  51. Mueller F., Senecal A., Tantale K., Marie-Nelly H., Ly N., Collin O., Basyuk E., Bertrand E., Darzacq X., Zimmer C. FISH-quant: automatic counting of transcripts in 3D FISH images. Nat. Methods. 2013;10:277–278. doi: 10.1038/nmeth.2406.
  52. Newberg J.Y., Murphy R.F. A framework for the automated analysis of subcellular patterns in human protein atlas images. J. Proteome Res. 2008;7:2300–2308. doi: 10.1021/pr7007626.
  53. Padrón A., Iwasaki S., Ingolia N.T. Proximity RNA labeling by APEX-seq reveals the organization of translation initiation complexes and repressive RNA granules. Mol. Cell. 2019;75:875–887.e5. doi: 10.1016/j.molcel.2019.07.030.
  54. Player A.N., Shen L.P., Kenny D., Antao V.P., Kolberg J.A. Single-copy gene detection using branched DNA (bDNA) in situ hybridization. J. Histochem. Cytochem. 2001;49:603–612. doi: 10.1177/002215540104900507.
  55. Raj A., van den Bogaard P., Rifkin S.A., van Oudenaarden A., Tyagi S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods. 2008;5:877–879. doi: 10.1038/nmeth.1253.
  56. Ross C.J., Rom A., Spinrad A., Gelbard-Solodkin D., Degani N., Ulitsky I. Uncovering deeply conserved motif combinations in rapidly evolving noncoding sequences. Genome Biol. 2021;22:29. doi: 10.1186/s13059-020-02247-1.
  57. Samacoits A., Chouaib R., Safieddine A., Traboulsi A.-M., Ouyang W., Zimmer C., Peter M., Bertrand E., Walter T., Mueller F. A computational framework to study sub-cellular RNA localization. Nat. Commun. 2018;9:4584. doi: 10.1038/s41467-018-06868-w.
  58. Savulescu A.F., Brackin R., Bouilhol E., Dartigues B., Warrell J.H., Pimentel M.R., Beaume N., Cortunato I.C., Dallongeville S., Boulle M. Interrogating RNA and protein spatial subcellular distribution in smFISH data with DypFISH. Cell Rep. Methods. 2021:100068. doi: 10.1016/j.crmeth.2021.100068.
  59. Savulescu A.F., Jacobs C., Negishi Y., Davignon L., Mhlanga M.M. Pinpointing cell identity in time and space. Front. Mol. Biosci. 2020;7:209. doi: 10.3389/fmolb.2020.00209.
  60. Shahbabian K., Chartrand P. Control of cytoplasmic mRNA localization. Cell. Mol. Life Sci. 2012;69:535–552. doi: 10.1007/s00018-011-0814-3.
  61. Sharp J.A., Plant J.J., Ohsumi T.K., Borowsky M., Blower M.D. Functional analysis of the microtubule-interacting transcriptome. Mol. Biol. Cell. 2011;22:4312–4323. doi: 10.1091/mbc.E11-07-0629.
  62. Schubert P.J., Dorkenwald S., Januszewski M., Jain V., Kornfeld J. Learning cellular morphology with neural networks. Nat. Commun. 2019;10:2736. doi: 10.1038/s41467-019-10836-3.
  63. Shefi O., Golding I., Segev R., Ben-Jacob E., Ayali A. Morphological characterization of in vitro neuronal networks. Phys. Rev. E. 2002;66:021905. doi: 10.1103/PhysRevE.66.021905.
  64. Singh J., Hanson J., Paliwal K., Zhou Y. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat. Commun. 2019;10:5407. doi: 10.1038/s41467-019-13395-9.
  65. Su Z.-D., Huang Y., Zhang Z.-Y., Zhao Y.-W., Wang D., Chen W., Chou K.-C., Lin H. iLoc-lncRNA: predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC. Bioinformatics. 2018;34:4196–4204. doi: 10.1093/bioinformatics/bty508.
  66. Sullivan D.P., Winsnes C.F., Åkesson L., Hjelmare M., Wiking M., Schutten R., Campbell L., Leifsson H., Rhodes S., Nordgren A. Deep learning is combined with massive-scale citizen science to improve large-scale image classification. Nat. Biotechnol. 2018;36:820. doi: 10.1038/nbt.4225.
  67. Suter B. RNA localization and transport. Biochim. Biophys. Acta Gene Regul. Mech. 2018;1861:938–951. doi: 10.1016/j.bbagrm.2018.08.004.
  68. Sweeney B.A., Hoksza D., Nawrocki E.P., Ribas C.E., Madeira F., Cannone J.J., Gutell R., Maddala A., Meade C.D., Williams L.D. R2DT is a framework for predicting and visualising RNA secondary structure using templates. Nat. Commun. 2021;12:3494. doi: 10.1038/s41467-021-23555-5.
  69. Tautz D., Pfeifle C. A non-radioactive in situ hybridization method for the localization of specific RNAs in Drosophila embryos reveals translational control of the segmentation gene hunchback. Chromosoma. 1989;98:81–85. doi: 10.1007/BF00291041.
  70. Trcek T., Grosch M., York A., Shroff H., Lionnet T., Lehmann R. Drosophila germ granules are structured and contain homotypic mRNA clusters. Nat. Commun. 2015;6:7962. doi: 10.1038/ncomms8962.
  71. Tzingounis A.V., Nicoll R.A. Arc/Arg3.1: linking gene expression to synaptic plasticity and memory. Neuron. 2006;52:403. doi: 10.1016/j.neuron.2006.10.016.
  72. Wan S., Mak M.-W. De Gruyter; 2015. Machine Learning for Protein Subcellular Localization Prediction.
  73. Wang X., Allen W.E., Wright M.A., Sylwestrak E.L., Samusik N., Vesuna S., Evans K., Liu C., Ramakrishnan C., Liu J. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science. 2018;361:eaat5691. doi: 10.1126/science.aat5691.
  74. Wang F., Flanagan J., Su N., Wang L-C., Bui S., Nielson A., Wu X., Vo H-T., Ma X-J., Luo Y. RNAscope: a novel in situ RNA analysis platform for formalin-fixed, paraffin-embedded tissues. J. Mol. Diagn. 2012;14:22–29. doi: 10.1016/j.jmoldx.2011.08.002.
  75. Weis B.L., Schleiff E., Zerges W. Protein targeting to subcellular organelles via mRNA localization. Biochim. Biophys. Acta. 2013;1833:260–273. doi: 10.1016/j.bbamcr.2012.04.004.
  76. Wilbertz J.H., Voigt F., Horvathova I., Roth G., Zhan Y., Chao J.A. Single-molecule imaging of mRNA localization and regulation during the integrated stress response. Mol. Cell. 2019;73:946–958.e7. doi: 10.1016/j.molcel.2018.12.006.
  77. Xijie Lu A., Moses A.M. An unsupervised kNN method to systematically detect changes in protein localization in high-throughput microscopy images. PLoS One. 2016;11:e0158712. doi: 10.1371/journal.pone.0158712.
  78. Yan K., Arfat Y., Li D., Zhao F., Chen Z., Yin C., Sun Y., Hu L., Yang T., Qian A. Structure prediction: new insights into decrypting long noncoding RNAs. Int. J. Mol. Sci. 2016;17:132. doi: 10.3390/ijms17010132.
  79. Yan Z., Lécuyer E., Blanchette M. Prediction of mRNA subcellular localization using deep recurrent neural networks. Bioinformatics. 2019;35:i333–i342. doi: 10.1093/bioinformatics/btz337.
  80. Yang Y., Fu X., Qu W., Xiao Y., Shen H.B. MiRGOFS: a GO-based functional similarity measurement for miRNAs, with applications to the prediction of miRNA subcellular localization and miRNA–disease association. Bioinformatics. 2018;34:3547–3556. doi: 10.1093/bioinformatics/bty343.
  81. Yasuda Y., Clatterbuck-Soper S.F., Jackrel M.E., Shorter J., Mili S. FUS inclusions disrupt RNA localization by sequestering kinesin-1 and inhibiting microtubule detyrosination. J. Cell Biol. 2017;216:1015–1034. doi: 10.1083/jcb.201608022.
  82. Zhang B., Gunawardane L., Niazi F., Jahanbani F., Chen X., Valadkhan S. A novel RNA motif mediates the strict nuclear localization of a long noncoding RNA. Mol. Cell. Biol. 2014;34:2318–2329. doi: 10.1128/MCB.01673-13.
  83. Zhang T., Tan P., Wang L., Jin N., Li Y., Zhang L., Yang H., Hu Z., Zhang L., Hu C. RNALocate: a resource for RNA subcellular localizations. Nucleic Acids Res. 2016;45:D135–D138. doi: 10.1093/nar/gkw728.
  84. Zappulo A., van den Bruck D., Ciolli Mattioli C., Franke V., Imami K., McShane E., Moreno-Estelles M., Calviello L., Filipchyk A., Peguero-Sanchez E. RNA localization is a key determinant of neurite-enriched proteome. Nat. Commun. 2017;8:583. doi: 10.1038/s41467-017-00690-6.

Articles from iScience are provided here courtesy of Elsevier
