The Scientific World Journal. 2012 Dec 23;2012:541786. doi: 10.1100/2012/541786

Molecular Mechanisms and Function Prediction of Long Noncoding RNA

Handong Ma 1, Yun Hao 1, Xinran Dong 1, Qingtian Gong 1, Jingqi Chen 1, Jifeng Zhang 1, Weidong Tian 1,*
PMCID: PMC3540756  PMID: 23319885

Abstract

The central dogma of gene expression considers RNA as the carrier of genetic information from DNA to protein. However, it has become more and more clear that RNA plays more important roles than simply being the information carrier. Recently, whole genome transcriptomic analyses have identified large numbers of dynamically expressed long noncoding RNAs (lncRNAs), many of which are involved in a variety of biological functions. Even so, the functions and molecular mechanisms of most lncRNAs still remain elusive. Therefore, it is necessary to develop computational methods to predict the function of lncRNAs in order to accelerate the study of lncRNAs. Here, we review the recent progress in the identification of lncRNAs, the molecular functions and mechanisms of lncRNAs, and the computational methods for predicting the function of lncRNAs.

1. Introduction

Proteins and their protein-coding genes have been the main subject of biological studies for years. However, with the development of RNA sequencing technology and computational methods for assembling the transcriptome, it has become clear that, besides protein-coding genes, much of the mammalian genome is transcribed, and many noncoding RNA (ncRNA) transcripts tend to play important roles in a variety of biological processes. Understanding the function of ncRNAs has become one of the most important goals of modern biological studies [1–3]. ncRNAs can be classified into several distinct subclasses, including processed small RNAs [4], promoter-associated RNAs [5], and functional long noncoding RNAs (lncRNAs) [6]. The term lncRNA was introduced to distinguish this class of ncRNA from the well-known small regulatory RNAs (i.e., miRNAs and siRNAs). lncRNAs are generally longer than 200 nucleotides [3, 7, 8]. Recent studies have shown that lncRNAs may act as important cis- or trans-regulators in various biological processes. Mutations in lncRNAs are associated with a wide range of diseases, especially cancers and neurodegenerative diseases. Even so, the functions and molecular mechanisms of most lncRNAs are unknown. Though several computational methods have been developed to predict the functions of lncRNAs, this remains a challenging task, partly owing to the lack of conservation in both the sequences and the secondary structures of lncRNAs [9–11]. In this paper, we summarize recent progress and challenges in the identification, molecular mechanism, and function prediction of lncRNAs.

2. Definition and Classification of lncRNA

The definition of lncRNA is based on two criteria: size and the lack of protein-coding potential. In this paper, lncRNA refers to nonprotein-coding RNA longer than 200 nt [7, 10–12], which distinguishes lncRNAs from mRNA and small regulatory RNA in a relatively satisfying way [11, 13]. Depending on their relationships with the nearest protein-coding genes, lncRNAs can be classified in three different ways [12, 14, 15]: (1) sense or antisense: lncRNAs that are located on the same strand as, or the opposite strand to, the nearest protein-coding gene [16]; (2) divergent or convergent: lncRNAs that are transcribed in the divergent or convergent orientation relative to the nearest protein-coding gene [12]; (3) intronic or intergenic: lncRNAs that are located inside the introns of a protein-coding gene, or in the regions between two protein-coding genes [12, 17].
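The three positional classifications above can be sketched in code. The following Python example is a minimal illustration, not a published tool: the `Gene` record, the function names, the 0-based half-open coordinates, and the extra "overlapping" label for lncRNAs that overlap a gene without lying inside an intron are all assumptions made for this sketch. Divergent and convergent are interpreted here as opposite-strand, non-overlapping pairs that are transcribed away from or toward each other, respectively.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical protein-coding gene record (0-based, half-open coordinates)."""
    start: int
    end: int
    strand: str  # "+" or "-"
    introns: Tuple[Tuple[int, int], ...] = ()


def gap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Distance between two intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)


def classify_lncrna(start: int, end: int, strand: str,
                    genes: List[Gene]) -> Tuple[str, Optional[str], str]:
    """Classify a lncRNA relative to its nearest protein-coding gene.

    Returns (orientation, direction, location):
      orientation: "sense" or "antisense"
      direction:   "divergent", "convergent", or None when not applicable
      location:    "intronic", "intergenic", or "overlapping"
    """
    nearest = min(genes, key=lambda g: gap(start, end, g.start, g.end))
    distance = gap(start, end, nearest.start, nearest.end)

    orientation = "sense" if strand == nearest.strand else "antisense"

    # Direction is only defined here for non-overlapping, opposite-strand pairs.
    direction = None
    if orientation == "antisense" and distance > 0:
        minus_start = start if strand == "-" else nearest.start
        plus_start = nearest.start if strand == "-" else start
        # Minus-strand transcript to the left of the plus-strand one:
        # the two are transcribed away from each other.
        direction = "divergent" if minus_start < plus_start else "convergent"

    if any(i_start <= start and end <= i_end for i_start, i_end in nearest.introns):
        location = "intronic"
    elif distance > 0:
        location = "intergenic"
    else:
        location = "overlapping"  # assumption: not a category named in the text

    return orientation, direction, location
```

For example, with a single plus-strand gene spanning 1000 to 5000 and one intron at 2000 to 3000, a minus-strand transcript at 2100 to 2600 is reported as antisense and intronic, while one at 200 to 700 is antisense, divergent, and intergenic.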

3. Identification of lncRNA

To identify lncRNAs, the first step is to obtain all transcripts including ncRNAs and mRNAs in cells, and then to distinguish lncRNAs from mRNAs and other types of ncRNAs. Traditional technologies, such as microarray, focus on the identification of protein-coding RNA transcripts. New technologies, such as RNA-Seq, are not limited to the identification of protein-coding RNA transcripts, and have led to the discovery of many novel ncRNA transcripts. The discrimination between lncRNAs and other small regulatory ncRNAs depends on their length. However, the length information alone is not enough to separate lncRNAs from mRNAs, and other criteria are needed for this purpose. Below, we will first briefly introduce new technologies in identifying RNA transcripts, especially ncRNA transcripts. Then, we will review current methods to distinguish lncRNAs from mRNAs.

3.1. Experimental Methods in Identifying lncRNA

Microarray —

Traditional microarray technologies use predefined probes to determine the expression level of mRNA transcripts and are not well suited to identifying lncRNAs. However, it has been found that a few previously defined mRNAs or probe sequences actually correspond to lncRNAs; thus, existing microarray datasets can be reannotated to study the expression of lncRNAs [60]. With more and more lncRNAs discovered, new probes specific for lncRNAs can be designed. For example, Babak et al. designed probes from conserved intergenic and intragenic regions to identify potential ncRNA transcripts [61]. However, microarrays are not sensitive enough to detect RNA transcripts expressed at low levels. Thus, the use of microarrays to identify lncRNAs is limited by the low expression level of many lncRNAs.

SAGE and EST —

SAGE (serial analysis of gene expression) technology produces large numbers of short sequence tags and is capable of identifying both known and unknown transcripts. SAGE has proved to be an efficient approach for studying lncRNAs. For example, Gibb et al. compiled 272 human SAGE libraries. By analyzing over 24 million tags, they were able to generate lncRNA expression profiles in human normal and cancer tissues [62]. Lee et al. also used SAGE to identify potential lncRNA candidates in male germ cells [63]. However, SAGE is much more expensive than microarray and is therefore not widely employed in large-scale studies. An EST (expressed sequence tag) is a short subsequence of cDNA generated by one-shot sequencing of a cDNA clone. The public database now contains over 72.6 million ESTs (GenBank, 2011), making it possible to discover novel transcripts. For example, Furuno et al. clustered ESTs to find functional and novel lncRNAs in mammals [64]. Huang et al. used the public bovine-specific EST database to reconstruct transcript assemblies and found transcripts in intergenic regions that are likely to be lncRNAs [65].

RNA-Seq —

With the development of next generation sequencing (NGS) technologies, RNA-Seq (also named whole transcriptome shotgun sequencing) has been widely used for novel transcript discovery and gene expression analysis. Compared to traditional microarray technology, RNA-Seq has many advantages in studying gene expression. It is more sensitive in detecting less-abundant transcripts and in identifying novel alternative splicing isoforms and novel ncRNA transcripts. The basic workflow for lncRNA identification using RNA-Seq is shown in Figure 1. RNA-Seq is currently the most widely used technology for identifying lncRNAs. For example, Li et al. applied RNA-Seq to identify lncRNAs during chicken muscle development [66]. Nam and Bartel integrated RNA-Seq, poly(A)-site, and ribosome mapping information to obtain lncRNAs in C. elegans [16]. Pauli et al. performed RNA-Seq experiments at eight stages during early zebrafish development and identified 1133 noncoding multiexonic transcripts [67]. Prensner et al. used RNA-Seq to study lncRNAs in human prostate cancer from 102 prostate tissues and cell lines, and concluded that lncRNAs may be used for cancer subtype classification [68].

Figure 1. Workflow of lncRNA identification from RNA-Seq.

RNA-IP —

RNA-IP (RNA-immunoprecipitation) is a new method developed to identify lncRNAs that interact with a specific protein. Antibodies against the protein are first used to isolate lncRNA-protein complexes. Then, a cDNA library is constructed, followed by deep sequencing of the interacting lncRNAs. Using RNA-IP, Zhao et al. discovered a 1.6-kb lncRNA within Xist that interacts with PRC2 [69].

Chromatin Signature-Based Approach —

The above-mentioned methods target RNA transcripts directly. In contrast, the chromatin signature-based approach uses chromatin signatures, such as H3K4me3 (the marker of active promoters) and H3K36me3 (the marker of transcribed regions), to study actively transcribed genes including lncRNAs. In this approach, ChIP-Seq is used to generate genome-wide profiles of chromatin signatures [70], and the transcribed regions are mapped in the genome, where lncRNAs are determined and studied. For example, Guttman et al. identified 1,600 large multiexonic lncRNAs that are regulated by key transcription factors such as p53 and NF-κB [71]. The advantage of this approach is its directness in investigating the mechanisms that regulate lncRNA expression.

3.2. Computational Methods in Identifying lncRNA

ORF Length Strategy —

Unlike in protein-coding genes, the start codons and termination codons in lncRNAs tend to be distributed randomly. As a result, the ORFs of lncRNAs are unlikely, from a probabilistic point of view, to extend beyond 100 codons. Based on this principle, one way to discriminate lncRNAs from mRNAs is by ORF length. For example, the FANTOM project used a maximum ORF length cutoff of 100 codons to differentiate noncoding RNAs from mRNAs [72]. However, some lncRNAs are known to have ORFs longer than 100 codons, while some protein-coding genes encode proteins of fewer than 100 amino acids, such as the RCI2A gene in Arabidopsis, which encodes a protein of 54 amino acids [73]. Thus, this approach may cause misclassification. To overcome the drawbacks of methods based on ORF length, Jia et al. utilized a comparative genomics method to refine ncRNA candidates. They defined RNA sequences as ncRNAs only if the cDNAs have no homologous proteins longer than 30 amino acids across the mammalian genomes [7]. However, this method relies largely on the completeness of the databases. Therefore, deficiencies in protein-coding annotation may also cause misclassification of lncRNAs.
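The ORF length criterion can be illustrated with a short Python sketch. It is not the FANTOM pipeline: the function names are hypothetical, only the three forward reading frames of the transcript are scanned, only ORFs that run from an ATG to an in-frame stop codon are counted, and treating a longest ORF of fewer than 100 codons as putatively noncoding is an assumption about how the cutoff is applied at the boundary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Length, in codons (stop codon excluded), of the longest ATG-to-stop ORF
    found in the three forward reading frames of a transcript sequence."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        run = None  # codons counted since the current ATG; None when outside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if run is None:
                if codon == "ATG":
                    run = 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                run = None
            else:
                run += 1
    return best


def is_putative_noncoding(seq: str, max_codons: int = 100) -> bool:
    """Apply an ORF length cutoff: True if the longest ORF is shorter than
    max_codons. The default of 100 follows the cutoff quoted in the text."""
    return longest_orf_codons(seq) < max_codons
```

As the paragraph above notes, a cutoff of this kind will misclassify lncRNAs with long chance ORFs and mRNAs that encode short proteins, so a result from such a filter is best treated as one piece of evidence among several.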

Sequence and Secondary Structure Conservation Strategy —

Compared to protein-coding genes, noncoding genes are generally less conserved, meaning that they accumulate mutations more readily [21, 67]. Thus, measuring the coding potential is considered one way of identifying lncRNAs. Codon Substitution Frequency (CSF) is one such criterion. For example, Guttman et al. used the maximum CSF score to assess the coding potential of an RNA sequence [71]. Clamp et al. and Lin et al. further combined CSF with reading frame conservation (RFC) to discriminate lncRNAs from mRNAs [74, 75]. Similar methods include PhyloCSF, which uses a phylogenetic framework to build two phylogenetic codon models that can distinguish coding from noncoding regions [76], and RNAcode, which combines amino acid substitution with gap patterns to assess the coding potential [77]. There are also methods that explore the conservation of RNA secondary structures to identify lncRNAs, including the programs QRNA [78], RNAz [79], and EvoFOLD [80]. However, this approach is limited by the lack of common conserved secondary structures specific to lncRNAs.

Machine Learning Strategies —

Owing to the complex identities of lncRNAs, an increasing number of machine learning-based methods have recently been developed that integrate various sources of data to distinguish lncRNAs from mRNAs. Table 1 summarizes the machine learning methods and the features used to train the models for identifying lncRNAs. For instance, CONC utilizes a series of protein features, such as amino acid composition, secondary structure, and peptide length, to train an SVM model that distinguishes lncRNAs from mRNAs [18]. CPC (Coding Potential Calculator) also uses an SVM, combining sequence features with comparative genomics features to assess the coding potential of transcripts [19, 20]. Lu et al. developed a machine learning method that integrates GC content, DNA conservation, and expression information to predict lncRNAs in C. elegans [21].

Although the above-described methods have shown their effectiveness in identifying lncRNAs, exceptional cases still remain. For instance, whether an RNA transcript is translated or not may change during the course of evolution. As an example, Xist, a well-known lncRNA, evolved from a protein-coding gene [81]. Besides, some genes are bifunctional, with both coding and noncoding isoforms. The steroid receptor RNA activator (SRA) was previously characterized as a noncoding RNA, but its coding product was detected later [82]. Such ambiguity will be clarified when more is known about lncRNAs.

Table 1.

Machine-learning methods for identifying lncRNAs.

Method | Features | Algorithm | References
CONC | Peptide length; amino acid composition; hydrophobicity; secondary structure content; percentage of residues exposed to solvent; sequence compositional entropy; number of homologs obtained by PSI-BLAST; alignment entropy | SVM | [18]
CPC | ORF prediction quality; number of homologs obtained by BLASTX; alignment quality; segment distribution | SVM | [19, 20]
Lu et al. | RNA-seq experiments; tiling arrays; poly-A+ RNA-seq experiments; poly-A+ tiling arrays; GC content; DNA conservation; predicted protein sequence conservation; predicted secondary structure free energy; predicted secondary structure conservation | Naïve Bayes; Bayes Net; Decision Tree; Random Forest; Logistic Regression; SVM | [21]

4. lncRNA Function

lncRNAs were once thought of as the “dark matter” of the genome because of our limited knowledge about their functions [83]. As more studies of lncRNAs have been conducted, it has become clear that lncRNAs have many specific functional features and are likely to be involved in many diverse biological processes in cells. Rather than “dark matter,” they may act as necessary functional parts of the genome. These functional features include, but are not limited to, the following: (i) lncRNAs have conserved splice junctions and introns [84]; (ii) the expression patterns of lncRNAs are tissue- and cell-specific [12, 67]; (iii) altered expression of lncRNAs can be found in neurodegeneration, cancer, and other diseases [9, 10]; (iv) lncRNAs are associated with particular chromatin signatures that are indicative of actively transcribed genes [11, 85]. Below, we briefly summarize the cellular functions of lncRNAs and the molecular mechanisms of their functions.

4.1. Cellular Functions of lncRNA

With thousands of lncRNAs identified in mammals and other vertebrates [16], a few lncRNAs have been extensively studied, and these studies have shed light on their possible functions. Firstly, lncRNAs such as Xist, Air, and Kcnq1ot1 are involved in various forms of epigenetic regulation through recruitment of chromatin remodeling complexes to specific genomic loci [22, 43]. Secondly, lncRNAs can regulate gene expression by interacting with protein partners in biological processes like protein synthesis, imprinting (Kcnq1ot1, Air), cell cycle control (TERRA), alternative splicing (MALAT1), and chromatin structure regulation (DNMT3b, PANDA) [9, 10, 38, 71, 85–89]. Thirdly, lncRNAs are involved in enhancer-regulated gene activation (eRNAs), in which case they may interact directly with distal genomic regions [90]. Fourthly, some lncRNAs serve as interacting partners or precursors for short regulatory ncRNAs [91]. For example, microRNAs (miRNAs) can be generated through sequential cleavage of lncRNAs, while Piwi-interacting RNAs (piRNAs) can be produced by processing a single lncRNA transcript [88].

Recent studies have shown that the expression of lncRNAs is tissue specific. Loewer et al. studied the expression of lncRNAs during the global remodeling of the epigenome that accompanies reprogramming of somatic cells to induced pluripotent stem cells (iPSCs). They found that some lncRNAs have cell-type-specific expression patterns [26, 92]. Loss-of-function studies on most intergenic lncRNAs expressed in mouse embryonic stem (ES) cells revealed that knockdown of intergenic lncRNAs has major consequences on gene expression patterns, comparable to the effects of knocking down well-known ES cell regulators [93]. This indicates that lncRNAs might play important roles in regulating developmental processes. The ENCODE project analyzed the tissue-specific expression of lncRNAs in 31 cell types and found that many lncRNAs have brain-specific expression patterns [9, 12]. There is increasing evidence linking dysregulation of lncRNAs to diverse human diseases ranging from neuronal diseases to cancer [9, 10], suggesting that the involvement of lncRNAs in human diseases may be far more prevalent than previously thought [94].

4.2. Molecular Mechanisms of lncRNA

The precise mechanism by which lncRNAs function still remains largely unknown. Currently, there are several hypotheses, including (1) RNA:DNA:DNA triplex (trans-); (2) RNA:DNA hybrid; (3) RNA:RNA hybrid of lncRNA with a nascent transcript; (4) RNA-protein interaction (cis-/trans-). Although only (1), (2), and (4) have been experimentally demonstrated so far [14], it is generally thought that lncRNAs may function through interaction with their partners, such as DNA, RNA, or protein, and serve the following roles: signal, decoy, scaffold, and guide [11, 14]. Table 2 lists lncRNAs that use different mechanisms when carrying out their functions. Below, we give examples for the above-mentioned mechanisms.

Table 2.

Function classification of lncRNAs.

Archetype | lncRNA name | Length | Target | Function | cis-/trans- | References
Signal | KCNQ1ot1, Air, Xist | 91 kb, 108 kb, ~17 kb | G9a, PRC, YY1 | Transcriptional silencing of multiple genes; X inactivation (XCI) | cis- | [11, 14, 22, 23]
Signal | HOTAIR, Frigidair, HOTTIP | 2.2 kb, N.A., 3.7 kb | LSD1-CoREST | Signals of anatomic position | trans- | [6, 11, 14]
Signal | lincRNA-p21, PANDA | 3 kb; 1.5 kb | hnRNP-K | p53 targets in response to DNA damage | trans- | [14, 24, 25]
Signal | lincRNA-RoR | 2.6 kb | Oct4, Sox2, Nanog | Pluripotency-associated | N.A.(b) | [11, 26]
Signal | COOLAIR, COLDAIR | Multiple spliced: 400 bp/750 bp; ~1.1 kb | FLC, PRC2 | Combinatorial transcriptional regulation | N.A. | [27, 28]
Signal | eRNA | Various sizes | MLL-WDR5, TFs(a) | Promotes mRNA synthesis | cis- | [29, 30]
Signal | Gas5 | ~7 kb | Glucocorticoid receptor | Represses the glucocorticoid receptor | N.A. | [31]
Signal | 1/2-sbsRNAs | N.A.(c) | SMD | Formation of STAU1 binding sites | N.A. | [32]
Decoys | DHFR-Minor | 7.3, 5.0, 1.4, and 0.8 kb | TFIIB | Inhibits assembly of the preinitiation complex | N.A. | [33]
Decoys | TERRA | Various sizes | Telomerase | Regulation and protection of chromosome ends | N.A. | [34]
Decoys | PANDA | 1.5 kb | NF-YA | Inhibits expression of apoptotic genes | trans- | [35]
Decoys | PTENP1 | ~3.9 kb | PTEN | Sequestration of miRNAs | N.A. | [36, 37]
Decoys | MALAT1 | ~7 kb | SR splicing factors | Alters pattern of alternative splicing | N.A. | [38, 39]
Guides | Xist | ~17 kb | PRC2, YY1 | Inactivates X chromosome | cis- | [14, 40–42]
Guides | Air, COLDAIR | 108 kb | G9a, PRC2 | Silences transcription, affects histone acetylation and methylation states | cis- | [28, 43, 44]
Guides | HOTTIP | ~3.8 kb | MLL-WDR5 | Chromosomal looping, chromatin modifications | cis- (looping) | [11, 45]
Guides | HOTAIR | 2.2 kb | LSD1-CoREST | Alters and regulates epigenetic states | trans- | [14, 46, 47]
Guides | Jpx | Multiple isoforms | polycomb complex(a) | Activation of Xist RNA on the inactive X | trans- | [11, 48]
Guides | lincRNA-p21 | 3 kb | hnRNP-K(a) | p53 targets in response to DNA damage | trans- | [11, 24]
Scaffold | TERC | Various sizes | TERT | Telomerase catalytic activity | trans- | [49, 50]
Scaffold | HOTAIR | 2.2 kb | PRC2, LSD1, CoREST, REST | Demethylates histone H3 on K4 to antagonize gene activation | trans- | [46, 51]
Scaffold | ANRIL | Multiple spliced: 3.9 kb/34.8 kb | PRC1, PRC2 | Contributes to the functions of both PRC1 and PRC2 proteins | trans- | [52, 53]
Scaffold | Alpha Satellite Repeat LncRNA | N.A. | SUMO-HP1 | Molecular scaffold for the targeting and local accumulation of HP1 | N.A. | [11, 54]

(a) Not yet understood.
(b) Not clearly referred to as cis-action.
(c) No length data available in any of the six databases listed in Table 3.

Signal —

Some lncRNAs have been reported to respond to diverse stimuli, hinting that they may act as molecular signals [12, 24, 25, 27, 35]. For example, lncRNAs can act as markers for imprinting (Air and Kcnq1ot1), X inactivation (Xist), and silencing (COOLAIR). ChIP-Seq studies showed that gene-activating enhancers produce lncRNA transcripts (eRNAs) [29, 95], and that their expression level correlates positively with that of nearby genes, indicating a possible role in regulating mRNA synthesis. This is supported by a recent loss-of-function study, which found that the knockdown of 7 out of 12 lncRNAs affects the expression of their cognate neighboring genes [8].

Decoy —

lncRNAs can function as molecular decoys that negatively regulate an effector. Gas5 contains a hairpin sequence motif that resembles the DNA-binding site of the glucocorticoid receptor [31]. It can serve as a decoy that releases the receptor from DNA, preventing transcription of metabolic genes [14]. Another example is the telomeric repeat-containing RNA (TERRA), which interacts with the telomerase protein through a repeat sequence complementary to the template sequence of telomerase RNA [11, 34].

Guide —

Upon interaction with its target molecule, a lncRNA may be able to guide the target to the proper position, either in cis (on neighboring genes) or in trans (on distantly located genes). The newly found eRNAs appear to exert their effects in cis by binding to specific enhancers and actively engaging in the regulation of mRNA synthesis [11, 29]. HOTAIR and HOTTIP are transcribed within the human HOX clusters and serve as signals of anatomic position, being expressed in cells that have distal and posterior positional identities; both require their interacting partners to be properly localized to the site of action [6]. In this process, chromosomal looping of the 5′ end of HOXA brings HOTTIP into spatial proximity with multiple HOXA genes, enforcing the maintenance of H3K4me3 and gene activation [14]. This long-range gene activation mechanism suggests that chromosome looping plays a central role in delivering lncRNA to its site of action [11, 45].

Scaffold —

Recent studies found that several lncRNAs have the capacity to bind two or more protein partners, with the lncRNAs serving as adaptors that assemble functional protein complexes. The telomerase RNA component TERC is a classic example of an RNA scaffold and is essential for telomerase function. HOTAIR binds the polycomb complex PRC2 to exert its “signal” function. A recent study found that the 3′ 700 nt of HOTAIR also interact with a second complex consisting of LSD1, CoREST, and REST to antagonize gene activation, further emphasizing its important role as the scaffold of the functional complex [11, 51].

Cis- and Trans-Action of lncRNAs —

lncRNAs can be classified as cis- or trans-regulators depending on whether they exert their function on a neighboring gene on the same allele from which they are transcribed [96]. It was once considered that many lncRNAs act as cis-regulators, as the expression of lncRNAs is significantly correlated with that of their neighboring protein-coding genes [97, 98]. However, recent studies have suggested that the positive correlation between lncRNAs and their neighboring genes may instead be due to shared upstream regulation (such as lincRNA-p21 [24] and lincRNA-Sox2 [6]), positional correlation (such as HOTAIR [6]), transcriptional “ripple effects” [98], or indirect regulation of neighboring genes, rather than to cis-regulation. This is supported by the finding that knockdown of a number of lncRNAs had little effect on the expression of neighboring genes [96]. In general, it has been accepted that some lncRNAs are cis-regulators [99, 100], while the vast majority may function as trans-regulators [6, 11, 93]. Recently, some cis-regulating lncRNAs were found to have the capacity to act in trans [33, 101, 102], highlighting the complexity of lncRNAs.

Although substantial research progress has been made since the discovery of lncRNAs, understanding their functions remains a challenge. One reason is that, unlike protein-coding genes, whose mutations may result in obvious and severe phenotypes, mutations in lncRNAs often do not cause significant phenotypes [85]. It is likely that lncRNAs function at specific stages of development or under specific conditions, and thus condition-specific studies of lncRNA phenotypes may be necessary. As more omics data about lncRNAs accumulate, computational prediction of lncRNA function can help to design experiments and accelerate the understanding of lncRNAs.

5. lncRNA Database

The current lncRNA databases are summarized in Table 3. lncRNAdb is an integrated database specific for lncRNAs, covering annotation, sequence, structure, species, and function information for lncRNAs [55]. NONCODE is a database of ncRNAs that have been experimentally confirmed. It covers almost all published 73,272 lncRNAs in human and mouse; it also includes expression profiles of lncRNAs and their potential functions predicted from the coding-noncoding coexpression network (see below) [56]. LNCipedia is another integrated lncRNA database, which includes 21,488 annotated human lncRNAs. It provides information on the coding potential, secondary structure, and microRNA binding sites of lncRNAs [57]. fRNAdb and NRED are databases for ncRNAs including lncRNAs [58, 59]. These databases greatly facilitate further analysis and applications of lncRNAs.

Table 3.

List of lncRNA databases.

Tools | Source | Description | Reference
lncRNAdb | http://www.lncrnadb.org/ | Contains a comprehensive list of lncRNAs in eukaryotes, and mRNAs with regulatory roles | [55]
NONCODE | http://noncode.org/ | Integrative annotation of noncoding RNA (73,372 lncRNAs) | [56]
LNCipedia | http://www.lncipedia.org/ | 21,488 annotated human lncRNA transcripts with secondary structure information, protein coding potential, and microRNA binding sites | [57]
fRNAdb | http://www.ncrna.org/frnadb/ | A large collection of noncoding transcripts including annotated/unannotated sequences from H-inv database, NONCODE, and RNAdb | [58]
NRED | http://jsm-research.imb.uq.edu.au/nred/cgi-bin/ncrnadb.pl/ | Noncoding RNA Expression Database | [59]

6. Function Prediction of lncRNA

Computational prediction of lncRNA functions is still at an early stage of development. Unlike protein-coding genes, whose sequence motifs are indicative of their function, lncRNA sequences are usually not conserved and do not contain conserved sequence motifs [103, 104]. The secondary structures of lncRNAs are also not conserved [105]. Thus, it is difficult to infer the function of lncRNAs from their sequences or secondary structures alone. Since current knowledge suggests that lncRNAs function by regulating or interacting with their partner molecules, current methods focus on exploring the relationships between lncRNAs and protein-coding genes or miRNAs. Below, we describe several current approaches for predicting the functions of lncRNAs.

6.1. Comparative Genomics Approach

Although most lncRNAs are not conserved, some lncRNAs are conserved across species, which suggests that they have essential functions. Amit et al. identified 78 lncRNA transcripts conserved in both human and mouse, and found that 70 are either located within or close to (<1000 nt distance) a coding gene that is also conserved in the two genomes [106]. They assumed that these lncRNAs might have close functional relationships with the nearby coding genes. However, this approach is limited by the poor conservation of lncRNAs and cannot be applied at genome scale.

6.2. Coexpression with Coding Genes Approach

Many of the lncRNAs studied so far play important regulatory roles, and it is likely that lncRNAs regulating a specific biological process are coexpressed with the genes involved in the same process. Thus, identifying coding genes that are coexpressed with lncRNAs may help to infer the function of lncRNAs. Based on this assumption, Guttman et al. developed a coexpression-based method to predict lncRNA functions at genome scale [71]. For each lncRNA, they ranked coding genes by their level of coexpression with the lncRNA, and then performed a Gene Set Enrichment Analysis (GSEA) on the top-ranked genes to identify enriched functional terms corresponding to the lncRNA. Out of 150 lncRNAs subjected to experimental validation, 85 exhibited the predicted functions, supporting the effectiveness of inferring the function of lncRNAs from their coexpressed coding genes. According to their predictions, lncRNAs participate in a rather wide range of biological processes such as cell proliferation, development, and immune surveillance. Pauli et al. employed a similar approach to predict the function of lncRNAs during zebrafish embryogenesis [67].
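The ranking step of this coexpression approach can be sketched in Python as follows. This is an illustration only: the use of the Pearson correlation coefficient as the coexpression measure, the function names, and the dictionary-based input format are assumptions of this sketch, and the subsequent enrichment analysis (GSEA) is not implemented here.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression profiles.
    Returns 0.0 when either profile is constant (correlation undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_coexpressed(lnc_profile: Sequence[float],
                     coding_profiles: Dict[str, Sequence[float]]
                     ) -> List[Tuple[str, float]]:
    """Rank coding genes by correlation with one lncRNA across the same
    samples, strongest positive coexpression first."""
    scores = {gene: pearson(lnc_profile, profile)
              for gene, profile in coding_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The ranked list produced by `rank_coexpressed` is the kind of input that an enrichment analysis would then consume to propose functional terms for the lncRNA.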

Liao et al. extended the coexpression idea by constructing a coding-noncoding (CNC) gene coexpression network [107]. In contrast to the GSEA method, which collects the coding genes coexpressed with each lncRNA, the CNC method considers not only the coexpression between lncRNAs and coding genes, but also the coexpression within the lncRNA group and within the coding gene group. When predicting the function of lncRNAs, the CNC method employs two different approaches: the hub-based and the network-module-based. In the hub-based approach, functions are assigned to each lncRNA according to the functional enrichment of its neighboring genes. In the network-module-based approach, the Markov cluster algorithm (MCL) is used to identify coexpressed functional modules in the CNC network; the functions of each module are then transferred to the lncRNAs inside the module. Liao et al. applied the CNC method to annotate the functions of 340 mouse lncRNAs, and found that these lncRNAs function mainly in organ or tissue development, cellular transport, and metabolic processes.
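A hub-based annotation step of this kind might look like the Python sketch below. The text does not specify which enrichment statistic the CNC method uses, so the one-sided hypergeometric test here is an assumed, commonly used choice; the function names, the input dictionaries, and the significance threshold are likewise hypothetical, and no multiple-testing correction is applied.

```python
from math import comb
from typing import Dict, List, Set, Tuple


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total


def hub_based_annotation(lncrna: str,
                         neighbors: Dict[str, Set[str]],
                         annotations: Dict[str, Set[str]],
                         alpha: float = 0.05) -> List[Tuple[str, float]]:
    """Assign functional terms to a lncRNA from its coexpressed coding neighbors.

    neighbors:   network adjacency, node -> set of directly connected nodes
    annotations: coding gene -> set of functional terms (the background set)
    Returns (term, p_value) pairs with p_value < alpha, most significant first.
    """
    background = set(annotations)
    # Only annotated coding genes contribute to the enrichment test.
    coding_neighbors = neighbors.get(lncrna, set()) & background
    if not coding_neighbors:
        return []
    candidate_terms = set().union(*(annotations[g] for g in coding_neighbors))
    results = []
    for term in candidate_terms:
        k = sum(term in annotations[g] for g in coding_neighbors)
        K = sum(term in annotations[g] for g in background)
        p = hypergeom_sf(k, len(background), K, len(coding_neighbors))
        if p < alpha:
            results.append((term, p))
    return sorted(results, key=lambda item: item[1])
```

The module-based variant described above would differ mainly in how the gene set is chosen: the members of a network cluster would replace the direct neighbors of a single lncRNA.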

6.3. Interaction with miRNAs and Proteins Approach

Recent analyses found that lncRNAs act synergistically with miRNAs in the regulatory network [108, 109]. It is likely that some lncRNAs function by binding miRNAs. Therefore, identifying well-established miRNAs that bind lncRNAs may help to infer the function of lncRNAs. Jeggari et al. developed an algorithm named miRcode that predicts putative microRNA binding sites in lncRNAs using criteria such as seed complementarity and evolutionary conservation [110]. Jalali et al. constructed a genome-wide network of validated RNA-mediated interactions, and uncovered previously unknown mediatory roles of lncRNAs between miRNAs and mRNAs (Saakshi Jalali, arXiv preprint). Besides the interaction with miRNAs, the interaction of lncRNAs with proteins can also be explored to predict their functions. Bellucci et al. developed a method called “catRAPID” that correlates lncRNAs with proteins by evaluating their interaction potential using physicochemical characteristics, including secondary structure, hydrogen bonding, van der Waals forces, and so forth [111]. However, unlike the coexpression-based approach, these two approaches have been successful for only a small number of lncRNAs, partly because the mechanism by which lncRNAs interact with miRNAs and proteins still remains unclear.
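The seed complementarity criterion mentioned above can be illustrated with a small Python sketch. It is not the miRcode algorithm: it only searches a lncRNA sequence for perfect matches to the reverse complement of miRNA nucleotides 2 to 8 (a common definition of the seed, assumed here), it ignores evolutionary conservation entirely, and the function names are hypothetical.

```python
from typing import List

# Watson-Crick complement for the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_target_site(mirna: str) -> str:
    """Sequence on the target RNA that pairs perfectly with miRNA
    nucleotides 2-8, i.e. the reverse complement of the 7-nt seed."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(lncrna: str, mirna: str) -> List[int]:
    """0-based start positions of all perfect seed matches in the lncRNA,
    including overlapping ones."""
    target = lncrna.upper().replace("T", "U")
    site = seed_target_site(mirna)
    positions = []
    i = target.find(site)
    while i != -1:
        positions.append(i)
        i = target.find(site, i + 1)
    return positions
```

For the let-7a sequence UGAGGUAGUAGGUUGUAUAGUU, the seed is GAGGUAG and the target site searched for is CUACCUC. Candidate sites found this way would still need to be filtered, for example by conservation, before being treated as putative binding sites.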

6.4. Challenges

Computational prediction of lncRNA functions is still at a preliminary stage. As the sequences and secondary structures of lncRNAs are generally not conserved, function prediction of lncRNAs mainly relies on their relationships with other molecules, such as protein-coding genes, miRNAs, and proteins. However, the molecular mechanisms by which lncRNAs function through interaction with other molecules remain largely unknown, making it difficult to develop computational methods that precisely predict the functions of lncRNAs. In addition, there are currently only a small number of lncRNAs whose functions are well understood, which makes it difficult to validate and optimize computational algorithms for predicting lncRNA functions. Finally, unlike protein-coding genes, which have systematic functional annotation systems, lncRNAs lack an annotation system for their functions, making it difficult to evaluate computational algorithms for function prediction. Nevertheless, the success of the coexpression-based approach in predicting lncRNA functions is promising. As more functional genomics data about lncRNAs become available in the near future, more powerful and accurate methods are likely to be developed to help decipher the functions of lncRNAs.

7. Perspectives

It is widely accepted that lncRNAs play important functional roles in the cell, though the molecular mechanisms by which they function remain to be unraveled. In this paper, we have described several currently proposed models of the molecular mechanisms of lncRNA function. One feature common to these models is that lncRNAs function through interactions with other molecules, including DNA, RNA, and proteins. Given the abundance of lncRNAs in the genome, the interactions between lncRNAs and other molecules are likely to be specific. This raises the possibility of developing novel methods that target particular lncRNAs for gene-specific regulation. However, phenotypic studies suggest that knockdown of many lncRNAs does not result in obvious phenotypes, making it difficult to understand their functions. Computational prediction of lncRNA functions can provide hypotheses about these functions and help to design experiments that test them under specific conditions. Yet it remains a significant challenge to develop effective methods that accurately infer lncRNA functions, owing to the lack of detailed information about the molecular mechanisms of lncRNAs. To develop powerful computational methods, more studies are needed on the origin of lncRNAs, their molecular mechanisms, and their tissue-specific or development-specific expression.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant no. 31071113).

References

  • 1.Carninci P, Kasukawa T, Katayama S, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309:1559–1563. doi: 10.1126/science.1112014. [DOI] [PubMed] [Google Scholar]
  • 2.Birney E, Stamatoyannopoulos JA, Dutta A, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816. doi: 10.1038/nature05874. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Kapranov P, Cheng J, Dike S, et al. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007;316(5830):1484–1488. doi: 10.1126/science.1138341. [DOI] [PubMed] [Google Scholar]
  • 4.Wilusz JE, Freier SM, Spector DL. 3′ end processing of a long nuclear-retained noncoding RNA yields a tRNA-like cytoplasmic RNA. Cell. 2008;135(5):919–932. doi: 10.1016/j.cell.2008.10.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Seila AC, Calabrese JM, Levine SS, et al. Divergent transcription from active promoters. Science. 2008;322(5909):1849–1851. doi: 10.1126/science.1162253. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Rinn JL, Kertesz M, Wang JK, et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129(7):1311–1323. doi: 10.1016/j.cell.2007.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Jia H, Osak M, Bogu GK, Stanton LW, Johnson R, Lipovich L. Genome-wide computational identification and manual annotation of human long noncoding RNA genes. RNA. 2010;16(8):1478–1487. doi: 10.1261/rna.1951310. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Ørom UA, Derrien T, Beringer M, et al. Long noncoding RNAs with enhancer-like function in human cells. Cell. 2010;143(1):46–58. doi: 10.1016/j.cell.2010.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Qureshi IA, Mattick JS, Mehler MF. Long non-coding RNAs in nervous system function and disease. Brain Research. 2010;1338(C):20–35. doi: 10.1016/j.brainres.2010.03.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Wapinski O, Chang HY. Long noncoding RNAs and human disease. Trends in Cell Biology. 2011;21(6):354–361. doi: 10.1016/j.tcb.2011.04.001. [DOI] [PubMed] [Google Scholar]
  • 11.Wang KC, Chang HY. Molecular mechanisms of long noncoding RNAs. Molecular Cell. 2011;43:904–914. doi: 10.1016/j.molcel.2011.08.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Research. 2012;22:1775–1789. doi: 10.1101/gr.132159.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Dinger ME, Pang KC, Mercer TR, Mattick JS. Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Computational Biology. 2008;4(11):e1000176. doi: 10.1371/journal.pcbi.1000176. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annual Review of Biochemistry. 2012;81:145–166. doi: 10.1146/annurev-biochem-051410-092902. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell. 2009;136(4):629–641. doi: 10.1016/j.cell.2009.02.006. [DOI] [PubMed] [Google Scholar]
  • 16.Nam J-W, Bartel DP. Long noncoding RNAs in C. elegans. Genome Research. 2012;22(12):2529–2540. doi: 10.1101/gr.140475.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Tsai MC, Spitale RC, Chang HY. Long intergenic noncoding RNAs: new links in cancer progression. Cancer Research. 2011;71(1):3–7. doi: 10.1158/0008-5472.CAN-10-2483. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Liu J, Gough J, Rost B. Distinguishing protein-coding from non-coding RNAs through support vector machines. PLoS Genetics. 2006;2(4, article no. e29) doi: 10.1371/journal.pgen.0020029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Kong L, Zhang Y, Ye ZQ, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Research. 2007;35:W345–W349. doi: 10.1093/nar/gkm391. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Lu ZJ, Yip KY, Wang G, et al. Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data. Genome Research. 2011;21(5):276–285. doi: 10.1101/gr.110189.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Pandey RR, Mondal T, Mohammad F, et al. Kcnq1ot1 antisense noncoding RNA mediates lineage-specific transcriptional silencing through chromatin-level regulation. Molecular Cell. 2008;32(2):232–246. doi: 10.1016/j.molcel.2008.08.022. [DOI] [PubMed] [Google Scholar]
  • 23.Mohammad F, Mondal T, Kanduri C. Epigenetics of imprinted long noncoding RNAs. Epigenetics. 2009;4(5):277–286. [PubMed] [Google Scholar]
  • 24.Huarte M, Guttman M, Feldser D, et al. A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell. 2010;142(3):409–419. doi: 10.1016/j.cell.2010.06.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Hung T, Wang Y, Lin MF, et al. Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters. Nature Genetics. 2011;43(7):621–629. doi: 10.1038/ng.848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Loewer S, Cabili MN, Guttman M, et al. Large intergenic non-coding RNA-RoR modulates reprogramming of human induced pluripotent stem cells. Nature Genetics. 2010;42(12):1113–1117. doi: 10.1038/ng.710. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Swiezewski S, Liu F, Magusin A, Dean C. Cold-induced silencing by long antisense transcripts of an Arabidopsis Polycomb target. Nature. 2009;462(7274):799–802. doi: 10.1038/nature08618. [DOI] [PubMed] [Google Scholar]
  • 28.Heo JB, Sung S. Vernalization-mediated epigenetic silencing by a long intronic noncoding RNA. Science. 2011;331(6013):76–79. doi: 10.1126/science.1197349. [DOI] [PubMed] [Google Scholar]
  • 29.Kim TK, Hemberg M, Gray JM, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010;465(7295):182–187. doi: 10.1038/nature09033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Wang D, Garcia-Bassets I, Benner C, et al. Reprogramming transcription by distinct classes of enhancers functionally defined by eRNA. Nature. 2011;474(7351):390–397. doi: 10.1038/nature10006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Kino T, Hurt DE, Ichijo T, Nader N, Chrousos GP. Noncoding RNA Gas5 is a growth arrest- and starvation-associated repressor of the glucocorticoid receptor. Science Signaling. 2010;3(107, article no. ra8) doi: 10.1126/scisignal.2000568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Gong C, Maquat LE. LncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3′ UTRs via Alu elements. Nature. 2011;470(7333):284–288. doi: 10.1038/nature09701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Martianov I, Ramadass A, Serra Barros A, Chow N, Akoulitchev A. Repression of the human dihydrofolate reductase gene by a non-coding interfering transcript. Nature. 2007;445(7128):666–670. doi: 10.1038/nature05519. [DOI] [PubMed] [Google Scholar]
  • 34.Redon S, Reichenbach P, Lingner J. The non-coding RNA TERRA is a natural ligand and direct inhibitor of human telomerase. Nucleic Acids Research. 2010;38(17):5797–5806. doi: 10.1093/nar/gkq296. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Hung T, Chang HY. Long noncoding RNA in genome regulation: prospects and mechanisms. RNA Biology. 2010;7(5):582–585. doi: 10.4161/rna.7.5.13216. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Poliseno L, Salmena L, Zhang J, Carver B, Haveman WJ, Pandolfi PP. A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature. 2010;465(7301):1033–1038. doi: 10.1038/nature09144. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Song MS, Carracedo A, Salmena L, et al. Nuclear PTEN regulates the APC-CDH1 tumor-suppressive complex in a phosphatase-independent manner. Cell. 2011;144(2):187–199. doi: 10.1016/j.cell.2010.12.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Tripathi V, Ellis JD, Shen Z, et al. The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Molecular Cell. 2010;39(6):925–938. doi: 10.1016/j.molcel.2010.08.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Bernard D, Prasanth KV, Tripathi V, et al. A long nuclear-retained non-coding RNA regulates synaptogenesis by modulating gene expression. EMBO Journal. 2010;29(18):3082–3093. doi: 10.1038/emboj.2010.199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Plath K, Mlynarczyk-Evans S, Nusinow DA, Panning B. Xist RNA and the mechanism of X chromosome inactivation. Annual Review of Genetics. 2002;36:233–278. doi: 10.1146/annurev.genet.36.042902.092433. [DOI] [PubMed] [Google Scholar]
  • 41.Lee JT. The X as model for RNA’s niche in epigenomic regulation. Cold Spring Harbor Perspectives in Biology. 2010;2(9):a003749. doi: 10.1101/cshperspect.a003749. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Sun BK, Deaton AM, Lee JT. A transient heterochromatic state in Xist preempts X inactivation choice without RNA stabilization. Molecular Cell. 2006;21(5):617–628. doi: 10.1016/j.molcel.2006.01.028. [DOI] [PubMed] [Google Scholar]
  • 43.Nagano T, Mitchell JA, Sanz LA, et al. The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science. 2008;322(5908):1717–1720. doi: 10.1126/science.1163802. [DOI] [PubMed] [Google Scholar]
  • 44.Camblong J, Iglesias N, Fickentscher C, Dieppois G, Stutz F. Antisense RNA stabilization induces transcriptional gene silencing via histone deacetylation in S. cerevisiae. Cell. 2007;131(4):706–717. doi: 10.1016/j.cell.2007.09.014. [DOI] [PubMed] [Google Scholar]
  • 45.Wang KC, Yang YW, Liu B, et al. A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature. 2011;472(7341):120–126. doi: 10.1038/nature09819. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Khalil AM, Guttman M, Huarte M, et al. Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(28):11667–11672. doi: 10.1073/pnas.0904715106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Zhao J, Ohsumi TK, Kung JT, et al. Genome-wide identification of polycomb-associated RNAs by RIP-seq. Molecular Cell. 2010;40(6):939–953. doi: 10.1016/j.molcel.2010.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Tian D, Sun S, Lee JT. The long noncoding RNA, Jpx, Is a molecular switch for X chromosome inactivation. Cell. 2010;143(3):390–403. doi: 10.1016/j.cell.2010.09.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Collins K. Physiological assembly and activity of human telomerase complexes. Mechanisms of Ageing and Development. 2008;129(1-2):91–98. doi: 10.1016/j.mad.2007.10.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Zappulla DC, Cech TR. Yeast telomerase RNA: a flexible scaffold for protein subunits. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(27):10024–10029. doi: 10.1073/pnas.0403641101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Tsai MC, Manor O, Wan Y, et al. Long noncoding RNA as modular scaffold of histone modification complexes. Science. 2010;329(5992):689–693. doi: 10.1126/science.1192002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Kotake Y, Nakagawa T, Kitagawa K, et al. Long non-coding RNA ANRIL is required for the PRC2 recruitment to and silencing of p15INK4B tumor suppressor gene. Oncogene. 2011;30(16):1956–1962. doi: 10.1038/onc.2010.568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Yap KL, Li S, Muñoz-Cabello AM, et al. Molecular interplay of the noncoding RNA ANRIL and methylated histone H3 lysine 27 by polycomb CBX7 in transcriptional silencing of INK4a. Molecular Cell. 2010;38(5):662–674. doi: 10.1016/j.molcel.2010.03.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Maison C, Bailly D, Roche D, et al. SUMOylation promotes de novo targeting of HP1alpha to pericentric heterochromatin. Nature Genetics. 2011;43(3):220–227. doi: 10.1038/ng.765. [DOI] [PubMed] [Google Scholar]
  • 55.Amaral PP, Clark MB, Gascoigne DK, Dinger ME, Mattick JS. LncRNAdb: a reference database for long noncoding RNAs. Nucleic Acids Research. 2011;39(1):D146–D151. doi: 10.1093/nar/gkq1138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Bu D, Yu K, Sun S, Xie C, Skogerbo G, et al. NONCODE v3.0: integrative annotation of long noncoding RNAs. Nucleic Acids Research. 2012;40:D210–D215. doi: 10.1093/nar/gkr1175. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Volders PJ, Helsens K, Wang X, Menten B, Martens L, et al. LNCipedia: a database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Research. In press. doi: 10.1093/nar/gks915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Kin T, Yamada K, Terai G, et al. fRNAdb: a platform for mining/annotating functional RNA candidates from non-coding RNA sequences. Nucleic Acids Research. 2007;35(1):D145–D148. doi: 10.1093/nar/gkl837. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Dinger ME, Pang KC, Mercer TR, Crowe ML, Grimmond SM, Mattick JS. NRED: a database of long noncoding RNA expression. Nucleic Acids Research. 2009;37(1):D122–D126. doi: 10.1093/nar/gkn617. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Michelhaugh SK, Lipovich L, Blythe J, Jia H, Kapatos G, Bannon MJ. Mining Affymetrix microarray data for long non-coding RNAs: altered expression in the nucleus accumbens of heroin abusers. Journal of Neurochemistry. 2011;116(3):459–466. doi: 10.1111/j.1471-4159.2010.07126.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Babak T, Blencowe BJ, Hughes TR. A systematic search for new mammalian noncoding RNAs indicates little conserved intergenic transcription. BMC Genomics. 2005;6(article no. 104) doi: 10.1186/1471-2164-6-104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Gibb EA, Vucic EA, Enfield KS, Stewart GL, Lonergan KM, et al. Human cancer long non-coding RNA transcriptomes. PLoS One. 2011;6:e25915. doi: 10.1371/journal.pone.0025915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Lee TL, Xiao A, Rennert OM. Identification of novel long noncoding RNA transcripts in male germ cells. Methods in Molecular Biology. 2012;825:105–114. doi: 10.1007/978-1-61779-436-0_9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Furuno M, Pang KC, Ninomiya N, et al. Clusters of internally primed transcripts reveal novel long noncoding RNAs. PLoS Genetics. 2006;2(4, article no. e37) doi: 10.1371/journal.pgen.0020037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Huang W, Long N, Khatib H. Genome-wide identification and initial characterization of bovine long non-coding RNAs from EST data. Animal Genetics. 2012;43:674–682. doi: 10.1111/j.1365-2052.2012.02325.x. [DOI] [PubMed] [Google Scholar]
  • 66.Li T, Wang S, Wu R, Zhou X, Zhu D, et al. Identification of long non-protein coding RNAs in chicken skeletal muscle using next generation sequencing. Genomics. 2012;99:292–298. doi: 10.1016/j.ygeno.2012.02.003. [DOI] [PubMed] [Google Scholar]
  • 67.Pauli A, Valen E, Lin MF, Garber M, Vastenhouw NL, et al. Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis. Genome Research. 2012;22:577–591. doi: 10.1101/gr.133009.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Prensner JR, Iyer MK, Balbin OA, et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nature Biotechnology. 2011;29(8):742–749. doi: 10.1038/nbt.1914. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Zhao J, Sun BK, Erwin JA, Song JJ, Lee JT. Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science. 2008;322(5902):750–756. doi: 10.1126/science.1163045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Park PJ. ChIP-seq: advantages and challenges of a maturing technology. Nature Reviews Genetics. 2009;10(10):669–680. doi: 10.1038/nrg2641. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Guttman M, Amit I, Garber M, et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009;458(7235):223–227. doi: 10.1038/nature07672. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Okazaki Y, Furuno M, Kasukawa T, et al. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002;420(6915):563–573. doi: 10.1038/nature01266. [DOI] [PubMed] [Google Scholar]
  • 73.Yang X, Tschaplinski TJ, Hurst GB, et al. Discovery and annotation of small proteins using genomics, proteomics, and computational approaches. Genome Research. 2011;21(4):634–641. doi: 10.1101/gr.109280.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Lin MF, Carlson JW, Crosby MA, et al. Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes. Genome Research. 2007;17(12):1823–1836. doi: 10.1101/gr.6679507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Clamp M, Fry B, Kamal M, et al. Distinguishing protein-coding and noncoding genes in the human genome. Proceedings of the National Academy of Sciences of the United States of America. 2007;104(49):19428–19433. doi: 10.1073/pnas.0709013104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Lin MF, Jungreis I, Kellis M. PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics. 2011;27(13):i275–i282. doi: 10.1093/bioinformatics/btr209. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Washietl S, Findeiß S, Müller SA, et al. RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data. RNA. 2011;17(4):578–594. doi: 10.1261/rna.2536111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Rivas E, Eddy SR. Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics. 2001;2, article no. 8 doi: 10.1186/1471-2105-2-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Washietl S, Hofacker IL, Stadler PF. Fast and reliable prediction of noncoding RNAs. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(7):2454–2459. doi: 10.1073/pnas.0409169102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Pedersen JS, Bejerano G, Siepel A, et al. Identification and classification of conserved RNA secondary structures in the human genome. PLoS Computational Biology. 2006;2(4, article no. e33):251–262. doi: 10.1371/journal.pcbi.0020033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Duret L, Chureau C, Samain S, Weissenbach J, Avner P. The Xist RNA gene evolved in eutherians by pseudogenization of a protein-coding gene. Science. 2006;312(5780):1653–1655. doi: 10.1126/science.1126316. [DOI] [PubMed] [Google Scholar]
  • 82.Chooniedass-Kothari S, Emberley E, Hamedani MK, et al. The steroid receptor RNA activator is the first functional RNA encoding a protein. FEBS Letters. 2004;566(1-3):43–47. doi: 10.1016/j.febslet.2004.03.104. [DOI] [PubMed] [Google Scholar]
  • 83.Kim ED, Sung S. Long noncoding RNA: unveiling hidden layer of gene regulatory networks. Trends in Plant Science. 2012;17:16–21. doi: 10.1016/j.tplants.2011.10.008. [DOI] [PubMed] [Google Scholar]
  • 84.Hiller M, Findeiß S, Lein S, et al. Conserved introns reveal novel transcripts in Drosophila melanogaster. Genome Research. 2009;19(7):1289–1300. doi: 10.1101/gr.090050.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Mattick JS. The genetic signatures of noncoding RNAs. PLoS Genetics. 2009;5(4):e1000459. doi: 10.1371/journal.pgen.1000459. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Bernstein E, Allis CD. RNA meets chromatin. Genes and Development. 2005;19(14):1635–1655. doi: 10.1101/gad.1324305. [DOI] [PubMed] [Google Scholar]
  • 87.Whitehead J, Pandey GK, Kanduri C. Regulation of the mammalian epigenome by long noncoding RNAs. Biochimica et Biophysica Acta. 2009;1790(9):936–947. doi: 10.1016/j.bbagen.2008.10.007. [DOI] [PubMed] [Google Scholar]
  • 88.Wilusz JE, Sunwoo H, Spector DL. Long noncoding RNAs: functional surprises from the RNA world. Genes and Development. 2009;23(13):1494–1504. doi: 10.1101/gad.1800909. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Beltran M, Puig I, Peña C, et al. A natural antisense transcript regulates Zeb2/Sip1 gene expression during Snail1-induced epithelial-mesenchymal transition. Genes and Development. 2008;22(6):756–769. doi: 10.1101/gad.455708. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Ørom UA, Shiekhattar R. Noncoding RNAs and enhancers: complications of a long-distance relationship. Trends in Genetics. 2011;27:433–439. doi: 10.1016/j.tig.2011.06.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Mattick JS, Makunin IV. Small regulatory RNAs in mammals. Human Molecular Genetics. 2005;14(1):R121–R132. doi: 10.1093/hmg/ddi101. [DOI] [PubMed] [Google Scholar]
  • 92.Nagano T, Fraser P. No-nonsense functions for long noncoding RNAs. Cell. 2011;145(2):178–181. doi: 10.1016/j.cell.2011.03.014. [DOI] [PubMed] [Google Scholar]
  • 93.Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, et al. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature. 2011;477:295–300. doi: 10.1038/nature10398. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Johnson R. Long non-coding RNAs in Huntington's disease neurodegeneration. Neurobiology of Disease. 2012;46:245–254. doi: 10.1016/j.nbd.2011.12.006. [DOI] [PubMed] [Google Scholar]
  • 95.De Santa F, Barozzi I, Mietton F, et al. A large fraction of extragenic RNA Pol II transcription sites overlap enhancers. PLoS Biology. 2010;8(5):e1000384. doi: 10.1371/journal.pbio.1000384. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Li ZH, Rana TM. Molecular mechanisms of RNA-triggered gene silencing machineries. Accounts of Chemical Research. 2012;45:1122–1131. doi: 10.1021/ar200253u. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Ponjavic J, Oliver PL, Lunter G, Ponting CP. Genomic and transcriptional co-localization of protein-coding and long non-coding RNA pairs in the developing brain. PLoS Genetics. 2009;5(8):e1000617. doi: 10.1371/journal.pgen.1000617. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Ebisuya M, Yamamoto T, Nakajima M, Nishida E. Ripples from neighbouring transcription. Nature Cell Biology. 2008;10(9):1106–1113. doi: 10.1038/ncb1771. [DOI] [PubMed] [Google Scholar]
  • 99.Brown CJ, Ballabio A, Rupert JL, et al. A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature. 1991;349(6304):38–44. doi: 10.1038/349038a0. [DOI] [PubMed] [Google Scholar]
  • 100.Sleutels F, Zwart R, Barlow DP. The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature. 2002;415(6873):810–813. doi: 10.1038/415810a. [DOI] [PubMed] [Google Scholar]
  • 101.Lee JT. Lessons from X-chromosome inactivation: long ncRNA as guides and tethers to the epigenome. Genes and Development. 2009;23(16):1831–1842. doi: 10.1101/gad.1811209. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Schmitz KM, Mayer C, Postepska A, Grummt I. Interaction of noncoding RNA with the rDNA promoter mediates recruitment of DNMT3b and silencing of rRNA genes. Genes and Development. 2010;24(20):2264–2269. doi: 10.1101/gad.590910. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Willingham AT, Orth AP, Batalov S, et al. Molecular biology: a strategy for probing the function of noncoding RNAs finds a repressor of NFAT. Science. 2005;309(5740):1570–1573. doi: 10.1126/science.1115901. [DOI] [PubMed] [Google Scholar]
  • 104.Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nature Reviews Genetics. 2009;10(3):155–159. doi: 10.1038/nrg2521. [DOI] [PubMed] [Google Scholar]
  • 105.Pang KC, Dinger ME, Mercer TR, et al. Genome-wide identification of long noncoding RNAs in CD8+ T cells. Journal of Immunology. 2009;182(12):7738–7748. doi: 10.4049/jimmunol.0900603. [DOI] [PubMed] [Google Scholar]
  • 106.Khachane AN, Harrison PM. Mining mammalian transcript data for functional long non-coding RNAs. PLoS One. 2010;5(4):e10316. doi: 10.1371/journal.pone.0010316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Liao Q, Liu C, Yuan X, et al. Large-scale prediction of long non-coding RNA functions in a coding-non-coding gene co-expression network. Nucleic Acids Research. 2011;39(9):3864–3878. doi: 10.1093/nar/gkq1348. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108.Braconi C, Kogure T, Valeri N, et al. microRNA-29 can regulate expression of the long non-coding RNA gene MEG3 in hepatocellular cancer. Oncogene. 2011;30:4750–4756. doi: 10.1038/onc.2011.193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.Ebert MS, Sharp PA. Emerging roles for natural microRNA sponges. Current Biology. 2010;20(19):R858–R861. doi: 10.1016/j.cub.2010.08.052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Jeggari A, Marks DS, Larsson E. miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics. 2012;28:2062–2063. doi: 10.1093/bioinformatics/bts344. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.Bellucci M, Agostini F, Masin M, Tartaglia GG. Predicting protein associations with long noncoding RNAs. Nature Methods. 2011;8(6):444–445. doi: 10.1038/nmeth.1611. [DOI] [PubMed] [Google Scholar]

Articles from The Scientific World Journal are provided here courtesy of Wiley