SUMMARY
Long non-coding RNAs (lncRNAs) have recently emerged as key players in fundamental cellular processes and diseases, but their functions are poorly understood. HOTAIR is a 2,148-nucleotide-long lncRNA molecule involved in physiological epidermal development and in pathogenic cancer progression, where it has been demonstrated to repress tumor and metastasis suppressor genes. To gain insights into the molecular mechanisms of HOTAIR, we purified it in a stable and homogeneous form in vitro and determined its functional secondary structure through chemical probing and phylogenetic analysis. The HOTAIR structure reveals a degree of structural organization comparable to well-folded RNAs, like the group II intron, rRNA or the lncRNA steroid receptor RNA activator. It is composed of four independently folding modules, two of which correspond to predicted protein-binding domains. Secondary structure elements that surround protein-binding motifs are evolutionarily conserved. Our work serves as a guide for “navigating” through the lncRNA HOTAIR and ultimately for understanding its function.
INTRODUCTION
Long non-coding RNAs (lncRNAs) are RNA molecules containing >200 nucleotides that possess little or no protein-coding capacity (Derrien et al., 2012). Approximately 30,000 lncRNAs are expressed in humans (Volders et al., 2013), which surpasses the total number of protein-coding genes (20,687; Flicek et al., 2014). At least 2,500 lncRNAs are conserved between different species (Necsulea et al., 2014), and many lncRNAs are expressed in large percentages of individuals within the same populations (Pennisi, 2014) and in many tissues within the same organism (Kaushik et al., 2013). A series of lncRNA knock-outs in mice caused severe, often lethal, effects on development (Sauvageau et al., 2013). These observations indicate that lncRNAs play central roles in cellular physiology, and not surprisingly many lncRNAs are associated with vital cellular processes and pathological states (Wapinski and Chang, 2011). However, the mechanisms by which lncRNAs exert their molecular functions remain largely uncharacterized.
One of the best studied lncRNAs is HOTAIR (HOX transcript antisense intergenic RNA), which is a regulator of epidermal tissue development (Schorderet and Duboule, 2011) that is particularly abundant in peripheral tissues of the human body (Rinn et al., 2007). HOTAIR is a 2,148-nt-long, spliced and polyadenylated transcript that is encoded within the HoxC gene cluster on chromosome 12 and acts in trans on the HoxD locus of chromosome 2. By interacting with chromatin remodeling enzymes (Rinn et al., 2007), HOTAIR silences the HoxD genes (Sparmann and van Lohuizen, 2006), including numerous tumor and metastasis suppressors, like HoxD10, PGR, and protocadherin (Gupta et al., 2010). Therefore, when overexpressed, HOTAIR promotes cell invasiveness, tumor development and metastasis (Gupta et al., 2010; Kim et al., 2013).
HOTAIR interactions with chromatin remodeling enzymes are still poorly characterized at the molecular level. The 5’-end of HOTAIR (HOTAIR 1–300) recruits polycomb group proteins (PcG), i.e. polycomb repressive complex 2 (PRC2) (Rinn et al., 2007). HOTAIR-PRC2 interactions form with nanomolar affinity (Cifuentes-Rojas et al., 2014; Davidovich et al., 2013) and are primarily mediated by an 89-mer fragment of HOTAIR (nucleotides 212–300) and by PRC2 subunits Eed and Ezh2 (Wu et al., 2013). By contrast, the 3’-end of HOTAIR binds the LSD1/CoREST/REST (RE1-silencing transcription factor) complex (Tsai et al., 2010). However, deeper insights into the molecular mechanism of HOTAIR lncRNA require more specific studies on the molecular properties of this RNA target.
In this work, we have determined the experimental secondary structural map of lncRNA HOTAIR. We employed a non-denaturing purification protocol to obtain large amounts of HOTAIR in a homogeneous and monodisperse form and we assessed the ionic requirements for HOTAIR folding by studying its compaction with biophysical hydrodynamic methods. Having obtained a uniform, co-transcriptionally folded sample, we then performed SHAPE, DMS and terbium structure probing in parallel with phylogenetic determination of functional secondary structural motifs in HOTAIR. Our results offer structural insights into the largest human lncRNA mapped to date, defining its modular architecture and providing a framework for understanding the functional properties of this important target.
RESULTS
HOTAIR is transcribed as a homogeneous population of RNA molecules under non-denaturing conditions
Mapping the structure of HOTAIR in vitro presented unprecedented challenges because of the large size of this lncRNA (2,148 nucleotides). Primarily, we were confronted with the challenge of producing large amounts of HOTAIR at high purity and homogeneity.
We obtained a homogeneous RNA population by using a non-denaturing (or “native”) purification protocol (see methods) that preserves the secondary structure of HOTAIR formed during transcription (Batey, 2014; Toor et al., 2008). By contrast, other purification methods (see supplementary experimental procedures) that involve heat denaturation steps and refolding yielded inhomogeneous samples (Figures 1A and B). For instance, HOTAIR prepared by heat denaturation followed by snap-cooling on ice (Novikova et al., 2012) displays a size-exclusion chromatography (SEC) elution profile that is broader than that of native HOTAIR (Figure 1A). At least three different species of HOTAIR can be distinguished by sedimentation velocity analytical ultracentrifugation (SV-AUC) experiments, indicating a tendency of the RNA to aggregate and misfold (Figure 1B). Remarkably, the species displaying the same sedimentation coefficient as native HOTAIR corresponds to less than 50% of the total preparation. Even more extreme is the behavior of HOTAIR prepared by heat denaturation and slow cooling to room temperature (Rinn et al., 2007). Slow-cooled HOTAIR forms large aggregates that elute in the void volume of the SEC column (Figure 1A). Additionally, even after these aggregates are removed by SEC, slow-cooled HOTAIR is inhomogeneous and distributes into a multitude of equally represented species with different sedimentation coefficients (Figure 1B). This comparative analysis of folding methods indicates that the non-denaturing purification technique, which produces monodisperse, homogeneous RNA molecules, is the method of choice for purifying HOTAIR, and perhaps other lncRNAs, in vitro.
Figure 1. Purification and folding of HOTAIR.
(A) Homogeneity of HOTAIR RNA evaluated by SEC. HOTAIR prepared via native purification (red) produces a homogeneous and monodisperse RNA sample. “Snapcool” (green) and “slowcool” (blue) denaturing protocols produce heterogeneous samples characterized by a broad distribution of elution volumes and by accumulation of aggregated material in the void volume. (B) Homogeneity of HOTAIR RNA determined by SV-AUC. HOTAIR RNA obtained through the native method (red) sediments as a single homogeneous species with a sedimentation coefficient of approximately 20S, whereas samples prepared by denaturation and refolding (green and blue) display a highly inhomogeneous distribution of particles. (C) SV-AUC profiles of HOTAIR obtained under native conditions in the presence of increasing concentrations of magnesium. The graph was obtained using SedFit (Brown and Schuck, 2006). (D) Hill plot of the hydrodynamic radii (Rh, in angstroms) derived from the SV-AUC experiment described in panel C (see also Figure S1).
HOTAIR folds into a compact structure upon addition of magnesium
Having established a purification protocol for HOTAIR, we conducted a series of SEC and SV-AUC experiments to analyze the molecular compaction of HOTAIR after adding increasing concentrations of magnesium. The results obtained by SEC show that homogeneity of the preparation was maintained over a broad range of Mg2+ concentrations (0–25 mM MgCl2) (Figure S1). Higher concentrations of magnesium (up to 50 mM) led to nonspecific magnesium-driven aggregation (Figure S1). Increasing the concentration of magnesium had two major effects on the HOTAIR SV-AUC profile (Figure 1C). It produced an increase in the sedimentation coefficient and a decrease in UV absorption, both being typical effects observed during RNA folding (Sosnick and Pan, 2003). We fit the SV-AUC profiles to a Hill equation and obtained values for K1/2Mg = 8.6 ± 0.8 mM and n = 1.1 ± 0.1 (Figure 1D). Interestingly, K1/2Mg is less than that of the ai5γ group IIB intron (Su et al., 2003).
Taken together, both SV-AUC and SEC experiments suggest that 25 mM Mg2+ (three times the experimentally determined K1/2Mg) is a concentration of magnesium at which HOTAIR is fully folded and monodisperse and thus represents the optimal experimental condition to perform structural studies.
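The Hill analysis described above can be illustrated with a minimal sketch. It assumes a simple two-state Hill model for the magnesium-dependent transition and uses the reported midpoint (8.6 mM) and Hill coefficient (1.1) as default parameters. The function names and the Rh endpoint arguments are hypothetical and are not taken from the original analysis, which was based on the SV-AUC data themselves.

```python
def hill_fraction(mg_mM, k_half_mM=8.6, n=1.1):
    """Fraction of the Mg2+-dependent transition completed at a given [Mg2+].

    Defaults are the K1/2 and Hill coefficient reported in the text;
    the two-state Hill form itself is an assumption of this sketch.
    """
    if mg_mM < 0:
        raise ValueError("concentration must be non-negative")
    if mg_mM == 0:
        return 0.0
    x = (mg_mM / k_half_mM) ** n
    return x / (1.0 + x)


def hydrodynamic_radius(mg_mM, rh_unfolded, rh_folded, k_half_mM=8.6, n=1.1):
    """Interpolate Rh between two endpoint values (hypothetical inputs)."""
    f = hill_fraction(mg_mM, k_half_mM, n)
    return rh_unfolded + (rh_folded - rh_unfolded) * f


if __name__ == "__main__":
    # Print the modeled transition across the titration range used for SEC.
    for mg in (0, 1, 5, 8.6, 25, 50):
        print(f"{mg:>5} mM Mg2+ -> fraction = {hill_fraction(mg):.2f}")
```

By construction, the modeled fraction is exactly one half at the midpoint concentration, which is the defining property of K1/2Mg in a Hill fit.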
HOTAIR secondary structure can be probed reproducibly by multiple methods
Having determined the ionic requirements for folding HOTAIR, we examined the HOTAIR secondary structure by chemical probing. First, we performed SHAPE (Selective 2’-Hydroxyl Acylation analyzed by Primer Extension), a technique that targets flexible, typically unpaired regions of the RNA structure and that has been extensively used for RNA secondary structure prediction (Deigan et al., 2009). We measured SHAPE reactivity at single nucleotide resolution and used the data to provide constraints for secondary structure prediction (Figure 2).
Figure 2.
Secondary structure of HOTAIR derived from SHAPE, DMS, and terbium chemical probing. SHAPE reactivities are depicted by colored nucleotides. DMS reactivities are represented by colored dots over the nucleotides. Terbium reactivities are represented by squares on the nucleotides. Highly reactive nucleotides are displayed in red and orange and low reactive nucleotides are displayed in black or blue according to the values reported in the legend. Watson-Crick and non-canonical base pairs are depicted by black and purple lines, respectively (also see Figures S2–S3, Table S1 and Table S3).
To validate the HOTAIR secondary structure map derived from the SHAPE data, we additionally performed DMS and terbium probing. DMS methylates adenosines and cytidines that are not base-paired, while terbium cleavage targets single-stranded regions in a non-sequence-specific manner, similar to in-line probing techniques (Sigel et al., 2000). Both DMS and terbium reactivities are in good agreement with the SHAPE reactivity of HOTAIR (Figure 2, also see Figures S2–S3 and Table S1). Only 7.6% of nucleotides with high SHAPE reactivity displayed low reactivity to DMS and vice versa. Only 7.9% of nucleotides with high SHAPE reactivity displayed low reactivity when treated with terbium and vice versa.
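The cross-probe comparison can be sketched as a count of discordant nucleotides, that is, positions where one probe reports high reactivity and the other reports low reactivity. The thresholds (0.7 and 0.3), the choice of denominator, and the use of `None` for positions without data (for example G and U residues in a DMS experiment) are assumptions made for illustration; the criteria used for the percentages quoted above are not specified in this section.

```python
def discordance(shape, other, high=0.7, low=0.3):
    """Fraction of compared nucleotides where the two probes disagree.

    `shape` and `other` are per-nucleotide normalized reactivities of equal
    length; None marks positions without data. Thresholds are hypothetical.
    """
    compared = 0
    discordant = 0
    for s, o in zip(shape, other):
        if s is None or o is None:
            continue  # skip positions that one of the probes cannot report on
        compared += 1
        high_low = s >= high and o <= low
        low_high = s <= low and o >= high
        if high_low or low_high:
            discordant += 1
    return discordant / compared if compared else 0.0


if __name__ == "__main__":
    shape = [0.9, 0.1, 0.8, None, 0.5]
    dms = [0.1, 0.05, 0.9, 0.4, 0.5]
    print(f"discordant fraction: {discordance(shape, dms):.2f}")
```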
HOTAIR forms independent structural modules
To interpret the structure map obtained by chemical probing, we used the 3S shotgun approach developed by the Sanbonmatsu lab (Novikova et al., 2013). In this approach, the full length lncRNA is mapped alongside sequential segments of the RNA to identify potential independently-folded sub-domains. If secondary structural elements identified within a given fragment match the probing profile of the full length RNA, then a subdomain is implicated.
In our work, we designed a total of 8 fragments (Figure 3). Fragments F1–F4 span the entire HOTAIR sequence without overlapping each other and without breaking any secondary structure elements. Fragments F5–F7 overlap with F1–F4 and disrupt helical regions identified in our structure map. Finally, fragment F8 is a composite of F1 and F2. As for full-length HOTAIR, SEC profiles revealed that all eight fragments are homogeneous and monodisperse at 25 mM Mg2+. Additionally, SV-AUC revealed that they all possess ionic requirements for folding that are very similar to full-length HOTAIR, with K1/2Mg values between 6 and 12 mM (Table S2). At increasing Mg2+ concentrations, the fragments become more compact (Table S2). For instance, the hydrodynamic radius (Rh) of F8 decreases by up to 22.4 % in the presence of magnesium and that of F4 by 19 %.
Figure 3. Shotgun fragment analysis reveals modularity in HOTAIR.
(A) Schematic representation of HOTAIR fragments with respect to their position along the sequence of full-length HOTAIR. (B) Normalized SHAPE reactivity of full-length HOTAIR. (C) Scatter plots comparing the SHAPE reactivity of each fragment with that of the corresponding region in full-length HOTAIR. Pearson correlation values (rp) between the reactivity of each fragment and of full-length HOTAIR are indicated (also see Table S2).
Based on these findings, we probed the structure of all fragments at 25 mM Mg2+, the same concentration used for full-length HOTAIR. In addition, we probed fragment 1–300 (F300), which is a HOTAIR fragment previously shown to bind protein complex PRC2 in vivo (Tsai et al., 2010). Comparing the SHAPE reactivity of the fragments with that of full-length HOTAIR shows high correlation, with p-values < 10−9 (Figure 3). Pearson’s correlation coefficient values (rp) are higher for fragments that do not break elements of secondary structure observed in the full-length molecule (F1–F4, Figure 3). Among these fragments, rp is higher for nucleotides at the 5’-end of HOTAIR (i.e. F1, F2, F8, and F300 possess rp values of 0.96, 0.93, 0.96, and 0.90, respectively) and lower for nucleotides at the 3’-end of HOTAIR (i.e. F3 and F4 possess rp values of 0.71 and 0.78, respectively).
These results suggest that the overall HOTAIR structure can be divided into independent structural modules or “domains”, represented by fragments F1–F4. The majority of HOTAIR structural elements are formed by close-range base pairing among nucleotides within such fragments. Considering that overlapping fragments F5–F7 still maintain short duplexes in common with full-length HOTAIR, it is not unexpected that such fragments display chemical reactivity that resembles the full-length molecule.
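The fragment-versus-full-length comparison underlying this analysis can be sketched as a Pearson correlation between the SHAPE profile of a fragment and the matching window of the full-length profile. This is an illustrative sketch only: the function names, the 1-based `start` convention, and the handling of missing values as `None` are assumptions, not the procedure used to produce Figure 3.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fragment_vs_full(full_length, fragment, start):
    """Correlate a fragment profile with its window in the full-length profile.

    `start` is the 1-based position of the fragment's first nucleotide in the
    full-length RNA. Positions with missing data (None) in either profile
    are excluded from the comparison.
    """
    window = full_length[start - 1:start - 1 + len(fragment)]
    pairs = [(f, w) for f, w in zip(fragment, window)
             if f is not None and w is not None]
    return pearson_r([p[0] for p in pairs], [p[1] for p in pairs])


if __name__ == "__main__":
    full = [0.0, 0.1, 0.5, 0.9, 0.2, 0.7]
    frag = [1.0, 1.8, 0.4]  # toy fragment starting at nucleotide 3
    print(f"rp = {fragment_vs_full(full, frag, 3):.2f}")
```

Because the Pearson coefficient is insensitive to overall scaling, a fragment whose reactivities are normalized differently from the full-length data can still be compared directly in this way.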
HOTAIR possesses an intricate secondary structure
Overall, HOTAIR is highly structured, with more than 50% of the nucleotides being involved in base pairing. There are 56 helical segments, 38 terminal loops, 34 internal loops, and 19 junction regions (Figure 2), which is comparable to other highly structured RNAs (see supplementary Table S3). Supported by the fragment analysis described above, four independent domains can be identified (D1–D4, corresponding to fragments F1–F4, respectively). D1 (nucleotides 1–530) consists of 12 helices, 8 terminal loops, and 4 junctions (three 3-way junctions and one 4-way junction). D2 (nucleotides 531 – 1040) consists of 15 helices, 11 terminal loops and 4 junctions (three 5-way junctions and one 3-way junction). D3 (nucleotides 1041 – 1513) is the smallest of all the four domains and consists of 9 helices, 6 terminal loops and 3 junctions (two 4-way junctions and one 3-way junction). Finally, D4 (nucleotides 1514 – 2148) is the largest among the four domains and consists of 20 helices, 13 terminal loops and 7 junctions (one 6-way, two 4-way and four 3-way junctions).
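The domain inventory given above can be captured in a small lookup structure, which is convenient for checking the totals and for assigning any nucleotide to its domain. The sketch below simply transcribes the boundaries and per-domain counts stated in the text; the dictionary layout and function names are hypothetical. Note that the per-domain junction counts as listed sum to 18, whereas 19 junction regions are quoted for the whole molecule; the sketch reports the counts as written and does not attempt to resolve the difference.

```python
# Domain boundaries (1-based, inclusive) and element counts as stated in the text.
DOMAINS = {
    "D1": {"start": 1, "end": 530, "helices": 12, "terminal_loops": 8, "junctions": 4},
    "D2": {"start": 531, "end": 1040, "helices": 15, "terminal_loops": 11, "junctions": 4},
    "D3": {"start": 1041, "end": 1513, "helices": 9, "terminal_loops": 6, "junctions": 3},
    "D4": {"start": 1514, "end": 2148, "helices": 20, "terminal_loops": 13, "junctions": 7},
}


def domain_length(name):
    """Number of nucleotides in a domain."""
    d = DOMAINS[name]
    return d["end"] - d["start"] + 1


def total(key):
    """Sum one element type (e.g. 'helices') over all four domains."""
    return sum(d[key] for d in DOMAINS.values())


def domain_of(nucleotide):
    """Return the domain containing a 1-based nucleotide position."""
    for name, d in DOMAINS.items():
        if d["start"] <= nucleotide <= d["end"]:
            return name
    raise ValueError("position lies outside HOTAIR (1-2148)")


if __name__ == "__main__":
    for name in DOMAINS:
        print(name, domain_length(name), "nt")
    print("helices:", total("helices"), "terminal loops:", total("terminal_loops"))
```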
In addition to high structural complexity, HOTAIR also displays a significant degree of covariation in key elements of the experimental secondary structure map (Figure 4A). Covariance analysis across 33 mammalian sequences allowed us to validate the secondary structure and to identify the key structural elements that could be involved in function (Figure 4 and Figure S4A–S4D). The overall covariance in HOTAIR is comparable to the lncRNA SRA (Novikova et al., 2012). A significant number of helices in HOTAIR possess covariant base-pairs and half-flips. Most strikingly, covariant elements localize in helices that surround proposed protein-binding segments of HOTAIR in D1 (Tsai et al., 2010). For instance, in H7 (nucleotides 187–216), 45% of individual base-pairs are conserved or covariant. Helix 7 is also conserved in mouse HOTAIR (Figure 4B). Interestingly, covariant helices are not only limited to predicted protein-binding regions, but they are present in all four domains of HOTAIR. For example, in H27 (nucleotides 949–1009, D2) and H31 (nucleotides 1251–1275, D3), 52 % and 57 % of base-pairs are covarying, respectively. Similarly, in H10 (nucleotides 327–394, D1) there are seven consistent half-flips, seven conserved, and three covariant base pairs out of 21 total base pairs. Helix 10 is also conserved in mouse HOTAIR (Figure 4C).
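The three categories of base-pair support used in this analysis can be made concrete with a short sketch. It assumes the usual definitions: a conserved pair is identical in both sequences, a consistent half-flip changes one nucleotide while preserving canonical or G-U pairing (for example G-C to G-U), and a covariant pair changes both nucleotides while preserving pairing (for example G-C to A-U). The function names and the pairwise comparison are illustrative; the analysis reported here was carried out on a multiple alignment of 33 sequences with dedicated software.

```python
# Watson-Crick and G-U wobble pairs, written 5' nucleotide first.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
LABELS = ("conserved", "half-flip", "covariant")


def _as_pair(pair):
    """Normalize a two-letter pair such as 'GC' (DNA 'T' is read as 'U')."""
    pair = tuple(str(pair).upper().replace("T", "U"))
    if len(pair) != 2:
        raise ValueError("a base pair needs exactly two nucleotides")
    return pair


def classify_pair(ref, other):
    """Classify how a reference base pair is supported in another sequence."""
    ref, other = _as_pair(ref), _as_pair(other)
    if ref not in CANONICAL:
        raise ValueError("reference positions do not form a base pair")
    if other not in CANONICAL:
        return "inconsistent"  # pairing is lost in the other sequence
    changed = sum(r != o for r, o in zip(ref, other))
    return LABELS[changed]


def helix_support(ref_pairs, other_pairs):
    """Count each category over the aligned base pairs of one helix."""
    counts = {"conserved": 0, "half-flip": 0, "covariant": 0, "inconsistent": 0}
    for ref, other in zip(ref_pairs, other_pairs):
        counts[classify_pair(ref, other)] += 1
    return counts


if __name__ == "__main__":
    print(helix_support(["GC", "AU", "GC", "CG"], ["GC", "GU", "AU", "CA"]))
```

With the counts given for H10 (seven consistent half-flips, seven conserved, and three covariant base pairs out of 21), 17 of 21 base pairs, or about 81%, fall into one of the three supporting categories.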
Figure 4. Sequence covariation in HOTAIR.
(A) Secondary structure map of HOTAIR color-coded by evolutionary covariation of each base-pair in 33 mammalian sequences. Covariant base-pairs are highlighted in green, consistent half-flip pairs are highlighted in blue, and conserved base-pairs are highlighted in red. (B) One of the most highly conserved helices in the predicted PRC2-binding region (D1) of HOTAIR. In the secondary structure map of H7 (nucleotides 187–216), base pairs covarying or conserved in human and mouse (numbered according to GenBank ID gi: 383286748) are highlighted. The alignment of the sequences of human and mouse HOTAIR is presented, color-coded by residue type. (C) Helix 10 (327–394) is not part of the predicted PRC2-binding region but it is also highly conserved between human and mouse, suggesting that helices that are not part of the binding region may also play a role in HOTAIR function (also see Figure S4).
To evaluate PRC2 affinity for D1 we performed gel shift assays, using established protocols. Before testing D1, we first compared the binding of refolded and natively purified HOTAIR fragment 1–300 (F300) to PRC2. We observe that PRC2 binds to natively purified F300 with a Kd similar to that reported in the literature (Cifuentes-Rojas et al., 2014), indicating that the natively purified transcript is functional for binding (Figure S4E). We then tested PRC2 binding to D1, and we observe that PRC2 binds cooperatively to D1 with 2-fold greater affinity than observed for F300 (Figure S4E), possibly because the upstream secondary structural elements are all intact within D1 (two of the helices are broken in F300). Taken together, these results underscore the utility of chemical probing and secondary structure determination in tandem with the characterization of protein-RNA interactions.
DISCUSSION
Genome-wide mapping indicates that lncRNAs contain highly structured regions (Ding et al., 2014; Wan et al., 2012), and demonstrates the folded nature of RNA molecules in their natural milieu. However, many RNAs (such as HOTAIR) are not sufficiently abundant for structure probing in vivo. Even for abundant RNAs, the availability of secondary structure maps for free RNA molecules facilitates the interpretation of maps obtained under more complex conditions. Perhaps just as importantly, numerous in vitro experiments are being routinely performed on HOTAIR (Cifuentes-Rojas et al., 2014; Davidovich et al., 2013; Wu et al., 2013), and these are being conducted without a structural map or information on RNA stabilization. A structural map would help guide these experiments and accelerate research that is focused on the molecular functions of HOTAIR. Quantitative experimentation on RNA folding and structure still depends on analysis of RNA structures in vitro, particularly if it can be functionally validated through phylogenetics, as in this case. Therefore, despite the power of genome-wide analysis, biophysical studies of lncRNAs in vitro are essential for determination of molecular mechanism. The large size (thousands of nucleotides) of lncRNAs poses significant new challenges for in vitro studies. For example, it is difficult and often impossible to use gel purification. Particular attention is also required for assessing the stability, conformational homogeneity, and sensitivity of individual lncRNAs to the experimental conditions used for in vitro studies.
In this work we obtained a homogeneous, monodisperse form of human HOTAIR using non-denaturing purification. We have adapted techniques from the protein world and purified HOTAIR without any denaturing steps using FPLC, as if purifying a large protein. Further, we employed SEC and SV-AUC to assess the folding requirements and conformational homogeneity of HOTAIR under various conditions. In summary, our observations suggest that it is important to retain “transcriptional memory” in lncRNAs by employing non-denaturing purification, and to subsequently explore the conditions needed for higher-order structural stabilization.
Functional relevance of the HOTAIR secondary structure map
Previous deletion experiments on HOTAIR narrowed its protein interaction sites down to two modular regions: (1) a 300-nucleotide-long region at the 5′-end of HOTAIR (nucleotides 1–300, HOTAIR300), which binds PRC2, and (2) a 646-nucleotide-long region at the 3′-end of HOTAIR, which binds LSD1/CoREST/REST (Tsai et al., 2010). Although it is known that HOTAIR affinity for proteins is high (Cifuentes-Rojas et al., 2014; Davidovich et al., 2013), the binding motifs for PRC2 or LSD1 have yet to be identified and the molecular mechanism of interaction between HOTAIR and its protein partners is unknown. Our structural data now provide a framework to gain additional insights into the motifs involved in these interactions.
First, we observe a good correlation between the predicted protein-binding regions and the modularity of HOTAIR secondary structure. The PRC2-binding fragment F300 is contained within the first structural domain of HOTAIR, at the 5’-end of the molecule (D1), and it shows highly correlated chemical reactivity independent of whether it is expressed in isolation, in the context of D1, or in the context of full-length HOTAIR (Figure 3). This suggests that the secondary structural elements of the PRC2-binding fragment are particularly stable and that they are confined to HOTAIR D1. In addition, the HOTAIR structural map matches very well with predicted protein binding regions at the 3’-end of the molecule. Deletion analysis had defined that the LSD1-binding fragment is confined between nucleotides 1,500 and 2,148 (Tsai et al., 2010), which is in excellent agreement with the boundaries that we observe for D4 (nucleotides 1,514–2,148). It is remarkable that the boundaries of protein-binding motifs are maintained even in the absence of protein partners, and it suggests that functional domains of HOTAIR can fold autonomously, potentially serving a role in subsequent recognition by proteins.
Second, we observe a significant degree of covariation in predicted secondary structural elements that are located not only in protein-binding regions but also outside known protein-binding domains. For instance, H7 (nucleotides 187–216) in D1 is conserved or covariant in many mammalian sequences, including the sequence of mouse HOTAIR, which was also shown to bind PRC2 (Li et al., 2013). Similarly, helices in D2 and D3, such as H26, H27 and H29 are also highly covarying, and these helices coincide with segments that bind HuR and Ataxin-1 (Yoon et al., 2013). Strikingly, the level of covariation suggested by our structure map extends to mouse HOTAIR, which was thought to have diverged from and lost functional similarity to human HOTAIR (Schorderet and Duboule, 2011). Specifically, between human and mouse, H7 possesses 25 conserved residues out of 29 total residues (Figure 4B). Similarly, in H10 there are 57 conserved residues (out of 68 total residues) between human and mouse (Figure 4C). Observed difference in PRC2 binding affinities for human and mouse HOTAIR (Cifuentes-Rojas et al., 2014) may be accounted for by differences in size of loop and bulge regions. Helices absent in mouse HOTAIR (or inserted in the human HOTAIR) could also affect the binding specificity. But there is also the possibility that conserved H7 and H10 are involved in binding factors other than PRC2. Overall, the structural data indicate that evolution has preserved a number of common human and mouse HOTAIR elements, which supports the conclusions of recent functional studies showing that mouse HOTAIR also binds PRC2 and produces a phenotype of homeotic transformation and skeletal malformation when deleted (Li et al., 2013).
Finally, the HOTAIR structural map reveals that, unlike RepA/Xist or roX RNAs, which were proposed to bind proteins through relatively simple tandem stem-loops (Ilik et al., 2013; Zhao et al., 2008), HOTAIR forms an intricate secondary structure within protein binding segments. Certainly, it would be interesting to probe the structure of HOTAIR also in the presence of PRC2 and LSD1/CoREST/REST and to determine at the molecular level the identities of the interacting residues. Regardless of how it is involved with partners, our work shows that preservation of intact HOTAIR structural elements should be a major consideration when designing functional studies in the future.
In conclusion, we report the experimental secondary structure of HOTAIR, determined using a combination of chemical probes. The resulting structural map shows that HOTAIR is composed of four independent architectural modules, two of which correspond precisely to predicted protein-binding domains. The HOTAIR map also reveals secondary structural elements that have been maintained through evolution. This study provides a valuable starting point for navigating the organization of the highly complex HOTAIR molecule and for interpreting its physiological functions. Our experimental approach is likely to be applicable for studying the many other lncRNAs that are expressed in eukaryotic cells.
EXPERIMENTAL PROCEDURES
Non-denaturing purification
Plasmids were linearized with the appropriate restriction enzyme (NEB, high fidelity). After in vitro transcription, the mixture was supplemented with DNase (20 U/ml; Ambion) and incubated for 30 min at 37 °C, and subsequently with proteinase K (0.3 mg/ml; Ambion) and incubated for 45 min at 37 °C. The digested products were filtered by Amicon Ultra-0.5 (Millipore) centrifugal filters (100 kDa molecular weight cut-off), followed by size exclusion chromatography (SEC) to remove large aggregates and prematurely terminated transcripts (see supplemental experimental procedures for details).
Sedimentation velocity analytical ultracentrifugation (SV-AUC)
Sedimentation velocity analytical ultracentrifugation (SV-AUC) experiments were performed using a Beckman XL-I centrifuge with an An-60 Ti rotor (Beckman Coulter). All experiments were performed at 20 °C at 25,000 rpm. Data were analyzed using Sedfit (Schuck, 2000).
Chemical probing by SHAPE
For SHAPE, freshly purified HOTAIR (20 pmol) was supplemented with probing buffer (PB: 50 mM K-HEPES pH 7.4, 200 mM KCl, 25 mM MgCl2, 0.1 mM NaEDTA in 1 ml). RNA was incubated at 37 °C for 45 min and divided into two tubes, each containing 490 µl. The reaction was started by addition of 544 nmol 1M7 (1-methyl-7-nitroisatoic anhydride) (+), or of an equal amount of pure DMSO as a control (−). Samples were incubated for 5 min at 37 °C, precipitated with an equal amount of isopropanol and washed with 70% ethanol. Reverse transcription was performed using SuperScript III reverse transcriptase (Invitrogen). After reverse transcription, (+) and (−) samples were quenched, combined and co-precipitated with 2.5 volumes of 95 % ethanol. Pellets were washed twice with 70 % ethanol, air dried, and dissolved in 50 µl deionized formamide (see supplementary methods for details).
Chemical probing by Tb-cleavage
Terbium-induced cleavage was conducted as for SHAPE, but at room temperature for 1 h. A solution of 2.5 mM terbium chloride in 5 mM potassium cacodylate pH 5.5 was used as the 10× reaction buffer stock. For control samples, a corresponding amount of cacodylate buffer was added.
Chemical probing by DMS
Methylation with dimethyl sulfate (DMS, Sigma-Aldrich) was conducted as for SHAPE, but 25 mM potassium cacodylate pH 7.0 was used instead of K-HEPES in the probing buffer, and modification reactions were performed at room temperature for 10 min.
Data processing, normalization and error assessment
All capillary data sets were processed using ShapeFinder (Vasa et al., 2008). Probing data were normalized as described before (McGinnis et al., 2009; Watts et al., 2009) with slight modifications (see supplementary methods for details).
Structure determination
To generate the HOTAIR secondary structure maps using the software RNAStructure, SHAPE reactivity was used to provide pseudo-energy constraints (Deigan et al., 2009; Low and Weeks, 2010). Resulting structures were manually evaluated for match with DMS and terbium probing data.
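The pseudo-energy term used in SHAPE-directed folding can be sketched as follows. In the formulation of Deigan et al. (2009), each nucleotide in a stacked base pair receives a free-energy adjustment of m·ln(reactivity + 1) + b, with m = 2.6 kcal/mol and b = −0.8 kcal/mol as the published defaults. Whether these exact parameter values were used here is an assumption, as is the handling of missing and negative reactivities in this sketch; the function name is hypothetical.

```python
import math


def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free-energy change (kcal/mol) for one nucleotide.

    Implements dG_SHAPE = slope * ln(reactivity + 1) + intercept, with the
    default slope and intercept of Deigan et al. (2009). Positions without
    data (None) contribute nothing, and negative reactivities are clamped
    to zero; both choices are assumptions of this sketch.
    """
    if reactivity is None:
        return 0.0
    r = max(reactivity, 0.0)
    return slope * math.log(r + 1.0) + intercept


if __name__ == "__main__":
    # Low reactivity gives a stabilizing (negative) term and high reactivity
    # a penalty (positive term) for placing the nucleotide in a base pair.
    for r in (0.0, 0.2, 0.5, 1.0, 2.0):
        print(f"reactivity {r:.1f} -> {shape_pseudo_energy(r):+.2f} kcal/mol")
```

Unreactive nucleotides therefore receive a small bonus for pairing and highly reactive nucleotides a penalty, which is how the probing data bias the thermodynamic prediction without imposing hard constraints.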
Covariance Analysis
Covariance analysis was performed with Infernal 1.1 (Nawrocki and Eddy, 2013). Sequences and multiple sequence alignments were downloaded from the Ensembl database (Flicek et al., 2014). Covariance in the resulting alignment was calculated using R2R (Weinberg and Breaker, 2011).
Supplementary Material
Highlights.
Natively purified HOTAIR adopts a single, well-defined conformation
HOTAIR has an elaborate and well-defined secondary structure
HOTAIR forms independent structural domains
HOTAIR shows a significant degree of phylogenetic covariation
ACKNOWLEDGEMENTS
The PRC2 complex was a generous gift from Dr. Catherine Cifuentes-Rojas and Dr. Jeannie T. Lee (Massachusetts General Hospital). We acknowledge Dr. Zasha Weinberg for helpful discussion on covariance analysis, and Dr. Erik Jagdmann for the synthesis of 1M7. We thank all other members of the Pyle lab, and in particular Nathan Pirakitikulr, Olga Fedorova, and Thayne Dickey for constructive discussion, and Leen Van Besien for technical assistance. This project was supported by the National Institutes of Health (RO1GM50313). AMP is an Investigator and IC a Postdoctoral Fellow of the Howard Hughes Medical Institute.
SUPPLEMENTAL INFORMATION
Supplemental information includes four figures, three tables, and supplemental experimental procedures.
AUTHOR CONTRIBUTIONS
ML, SS, IC, MM, FL and AMP conceived the project; ML, SS, IC, MM and FL conducted experiments; SS, MM, IC, and FL analyzed the data. All authors wrote and reviewed the paper.
REFERENCES
- Batey RT. Advances in methods for native expression and purification of RNA for structural studies. Curr. Opin. Struct. Biol. 2014;26C:1–8. doi: 10.1016/j.sbi.2014.01.014.
- Brown PH, Schuck P. Macromolecular size-and-shape distributions by sedimentation velocity analytical ultracentrifugation. Biophys. J. 2006;90:4651–4661. doi: 10.1529/biophysj.106.081372.
- Cifuentes-Rojas C, Hernandez AJ, Sarma K, Lee JT. Regulatory Interactions between RNA and Polycomb Repressive Complex 2. Mol. Cell. 2014;55:171–185. doi: 10.1016/j.molcel.2014.05.009.
- Davidovich C, Zheng L, Goodrich KJ, Cech TR. Promiscuous RNA binding by Polycomb repressive complex 2. Nat. Struct. Mol. Biol. 2013;20:1250–1257. doi: 10.1038/nsmb.2679.
- Deigan KE, Li TW, Mathews DH, Weeks KM. Accurate SHAPE-directed RNA structure determination. Proc Natl Acad Sci U S A. 2009;106:97–102. doi: 10.1073/pnas.0806929106.
- Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 2012;22:1775–1789. doi: 10.1101/gr.132159.111.
- Ding Y, Tang Y, Kwok CK, Zhang Y, Bevilacqua PC, Assmann SM. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature. 2014;505:696–700. doi: 10.1038/nature12756.
- Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, et al. Ensembl 2014. Nucleic Acids Res. 2014;42:D749–D755. doi: 10.1093/nar/gkt1196.
- Gupta RA, Shah N, Wang KC, Kim J, Horlings HM, Wong DJ, Tsai MC, Hung T, Argani P, Rinn JL, et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010;464:1071–1076. doi: 10.1038/nature08975.
- Ilik IA, Quinn JJ, Georgiev P, Tavares-Cadete F, Maticzka D, Toscano S, Wan Y, Spitale RC, Luscombe N, Backofen R, et al. Tandem stem-loops in roX RNAs act together to mediate X chromosome dosage compensation in Drosophila. Mol. Cell. 2013;51:156–173. doi: 10.1016/j.molcel.2013.07.001.
- Kaushik K, Leonard VE, Kv S, Lalwani MK, Jalali S, Patowary A, Joshi A, Scaria V, Sivasubbu S. Dynamic expression of long non-coding RNAs (lncRNAs) in adult zebrafish. PLoS One. 2013;8:e83616. doi: 10.1371/journal.pone.0083616.
- Kim K, Jutooru I, Chadalapaka G, Johnson G, Frank J, Burghardt R, Kim S, Safe S. HOTAIR is a negative prognostic factor and exhibits pro-oncogenic activity in pancreatic cancer. Oncogene. 2013;32:1616–1625. doi: 10.1038/onc.2012.193.
- Li L, Liu B, Wapinski OL, Tsai MC, Qu K, Zhang J, Carlson JC, Lin M, Fang F, Gupta RA, et al. Targeted Disruption of Hotair Leads to Homeotic Transformation and Gene Derepression. Cell Rep. 2013;5:3–12. doi: 10.1016/j.celrep.2013.09.003.
- Low JT, Weeks KM. SHAPE-directed RNA secondary structure prediction. Methods. 2010;52:150–158. doi: 10.1016/j.ymeth.2010.06.007.
- McGinnis JL, Duncan CD, Weeks KM. High-throughput SHAPE and hydroxyl radical analysis of RNA structure and ribonucleoprotein assembly. Methods Enzymol. 2009;468:67–89. doi: 10.1016/S0076-6879(09)68004-6.
- Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29:2933–2935. doi: 10.1093/bioinformatics/btt509.
- Necsulea A, Soumillon M, Warnefors M, Liechti A, Daish T, Zeller U, Baker JC, Grutzner F, Kaessmann H. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature. 2014;505:635–640. doi: 10.1038/nature12943.
- Novikova IV, Dharap A, Hennelly SP, Sanbonmatsu KY. 3S: shotgun secondary structure determination of long non-coding RNAs. Methods. 2013;63:170–177. doi: 10.1016/j.ymeth.2013.07.030.
- Novikova IV, Hennelly SP, Sanbonmatsu KY. Structural architecture of the human long non-coding RNA, steroid receptor RNA activator. Nucleic Acids Res. 2012;40:5034–5051. doi: 10.1093/nar/gks071.
- Pennisi E. Cell biology. Lengthy RNAs earn respect as cellular players. Science. 2014;344:1072. doi: 10.1126/science.344.6188.1072.
- Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, Goodnough LH, Helms JA, Farnham PJ, Segal E, et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129:1311–1323. doi: 10.1016/j.cell.2007.05.022.
- Sauvageau M, Goff LA, Lodato S, Bonev B, Groff AF, Gerhardinger C, Sanchez-Gomez DB, Hacisuleyman E, Li E, Spence M, et al. Multiple knockout mouse models reveal lincRNAs are required for life and brain development. Elife. 2013;2:e01749. doi: 10.7554/eLife.01749.
- Schorderet P, Duboule D. Structural and functional differences in the long non-coding RNA hotair in mouse and human. PLoS Genet. 2011;7:e1002071. doi: 10.1371/journal.pgen.1002071.
- Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and lamm equation modeling. Biophys. J. 2000;78:1606–1619. doi: 10.1016/S0006-3495(00)76713-0.
- Sosnick TR, Pan T. RNA folding: models and perspectives. Curr. Opin. Struct. Biol. 2003;13:309–316. doi: 10.1016/s0959-440x(03)00066-6.
- Sparmann A, van Lohuizen M. Polycomb silencers control cell fate, development and cancer. Nat. Rev. Cancer. 2006;6:846–856. doi: 10.1038/nrc1991.
- Su LJ, Brenowitz M, Pyle AM. An alternative route for the folding of large RNAs: apparent two-state folding by a group II intron ribozyme. J. Mol. Biol. 2003;334:639–652. doi: 10.1016/j.jmb.2003.09.071.
- Toor N, Keating KS, Taylor SD, Pyle AM. Crystal structure of a self-spliced group II intron. Science. 2008;320:77–82. doi: 10.1126/science.1153803.
- Tsai MC, Manor O, Wan Y, Mosammaparast N, Wang JK, Lan F, Shi Y, Segal E, Chang HY. Long noncoding RNA as modular scaffold of histone modification complexes. Science. 2010;329:689–693. doi: 10.1126/science.1192002.
- Vasa SM, Guex N, Wilkinson KA, Weeks KM, Giddings MC. ShapeFinder: a software system for high-throughput quantitative analysis of nucleic acid reactivity information resolved by capillary electrophoresis. RNA. 2008;14:1979–1990. doi: 10.1261/rna.1166808.
- Volders PJ, Helsens K, Wang X, Menten B, Martens L, Gevaert K, Vandesompele J, Mestdagh P. LNCipedia: a database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Res. 2013;41:D246–D251. doi: 10.1093/nar/gks915.
- Wan Y, Qu K, Ouyang Z, Kertesz M, Li J, Tibshirani R, Makino DL, Nutter RC, Segal E, Chang HY. Genome-wide measurement of RNA folding energies. Mol. Cell. 2012;48:169–181. doi: 10.1016/j.molcel.2012.08.008.
- Wapinski O, Chang HY. Long noncoding RNAs and human disease. Trends Cell Biol. 2011;21:354–361. doi: 10.1016/j.tcb.2011.04.001.
- Watts JM, Dang KK, Gorelick RJ, Leonard CW, Bess JW, Jr, Swanstrom R, Burch CL, Weeks KM. Architecture and secondary structure of an entire HIV-1 RNA genome. Nature. 2009;460:711–716. doi: 10.1038/nature08237.
- Weinberg Z, Breaker RR. R2R--software to speed the depiction of aesthetic consensus RNA secondary structures. BMC Bioinformatics. 2011;12:3. doi: 10.1186/1471-2105-12-3.
- Wu L, Murat P, Matak-Vinkovic D, Murrell A, Balasubramanian S. Binding interactions between long noncoding RNA HOTAIR and PRC2 proteins. Biochemistry. 2013;52:9519–9527. doi: 10.1021/bi401085h.
- Yoon JH, Abdelmohsen K, Kim J, Yang X, Martindale JL, Tominaga-Yamanaka K, White EJ, Orjalo AV, Rinn JL, Kreft SG, et al. Scaffold function of long non-coding RNA HOTAIR in protein ubiquitination. Nat. Commun. 2013;4:2939. doi: 10.1038/ncomms3939.
- Zhao J, Sun BK, Erwin JA, Song JJ, Lee JT. Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science. 2008;322:750–756. doi: 10.1126/science.1163045.